  1. Navigate to My Models and click the Add a Model button in the top-right corner.
  2. Enter Model Details
    • Provide a Model Name.
    • Select Model Source: Choose between Hugging Face or AWS/GCP bucket.
    • Enter the Model Path.
    • Select the Linked Cloud Credentials, if uploading from AWS or GCP.
      Upload your trained model to AWS S3 or GCP GCS, share the access credentials, and we’ll take care of compiling and preparing it for deployment. Models built to your specifications are seamlessly integrated into our platform.
  3. Configure Model Class
    • Under Model Class, choose Custom Pipeline.
  4. Select Infrastructure
    • Choose Simplismart Cloud for optimized deployment.
    • Select a GPU type based on your model’s size and compute requirements.
    • The Machine Type can be left as the default Simplismart configuration.
  5. Pipeline Configuration (Optional)
  • After selecting the infrastructure, you can configure the pipeline further based on your model’s runtime needs.
  • Use the Pipeline Config Editor to customize key deployment parameters.
  • Available fields include:
    • workers_per_device:
      Number of parallel worker processes assigned per device; higher values increase inference throughput.
      Type: Int
      Default value: 1
    • device:
      Specifies the hardware (CPU or GPU) used for model inference.
      Type: String
      Default value: cpu

      Possible Values:
      • "cpu" — For CPU hardware
      • "cuda" — For GPU hardware
    • endpoint:
      User-specified URL path at which the model receives and processes requests.
      Type: String
      Default value: /predict
    • type:
      Model type used by the compilation engine to generate the deployment configuration.
      Type: String
      Default value: none (must be specified)
      Possible values:
      • custom — for custom serving
      • whisper — for Whisper serving
      • llm — for LLM serving
      • sd — for Stable Diffusion serving
      Recommended setting: use "custom" when serving your own custom models.
    • Pipeline Configuration Example:
      {
          "type": "custom",
          "extra_params": {
              "workers_per_device": 2,
              "device": "cuda",
              "endpoint": "/predict"
          }
      }
      
    • These fields can be entered under the Extra Params section in key-value format.
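As a sketch of how these fields fit together, the snippet below builds a pipeline config dict and fills in the documented defaults for any extra params left unspecified. The `build_pipeline_config` helper is hypothetical and not part of the platform; it only illustrates the structure shown in the example above.

```python
# Hypothetical helper (not part of the platform) that assembles a
# pipeline config, applying the documented defaults for extra_params.
DEFAULTS = {"workers_per_device": 1, "device": "cpu", "endpoint": "/predict"}

def build_pipeline_config(model_type, **extra_params):
    """Return a pipeline config dict; "type" has no default and is required."""
    params = {**DEFAULTS, **extra_params}
    return {"type": model_type, "extra_params": params}

config = build_pipeline_config("custom", device="cuda", workers_per_device=2)
```

Any key you do not override keeps its default, so only `type` must always be provided.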
  6. Model Configuration
Based on the model, update the model configuration. Example Configuration:
  • model.py:
    class Model:
      def __init__(self):
        self.model = None
      def load(self):
        # Load weights / initialize the model here
        self.model = "Model Initialization"
      def preprocess(self, request):  # request can be any pydantic BaseModel, dict, or string
        # Do some preprocessing
        # ....
        return request
      def predict(self, request):
        # Do prediction
        # ....
        output = self.model.predict()
        return output
      def postprocess(self, request):
        # Do postprocessing here
        # ....
        return request
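To clarify how the methods of the Model class are used, the sketch below shows one plausible request flow: `load()` once at startup, then `preprocess` → `predict` → `postprocess` per request. The exact invocation order inside the platform is an assumption here, and `EchoModel` is a minimal stand-in, not a real model.

```python
class EchoModel:
    """Minimal stand-in implementing the same interface as the Model class above."""
    def load(self):
        self.model = lambda x: x  # placeholder "model"
    def preprocess(self, request):
        return request.strip()
    def predict(self, request):
        return self.model(request)
    def postprocess(self, output):
        return {"result": output}

def handle_request(model, request):
    # Assumed flow: preprocess -> predict -> postprocess for each request;
    # load() is called once before serving begins.
    data = model.preprocess(request)
    return model.postprocess(model.predict(data))

model = EchoModel()
model.load()
response = handle_request(model, "  hello ")  # {"result": "hello"}
```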
    
  • config.yaml:
    python_version: "3.10"
    environment_variables: {}
    requirements:
      - accelerate==0.20.3
      - bitsandbytes==0.39.1
      - peft==0.3.0
      - protobuf==4.23.3
      - sentencepiece==0.1.99
      - torch==2.0.1
      - transformers==4.30.2
    system_packages:
      - wget
      - curl
    custom_setup_script: "script.sh"
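The contents of the referenced `script.sh` are up to you; a minimal illustrative example is shown below. Everything in it (the directory path and marker file) is an assumption — use the script for whatever one-time setup your model needs before serving, such as fetching assets with the `wget` or `curl` packages installed above.

```shell
#!/usr/bin/env bash
# Illustrative custom_setup_script (script.sh) — contents are an assumption.
# Runs one-time setup before the model is served.
set -euo pipefail

# Example: prepare a working directory for model assets
mkdir -p /tmp/model_assets
echo "setup complete" > /tmp/model_assets/.setup_done
```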
    
  7. Add the Model
Once all configurations are complete, click Add Model to start the model compilation.