Initiate New Deployment

  • From the main menu, select the Deployments section.
  • Click the Create button to start a new deployment, then fill in the following fields:
    • Enter Deployment Name: Provide a unique name for your deployment.
    • Select Processing Type: Choose the processing type that matches your workload requirements: Sync, Async, or Real-time Async.
    • Select Cluster & Node Group: Choose the cluster and node group you created earlier, on which the model will be deployed.
    • Select Model: Choose the model you want to deploy from the list.
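
The fields above can be summarized as a small pre-flight check. This is a minimal sketch in Python and not a platform API: the DeploymentRequest class, its field names, and the validate helper are hypothetical, and only the three processing types come from the form itself.

```python
from dataclasses import dataclass

# Processing types offered by the Select Processing Type field.
PROCESSING_TYPES = ("Sync", "Async", "Real-time Async")


@dataclass(frozen=True)
class DeploymentRequest:
    """Hypothetical record of the form fields; not a platform API."""
    name: str
    processing_type: str
    cluster: str
    node_group: str
    model: str

    def validate(self, existing_names=()):
        """Return a list of problems; an empty list means the form is complete."""
        problems = []
        if not self.name.strip():
            problems.append("deployment name is empty")
        elif self.name in existing_names:
            # The deployment name must be unique.
            problems.append(f"deployment name {self.name!r} is already in use")
        if self.processing_type not in PROCESSING_TYPES:
            problems.append(f"unknown processing type {self.processing_type!r}")
        for label, value in (("cluster", self.cluster),
                             ("node group", self.node_group),
                             ("model", self.model)):
            if not value.strip():
                problems.append(f"{label} is not selected")
        return problems
```

Calling validate() on a fully populated request returns an empty list; passing the names of existing deployments as existing_names also flags a duplicate name.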
Resource Details

  • Set the accelerator limits, which cap the accelerator resources available to your deployment.

Adding Scaling Metrics

  • Specify the scaling metrics that will be used to auto-scale your deployment.
  • Set the threshold values for each metric to trigger scaling actions.
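
To illustrate how a threshold can drive a scaling action, the sketch below applies a proportional rule of the kind used by the Kubernetes Horizontal Pod Autoscaler: the replica count scales with the ratio of the observed metric to its threshold. Whether this platform uses the same rule is an assumption; the function name, parameters, and replica bounds are hypothetical.

```python
import math


def desired_replicas(current_replicas, metric_value, threshold,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling rule (assumed, not confirmed for this platform).

    The replica count is multiplied by metric_value / threshold, rounded up,
    and clamped to the [min_replicas, max_replicas] range.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    raw = math.ceil(current_replicas * metric_value / threshold)
    return max(min_replicas, min(max_replicas, raw))
```

With a threshold of 60, two replicas observing a metric value of 90 scale out to three, while four replicas observing 30 scale in to two.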

Volume Mounting (Optional)

Volume mounting gives your deployment direct access to data held in your storage.

To configure it, fill in the Read Volume Mount and Write Volume Mount fields. For each mount:

  • Provide the Source Address – the location of the volume.
  • Specify the Source Path – the path within the source where data is stored.
  • Define the Destination Path – where the volume will be mounted in the container.
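
The three values for a mount can be pictured as one record per volume. This is a minimal sketch, not the platform's data model: the VolumeMount class, the read_only flag, and the rule that the destination must be an absolute POSIX path are all assumptions made for illustration.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class VolumeMount:
    """Hypothetical record of one volume mount; fields mirror the form."""
    source_address: str      # location of the volume
    source_path: str         # path within the source where data is stored
    destination_path: str    # where the volume is mounted in the container
    read_only: bool = True   # True for a read mount, False for a write mount

    def problems(self):
        """Return a list of problems; an empty list means the mount looks valid."""
        issues = []
        if not self.source_address.strip():
            issues.append("source address is empty")
        if not self.source_path.strip():
            issues.append("source path is empty")
        # Assumption: container mount points are absolute POSIX paths.
        if not PurePosixPath(self.destination_path).is_absolute():
            issues.append("destination path must be absolute")
        return issues
```

For example, VolumeMount("example-volume", "datasets/input", "/mnt/input") passes the check, where all three values are placeholders rather than real locations.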

Deploy

  • Click the Add Deployment button to start the deployment.
  • Check the right side of the screen for the creation status of your deployment.
  • Monitor the status until it shows deployed. Your model is then ready for use.
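
If the status is checked from a script rather than from the screen, the monitoring step amounts to a polling loop. The sketch below assumes a caller-supplied get_status function that returns the current status string; that function, the timeout, and the polling interval are hypothetical, and only the final deployed status comes from this guide.

```python
import time


def wait_until_deployed(get_status, timeout_s=600.0, interval_s=5.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it reports "deployed" or the timeout expires.

    get_status is a hypothetical callable returning the status string.
    Returns True once the status is "deployed", False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status.strip().lower() == "deployed":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting; in normal use the defaults apply.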