Manage deployments using the client.deployments attribute or convenience methods.
create_deployment
Creates a deployment for a model repo.
Use environment variables for the model repo UUID and organization ID (e.g. ORG_ID); do not hardcode secrets.
DeploymentCreate
| Parameter | Type | Description | Required |
|---|---|---|---|
| model_repo | str | Model repository UUID | Yes |
| org | str | Organization UUID (org_id) | Yes |
| gpu_id | str | GPU type identifier. Examples: nvidia-h100, nvidia-a10, nvidia-l4 | Yes |
| name | str | Deployment name (3-60 chars) | Yes |
| min_pod_replicas | int | Minimum pod replicas (≥ 1) | Yes |
| max_pod_replicas | int | Maximum pod replicas (≥ 1) | Yes |
| autoscale_config | AutoscaleConfig | Autoscaling configuration | Yes |
| env_variables | dict \| None | Environment variables | No |
| deployment_custom_configuration | dict \| None | Custom deployment config | No |
| healthcheck | dict \| None | Health check configuration | No |
| ports | dict \| None | Port mappings | No |
| metrics_path | list \| None | Metrics paths | No |
| persistent_volume_claims | dict \| list \| None | PVC configurations | No |
| fast_scaleup | bool \| None | Enable fast scale up | No |
| deployment_tag | str \| None | Deployment tag label | No |
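As a minimal sketch of the guidance above, the following helper reads the model repo UUID and organization ID from an environment mapping and checks the documented limits (name length, replica counts) before any request is made. The function names `require_env` and `build_deployment_params` and the variable name `MODEL_REPO_ID` are hypothetical and not part of the SDK; how the resulting values are handed to create_deployment is not covered here.

```python
from typing import Any, Mapping


def require_env(env: Mapping[str, str], key: str) -> str:
    """Return a non-empty value from the environment mapping or raise."""
    value = env.get(key, "").strip()
    if not value:
        raise ValueError(f"environment variable {key} is not set")
    return value


def build_deployment_params(
    env: Mapping[str, str],
    name: str,
    gpu_id: str,
    min_pod_replicas: int,
    max_pod_replicas: int,
    autoscale_config: Any,
) -> dict:
    """Collect the required fields from the table and check the documented limits."""
    if not 3 <= len(name) <= 60:
        raise ValueError("name must be 3-60 characters")
    if min_pod_replicas < 1 or max_pod_replicas < 1:
        raise ValueError("pod replica counts must be >= 1")
    return {
        # MODEL_REPO_ID is a hypothetical variable name chosen for this sketch.
        "model_repo": require_env(env, "MODEL_REPO_ID"),
        "org": require_env(env, "ORG_ID"),
        "gpu_id": gpu_id,
        "name": name,
        "min_pod_replicas": min_pod_replicas,
        "max_pod_replicas": max_pod_replicas,
        # Passed through unchanged; its exact shape is defined by AutoscaleConfig.
        "autoscale_config": autoscale_config,
    }
```

In real use, `os.environ` would be passed as `env`, so the identifiers never appear in source code.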
AutoscaleConfig
| Metric Option | Description |
|---|---|
| concurrency | Number of concurrent requests |
| cpu | CPU utilization percentage |
| gpu | GPU utilization percentage |
| gram | GPU memory utilization |
| latency | Request latency (supports percentiles 50, 75, 90, 95) |
| ram | RAM utilization |
| throughput | Requests per second |
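The metric options above, together with the rule that percentile applies only to latency, can be checked client-side before a configuration is sent. The sketch below is one way to do that; `check_autoscale_metric` is a hypothetical helper rather than an SDK function, and it validates only the metric name and percentile, not any other AutoscaleConfig fields.

```python
from typing import Optional

# Metric names and latency percentiles as listed in the table above.
METRICS = {"concurrency", "cpu", "gpu", "gram", "latency", "ram", "throughput"}
LATENCY_PERCENTILES = {50, 75, 90, 95}


def check_autoscale_metric(metric: str, percentile: Optional[int] = None) -> None:
    """Raise ValueError if the metric/percentile pair contradicts the documented rules."""
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    if percentile is None:
        return
    if metric != "latency":
        raise ValueError("percentile is only supported when metric is 'latency'")
    if percentile not in LATENCY_PERCENTILES:
        raise ValueError(f"unsupported latency percentile: {percentile}")
```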
The percentile field is only supported when metric is set to latency.
list_deployments
Lists deployments with optional filtering.
Deployment Status Options
| Value | Description |
|---|---|
| DEPLOYED | Deployment is running |
| PENDING | Deployment is being created |
| FAILED | Deployment failed |
| STOPPED | Deployment is stopped |
| DELETED | Deployment has been deleted |
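The status values above map naturally onto an enumeration, which catches typos in status strings early. The sketch below assumes each deployment record is a mapping with a `status` key; `DeploymentStatus` and `count_by_status` are illustrative names defined locally, not SDK types, and the shape of what list_deployments returns is an assumption.

```python
from collections import Counter
from enum import Enum
from typing import Iterable, Mapping


class DeploymentStatus(str, Enum):
    """Local enumeration of the documented status values (not an SDK type)."""

    DEPLOYED = "DEPLOYED"
    PENDING = "PENDING"
    FAILED = "FAILED"
    STOPPED = "STOPPED"
    DELETED = "DELETED"


def count_by_status(deployments: Iterable[Mapping[str, str]]) -> Counter:
    """Tally records by status; a value outside the table raises ValueError."""
    return Counter(DeploymentStatus(record["status"]) for record in deployments)
```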
list_model_deployments
Lists all model deployments for an organization.
get_model_deployment
Gets deployment details by ID. Set DEPLOYMENT_ID in env or use an id from list_deployments.
The response includes fields such as uuid, name, status, model_repo, org, autoscale_config, healthcheck, ports, min_pod_replicas, max_pod_replicas, etc.
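The two ways of obtaining an ID mentioned above (DEPLOYMENT_ID from the environment, or an id taken from a listing) can be combined in one lookup. This is a sketch under assumptions: `resolve_deployment_id` is a hypothetical helper, and listed deployments are assumed to be mappings carrying the `uuid` and `name` fields.

```python
from typing import Mapping, Optional, Sequence


def resolve_deployment_id(
    env: Mapping[str, str],
    listed: Sequence[Mapping[str, str]] = (),
    name: Optional[str] = None,
) -> str:
    """Prefer DEPLOYMENT_ID from env; otherwise pick a listed deployment's uuid."""
    from_env = env.get("DEPLOYMENT_ID", "").strip()
    if from_env:
        return from_env
    for record in listed:
        # With no name given, the first listed record is used.
        if name is None or record.get("name") == name:
            return record["uuid"]
    raise LookupError("no DEPLOYMENT_ID in env and no matching listed deployment")
```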
get_deployment
Gets deployment details by ID.
update_deployment
Updates deployment configuration.
stop_deployment
Stops a running deployment.
start_deployment
Starts a stopped deployment.
restart_deployment
Restarts a deployment.
fetch_deployment_health
Gets deployment health status.
update_deployment_autoscaling
Updates deployment autoscaling configuration.
delete_deployment
Deletes a deployment.
Error Handling
The SDK raises SimplismartError for all API errors.
SimplismartError Attributes
| Attribute | Type | Description |
|---|---|---|
| status_code | int | HTTP status code |
| payload | dict | Full error response payload |
| message | str | Error message from backend |
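A caller would typically wrap SDK calls in a try/except and read the three attributes above. The sketch below shows that pattern with a local stand-in class so it runs without the SDK installed. The stand-in's constructor signature is an assumption, and the SDK's import path is not shown in this document; in real code the SDK's own SimplismartError would be imported instead. `describe_error` and `call_safely` are hypothetical helpers.

```python
class SimplismartError(Exception):
    """Local stand-in with the three documented attributes; not the SDK's class."""

    def __init__(self, message: str, status_code: int, payload: dict):
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.payload = payload


def describe_error(err: SimplismartError) -> str:
    """Format the documented attributes into a single log-friendly line."""
    return f"API error {err.status_code}: {err.message}"


def call_safely(action):
    """Run a zero-argument callable and return (result, error_text)."""
    try:
        return action(), None
    except SimplismartError as err:
        return None, describe_error(err)
```

Catching only SimplismartError keeps unrelated bugs (for example a TypeError in calling code) visible instead of silently swallowing them.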