The Simplismart Python SDK provides programmatic access to manage model repositories, deployments, and secrets.
Installation
Install the SDK using pip:
pip install simplismart-sdk
Authentication
The SDK uses Playground (PG) token authentication. You can obtain your token from the Simplismart Playground interface.
- Open Simplismart Settings → API Key.
- Copy the Playground Token and set it as SIMPLISMART_PG_TOKEN in your .env file or environment.
Environment Variables
Configure authentication using environment variables:
export SIMPLISMART_PG_TOKEN="your_pg_token_here"
export ORG_ID="your_org_uuid"
export SIMPLISMART_BASE_URL="https://api.app.simplismart.ai" # Optional, default: https://api.app.simplismart.ai
export SIMPLISMART_TIMEOUT="300" # Optional, default: 300 seconds
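The sketch below shows one way to gather these settings in your own code before constructing the client. The `load_simplismart_config` helper is hypothetical (not part of the SDK); the defaults mirror the variables listed above.

```python
import os

def load_simplismart_config(env=os.environ):
    """Collect the documented settings, applying defaults for the optional ones."""
    token = env.get("SIMPLISMART_PG_TOKEN")
    if not token:
        raise RuntimeError("SIMPLISMART_PG_TOKEN is not set")
    return {
        "pg_token": token,
        "org_id": env.get("ORG_ID"),  # may be None if the org comes from the token
        "base_url": env.get("SIMPLISMART_BASE_URL", "https://api.app.simplismart.ai"),
        "timeout": float(env.get("SIMPLISMART_TIMEOUT", "300")),
    }
```

The resulting dict can then be unpacked into the client constructor described in the next section.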
Client Initialization
Simplismart
from simplismart import Simplismart
client = Simplismart(
pg_token="your_pg_token", # Optional if SIMPLISMART_PG_TOKEN is set
base_url="https://api.app.simplismart.ai", # Optional, default
timeout=300, # Optional, default: 300 seconds
)
| Parameter | Type | Description |
|---|---|---|
| pg_token | str \| None | Playground token. Falls back to SIMPLISMART_PG_TOKEN env var. |
| base_url | str | API base URL. Default: https://api.app.simplismart.ai |
| timeout | float | Request timeout in seconds. Default: 300 |
Quickstart Example
The following end-to-end example walks through the core workflow with the Simplismart SDK: listing model repositories, registering your own container-based model, deploying it on Simplismart infrastructure, and cleaning up afterwards. It demonstrates how the entire MLOps lifecycle can be managed programmatically.
from simplismart import (
DeploymentCreate,
ModelRepoCreate,
ModelRepoListParams,
Simplismart
)
# Initialize client
client = Simplismart(
pg_token="your_pg_token",
base_url="https://api.app.simplismart.ai"
)
# 1) List model repos
repos = client.list_model_repos(
ModelRepoListParams(
org_id="org-uuid",
offset=0,
count=5,
status="SUCCESS"
)
)
# 2) Create container model repo
repo = client.create_model_repo(
ModelRepoCreate(
name="vision-container-demo",
org_id="org-uuid",
source_type="docker_hub",
source_secret="secret-uuid",
registry_path="your-docker-org/your-model",
docker_tag="latest",
runtime_gpus=1
)
)
# 3) Create private deployment
deployment = client.create_private_deployment(
DeploymentCreate(
model_repo=repo["uuid"],
org="org-uuid",
gpu_id="nvidia-h100",
name="vision-private-deploy",
min_pod_replicas=1,
max_pod_replicas=2,
autoscale_config={
"targets": [{"metric": "gpu", "target": 80}]
}
)
)
# 4) List deployments
deployments = client.list_deployments(status="DEPLOYED", offset=0, count=20)
# 5) Cleanup
client.delete_deployment(deployment_id=deployment["uuid"], org_id="org-uuid")
client.delete_model_repo(model_id=repo["uuid"])
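The quickstart deletes the deployment immediately after creating it; in practice you would usually wait for it to reach DEPLOYED first. Below is a minimal polling sketch. The `wait_until_deployed` helper is hypothetical (not part of the SDK); it takes any zero-argument callable that returns one of the documented deployment status strings.

```python
import time

def wait_until_deployed(get_status, timeout_s=600, poll_s=10):
    """Poll a status callable until it reports DEPLOYED.

    get_status: zero-argument callable returning one of the documented
    statuses (DEPLOYED, PENDING, FAILED, STOPPED, DELETED).
    Raises RuntimeError on FAILED and TimeoutError on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status == "DEPLOYED":
            return status
        if status == "FAILED":
            raise RuntimeError("deployment failed")
        time.sleep(poll_s)
    raise TimeoutError("deployment did not become ready in time")
```

You could wire it up with something like `wait_until_deployed(lambda: client.get_deployment(deployment_id=deployment["uuid"])["status"])`, assuming the deployment payload exposes a `status` field; that field name is an assumption, not confirmed by this reference.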
For more examples, see the examples/quickstart.py file in the SDK repository.
Model Repositories
Manage model repositories using the client.model_repos attribute or convenience methods.
list_model_repos
Lists model repositories with optional filtering.
from simplismart import ModelRepoListParams, Simplismart
client = Simplismart()
repos = client.list_model_repos(
ModelRepoListParams(
org_id="org-uuid",
offset=0,
count=5,
status="SUCCESS",
name="vision",
model_type="BYOM",
created_by="user@example.com"
)
)
ModelRepoListParams
| Parameter | Type | Description | Options |
|---|---|---|---|
| org_id | str \| None | Organization UUID | - |
| offset | int | Pagination offset (default: 0) | ≥ 0 |
| count | int | Page size (default: 5, max: 20) | 0-20 |
| model_id | str \| None | Filter by specific model repo UUID | - |
| name | str \| None | Filter by name (contains match) | - |
| status | str \| None | Filter by status | SUCCESS, FAILED, DELETED, PROGRESSING |
| model_type | str \| None | Filter by model type | - |
| created_by | str \| None | Filter by creator email | - |
get_model_repo
Gets a specific model repository by ID.
repo = client.get_model_repo(
model_id="model-repo-uuid",
org_id="org-uuid" # Optional
)
create_model_repo
Registers a container-based model (bring your own model) from Docker Hub, Depot, or the NVIDIA NGC registry.
from simplismart import ModelRepoCreate, Simplismart
client = Simplismart()
repo = client.create_model_repo(
ModelRepoCreate(
name="vision-container-demo",
org_id="org-uuid",
source_type="docker_hub",
runtime_gpus=1,
source_secret="secret-uuid",
registry_path="your-docker-org/your-model",
docker_tag="latest",
env={"EXAMPLE_KEY": "value"},
healthcheck={"path": "/", "port": 8000},
ports={"http": {"port": 8000}},
metrics_path=["/v1/chat/completions"],
deployment_custom_configuration={"command": ["python", "-m", "server"]}
)
)
ModelRepoCreate
| Parameter | Type | Description | Required |
|---|---|---|---|
| name | str | Model repo name (1-255 chars) | Yes |
| org_id | str | Organization UUID | Yes |
| source_type | str | Registry source type. Options: docker_hub, depot, nvidiadockersecret | Yes |
| runtime_gpus | int | Number of GPUs (≥ 0; typically 0 or 1 for BYOM) | Yes |
| source_secret | str \| None | Secret UUID for registry authentication | Conditional* |
| registry_path | str \| None | Registry path/repo name (max 255) | Conditional* |
| docker_tag | str \| None | Image tag (max 255) | Conditional* |
| env | dict \| None | Environment variables | No |
| healthcheck | dict \| None | Health check configuration | No |
| ports | dict \| None | Port mappings | No |
| metrics_path | list \| None | List of metrics paths | No |
| deployment_custom_configuration | dict \| list \| None | Custom deployment configuration | No |
*Required when source_type is docker_hub, depot, or nvidiadockersecret.
Source Type Options
| Value | Description |
|---|---|
| docker_hub | Docker Hub registry |
| depot | Depot registry |
| nvidiadockersecret | NVIDIA NGC registry |
create_model_repo_private_compile
Creates a private compile model repository: the platform compiles the model from a source (e.g. Hugging Face) using your model config, optimisation config, and pipeline config.
import json
from simplismart import (
ModelRepoCompileAvatar,
ModelRepoCompileCreate,
Simplismart,
)
client = Simplismart()
# Load configs (e.g. from examples/private-compile-sample/)
with open("model_config.json") as f:
model_config = json.load(f)
with open("optimisation_config.json") as f:
optimisation_config = json.load(f)
with open("pipeline_config.json") as f:
pipeline_config = json.load(f)
repo = client.create_model_repo_private_compile(
ModelRepoCompileCreate(
name="my-llama-repo",
description="Llama model - private compile",
avatar=ModelRepoCompileAvatar(
image_url="https://ui-avatars.com/api/?name=llama"
),
source_type="huggingface",
source_url="meta-llama/Llama-3.2-1B-Instruct",
model_class="LlamaForCausalLM",
accelerator_type="nvidia-h100",
accelerator_count=0,
cloud_account="your-cloud-account-uuid",
model_config_data=model_config,
optimisation_config=optimisation_config,
pipeline_config=pipeline_config,
)
)
ModelRepoCompileCreate (key fields)
| Parameter | Type | Description | Required |
|---|---|---|---|
| name | str | Model repo name | Yes |
| avatar | ModelRepoCompileAvatar | Avatar image URL (and optional colors) | Yes |
| source_type | str | Source type, e.g. huggingface | Yes |
| source_url | str | Source path/URL (e.g. HF repo id) | Yes |
| model_class | str | Model class (e.g. LlamaForCausalLM) | Yes |
| accelerator_type | str | Accelerator type (e.g. nvidia-h100) | Yes |
| model_config_data | dict \| None | Model config JSON (see below) | No |
| optimisation_config | dict \| None | Optimisation config JSON (see below) | No |
| pipeline_config | dict \| None | Pipeline config JSON (see below) | No |
| org_id | str \| None | Organization UUID (optional if derived from token) | No |
| accelerator_count | int \| None | Accelerator count (default: 0) | No |
| cloud_account | str \| None | Cloud account UUID | No |
| source_secret | str \| None | Secret UUID for source access | No |
| description | str \| None | Description | No |
To understand what each of these parameters means, check out this guide on model optimisation.
Config files (private compile)
Example configs are in the SDK repo under examples/private-compile-sample/:
model_config.json — Model architecture and tokenizer options (e.g. architectures, hidden_size, max_position_embeddings, torch_dtype). Must match the model you are compiling.
Example (Llama-style):
{
"architectures": ["LlamaForCausalLM"],
"hidden_size": 2048,
"intermediate_size": 8192,
"max_position_embeddings": 131072,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 16,
"num_key_value_heads": 8,
"torch_dtype": "bfloat16",
"vocab_size": 128256
}
optimisation_config.json — Backend, warmups, and optimisations (e.g. quantization, tensor_parallel_size, optimisations.dit_optimisation, backend).
Example:
{
"model_type": "llm",
"quantization": "float16",
"tensor_parallel_size": 1,
"warmups": { "enabled": true, "iterations": 5, "sample_input_data": [] },
"backend": { "name": "auto", "version": "latest" },
"optimisations": {
"dit_optimisation": {
"enabled": true,
"attention_backend": { "type": "auto" },
"compilation": { "enabled": false, "mode": "auto", "fullgraph": true, "dynamic": true }
}
}
}
pipeline_config.json — Pipeline type and options (e.g. type, loras, quantized_model_path, enable_model_caching, mode).
{
"type": "llm",
"loras": [],
"lora_repo": { "type": "", "path": "", "ownership": "", "secret": { "type": "" } },
"quantized_model_path": { "type": "", "path": "", "ownership": "", "secret": { "type": "" } },
"extra_params": {},
"enable_model_caching": true,
"mode": "chat"
}
delete_model_repo
Deletes a model repository.
result = client.delete_model_repo(model_id="model-repo-uuid")
# Returns: True on success
Deployments
Manage deployments using the client.deployments attribute or convenience methods.
create_private_deployment
Creates a private deployment for a model repo.
from simplismart import DeploymentCreate, Simplismart
client = Simplismart()
deployment = client.create_private_deployment(
DeploymentCreate(
model_repo="model-repo-uuid",
org="org-uuid",
gpu_id="nvidia-h100",
name="vision-private-deploy",
min_pod_replicas=1,
max_pod_replicas=2,
autoscale_config={
"targets": [
{"metric": "gpu", "target": 80}
]
},
env_variables={"KEY": "value"},
healthcheck={"path": "/", "port": 8000},
ports={"http": {"port": 8000}},
metrics_path=["/v1/chat/completions"],
fast_scaleup=True,
deployment_tag="v1.0"
)
)
DeploymentCreate
| Parameter | Type | Description | Required |
|---|---|---|---|
| model_repo | str | Model repository UUID | Yes |
| org | str | Organization UUID | Yes |
| gpu_id | str | GPU type identifier. Examples: nvidia-h100, nvidia-a10, nvidia-l4 | Yes |
| name | str | Deployment name (3-60 chars) | Yes |
| min_pod_replicas | int | Minimum pod replicas (≥ 1) | Yes |
| max_pod_replicas | int | Maximum pod replicas (≥ 1) | Yes |
| autoscale_config | AutoscaleConfig | Autoscaling configuration | Yes |
| env_variables | dict \| None | Environment variables | No |
| deployment_custom_configuration | dict \| None | Custom deployment config | No |
| healthcheck | dict \| None | Health check configuration | No |
| ports | dict \| None | Port mappings | No |
| metrics_path | list \| None | Metrics paths | No |
| persistent_volume_claims | dict \| list \| None | PVC configurations | No |
| fast_scaleup | bool \| None | Enable fast scale up | No |
| deployment_tag | str \| None | Deployment tag label | No |
AutoscaleConfig
autoscale_config = {
"targets": [
{
"metric": "gpu", # Required
"target": 80, # Required (number)
"percentile": 95 # Optional, only for latency metric
}
]
}
| Metric Option | Description |
|---|---|
| concurrency | Number of concurrent requests |
| cpu | CPU utilization percentage |
| gpu | GPU utilization percentage |
| gram | GPU memory utilization |
| latency | Request latency (supports percentiles 50, 75, 90, 95) |
| ram | RAM utilization |
| throughput | Requests per second |
The percentile field is only supported when metric is set to latency.
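For example, a latency-based target at the 95th percentile might look like the following. Note this is an illustrative sketch: the unit of `target` for the latency metric is not stated in this reference, so milliseconds is an assumption.

```python
# Latency-based autoscale target: scale out when p95 request latency
# exceeds 500 (assumed to be milliseconds; unit not documented here).
autoscale_config = {
    "targets": [
        {"metric": "latency", "target": 500, "percentile": 95}
    ]
}
```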
create_deployment
deployment = client.create_deployment(DeploymentCreate(...))
list_deployments
Lists deployments with optional filtering.
deployments = client.list_deployments(
model_repo_id="model-repo-uuid", # Optional
status="DEPLOYED", # Optional
offset=0, # Optional, default: 0
count=20 # Optional, default: 20
)
Deployment Status Options
| Value | Description |
|---|---|
| DEPLOYED | Deployment is running |
| PENDING | Deployment is being created |
| FAILED | Deployment failed |
| STOPPED | Deployment is stopped |
| DELETED | Deployment has been deleted |
list_model_deployments
Lists all model deployments for an organization.
deployments = client.list_model_deployments(org_id="org-uuid")
get_model_deployment
Gets deployment details by ID.
deployment = client.get_model_deployment(deployment_id="deployment-uuid")
get_deployment
Gets deployment details.
deployment = client.get_deployment(deployment_id="deployment-uuid")
update_deployment
Updates deployment configuration.
updated = client.update_deployment(
deployment_id="deployment-uuid",
payload={
"min_pod_replicas": 1,
"max_pod_replicas": 2,
"autoscale_config": {
"targets": [{"metric": "gpu", "target": 80}]
}
}
)
stop_deployment
Stops a running deployment.
result = client.stop_deployment(
deployment_id="deployment-uuid",
)
start_deployment
Starts a stopped deployment.
result = client.start_deployment(
deployment_id="deployment-uuid",
)
restart_deployment
Restarts a deployment.
result = client.restart_deployment(
deployment_id="deployment-uuid",
org_id="org-uuid",
namespace="namespace"
)
fetch_deployment_health
Gets deployment health status.
health = client.fetch_deployment_health(deployment_id="deployment-uuid")
update_deployment_autoscaling
Updates deployment autoscaling configuration.
result = client.update_deployment_autoscaling(
deployment_id="deployment-uuid",
min_replicas=1,
max_replicas=3
)
delete_deployment
Deletes a deployment.
result = client.delete_deployment(
deployment_id="deployment-uuid",
org_id="org-uuid" # Optional; omit if org is derived from token
)
CLI equivalent:
simplismart deployments delete --deployment-id <DEPLOYMENT_UUID>
simplismart deployments delete --deployment-id <DEPLOYMENT_UUID> --org-id <ORG_ID>
delete_model_deployment
Deletes a deployment.
result = client.delete_model_deployment(deployment_id="deployment-uuid")
Secrets
Manage Docker registry credentials and other secrets.
create_secret
Creates a secret for an organization. The data payload depends on secret_type:
| secret_type | data payload |
|---|---|
| docker_hub | {"username": "...", "token": "..."} |
| depot | {"username": "...", "token": "..."} |
| nvidia_nim | {"server": "nvcr.io", "username": "$oauthtoken", "password": "..."} |
from simplismart import SecretCreate, Simplismart
client = Simplismart()
# Docker Hub
secret = client.create_secret(
SecretCreate(
org="org-uuid",
name="registry-secret",
secret_type="docker_hub",
data={
"username": "your-username",
"token": "your-token",
},
)
)
# Depot (same data shape as Docker Hub)
secret = client.create_secret(
SecretCreate(
org="org-uuid",
name="depot-registry-secret",
secret_type="depot",
data={
"username": "your-username",
"token": "your-token",
},
)
)
# NVIDIA NIM
secret = client.create_secret(
SecretCreate(
org="org-uuid",
name="nvidia-secret",
secret_type="nvidia_nim",
data={
"server": "nvcr.io",
"username": "$oauthtoken",
"password": "your-oauth-token",
},
)
)
SecretCreate
| Parameter | Type | Description | Required |
|---|---|---|---|
| org | str | Organization UUID | Yes |
| name | str | Secret name (1-255 chars) | Yes |
| secret_type | str | Secret type. Options: docker_hub, depot, nvidia_nim | Yes |
| data | dict | Secret data; shape depends on secret_type (see table above) | Yes |
list_secrets
Lists secrets for an organization.
secrets = client.list_secrets(org_id="org-uuid")
get_secret
Gets a specific secret by ID.
secret = client.get_secret(secret_id="secret-uuid")
Error Handling
The SDK raises SimplismartError for all API errors.
from simplismart import Simplismart, SimplismartError
client = Simplismart()
try:
deployment = client.get_deployment(deployment_id="invalid-uuid")
except SimplismartError as e:
print(f"Status: {e.status_code}")
print(f"Message: {e}")
print(f"Payload: {e.payload}")
SimplismartError Attributes
| Attribute | Type | Description |
|---|---|---|
| status_code | int | HTTP status code |
| payload | dict | Full error response payload |
| message | str | Error message from backend |
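Because `status_code` is exposed on the error, a common pattern is to retry only server-side (5xx) failures and fail fast on client errors. The sketch below illustrates this; `call_with_retries` is a hypothetical helper, not part of the SDK, and the error class is injected so the logic stays generic.

```python
import time

def call_with_retries(fn, error_cls, retries=3, backoff_s=1.0):
    """Retry fn() on 5xx errors (read from error.status_code).

    Client errors (4xx) and the final failed attempt are re-raised
    immediately; successive retries back off exponentially.
    """
    for attempt in range(retries):
        try:
            return fn()
        except error_cls as err:
            if err.status_code < 500 or attempt == retries - 1:
                raise
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
```

You might use it as `call_with_retries(lambda: client.get_deployment(deployment_id="deployment-uuid"), SimplismartError)`.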