The Simplismart Python SDK provides programmatic access to manage model repositories, deployments, and secrets.
Installation
Install the SDK using pip:
pip install simplismart-sdk
Authentication
The SDK uses Playground (PG) token authentication. You can obtain your token from the Simplismart Playground interface.
- Open Simplismart Settings → API Key.
- Copy the Playground Token and set it as SIMPLISMART_PG_TOKEN in your .env file or environment.
Environment Variables
Configure authentication using environment variables:
export SIMPLISMART_PG_TOKEN="your_pg_token_here"
export ORG_ID="your_org_uuid"
export SIMPLISMART_BASE_URL="https://api.app.simplismart.ai" # Optional, default: https://api.app.simplismart.ai
export SIMPLISMART_TIMEOUT="300" # Optional, default: 300 seconds
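A missing token or organization ID otherwise only shows up once a request fails, so it can help to validate the environment before creating the client. The following is a minimal sketch using only the standard library. `read_settings` is a hypothetical helper, not part of the SDK; it assumes the variable names and defaults listed above and treats ORG_ID as required because the quickstart uses it.

```python
import os

DEFAULT_BASE_URL = "https://api.app.simplismart.ai"
DEFAULT_TIMEOUT = 300.0


def read_settings(env=None):
    """Collect SDK settings from a mapping (defaults to os.environ).

    Hypothetical helper for illustration; not part of the SDK.
    """
    env = os.environ if env is None else env

    # Fail early, naming every required variable that is unset or empty.
    required = ("SIMPLISMART_PG_TOKEN", "ORG_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    return {
        "pg_token": env["SIMPLISMART_PG_TOKEN"],
        "org_id": env["ORG_ID"],
        # Optional settings fall back to the documented defaults.
        "base_url": env.get("SIMPLISMART_BASE_URL", DEFAULT_BASE_URL),
        "timeout": float(env.get("SIMPLISMART_TIMEOUT", DEFAULT_TIMEOUT)),
    }
```

The `pg_token`, `base_url`, and `timeout` values correspond to the client constructor parameters of the same names.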
Client Initialization
import os
from dotenv import load_dotenv
from simplismart import Simplismart

load_dotenv()

# Token and optional settings from env: SIMPLISMART_PG_TOKEN, SIMPLISMART_BASE_URL, SIMPLISMART_TIMEOUT
client = Simplismart(
    pg_token=os.getenv("SIMPLISMART_PG_TOKEN"),
    base_url=os.getenv("SIMPLISMART_BASE_URL", "https://api.app.simplismart.ai"),
    timeout=int(os.getenv("SIMPLISMART_TIMEOUT", "300")),
)
| Parameter | Type | Description |
|---|---|---|
| pg_token | str \| None | Playground token. Falls back to the SIMPLISMART_PG_TOKEN env var. |
| base_url | str | API base URL. Default: https://api.app.simplismart.ai |
| timeout | float | Request timeout in seconds. Default: 300 |
Quickstart Example
This end-to-end example covers the full MLOps lifecycle: compiling a model from Hugging Face, polling until it’s ready, creating a deployment, and checking its health.
First, set the required environment variables in your .env file (see Environment Variables).
import os
from time import sleep
from dotenv import load_dotenv
from simplismart import Simplismart, ModelRepoCompileCreate, ModelRepoListParams, DeploymentCreate

load_dotenv()

client = Simplismart(pg_token=os.getenv("SIMPLISMART_PG_TOKEN"))
org_id = os.getenv("ORG_ID")
MODEL_REPO_NAME = "llama-3.2-1b-instruct-SDK"
# model compile
payload = ModelRepoCompileCreate(
    name=MODEL_REPO_NAME,
    description="llama-model - A model deployed using Simplismart",
    source_type="huggingface",
    source_url="meta-llama/Llama-3.2-1B-Instruct",
    model_class="LlamaForCausalLM",
    accelerator_type="nvidia-h100",
    use_simplismart_infrastructure=True,
)
data = client.create_model_repo_private_compile(payload)
print(
    f"Model compilation initiated: {data['name']} | "
    f"uuid={data['uuid']} | status={data['status']} | source={data['source_url']}"
)
# Fetch the compiled model repo and wait until it's ready
list_params = ModelRepoListParams(org_id=org_id, offset=0, count=1, name=MODEL_REPO_NAME)
model_repo_id = None
prev_status = None
while True:
    repos = client.list_model_repos(list_params)
    result = repos["results"][0]
    model_repo_id = result["uuid"]
    status = result["status"]
    # Print only when the status changes
    if status != prev_status:
        print(f"Model Repo {model_repo_id}: {status}")
        prev_status = status
    if status == "SUCCESS":
        break
    sleep(10)
# create deployment
deployment = client.create_deployment(
    DeploymentCreate(
        org=org_id,
        model_repo=model_repo_id,
        gpu_id="nvidia-h100",
        name="llama-3.2-1b-instruct-SDK",  # should be unique
        min_pod_replicas=1,
        max_pod_replicas=2,
        autoscale_config={"targets": [{"metric": "gpu", "target": 80}]},
    )
)
deployment_id = deployment["deployment_id"]
model_endpoint = deployment.get("model_endpoint", "")
print(
    f"Deployment created: id={deployment_id} \n Name={deployment.get('name')} \n "
    f"Model Endpoint=https://{model_endpoint}"
)
# check deployment status
deployment_detail = client.get_model_deployment(
    deployment_id=os.getenv("DEPLOYMENT_ID", deployment_id)
)
print(f"Status: {deployment_detail.get('status', 'unknown')}")
health = client.fetch_deployment_health(deployment_id=deployment_id)
health_status = health.get("data", "unknown")
if health.get("messages"):
    msg = health["messages"][0].get("message", "")
    print(f"Health: {health_status} — {msg}")
else:
    print(f"Health: {health_status}")
if health_status == "Healthy":
    print("Deployment is ready.")
else:
    print("Deployment is still in progress.")
Expected Output
Model compilation initiated: llama-3.2-1b-instruct-SDK | uuid=e079d5f9-9fa9-4664-82ce-0218b7d1c220 | status=PENDING | source=meta-llama/Llama-3.2-1B-Instruct
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: PENDING
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: LAUNCHING_RAY_CLUSTER
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: OPTIMISING
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: SUCCESS
Deployment created: id=0ee77f95-a49a-4965-9aa9-311fa9318c47
Name=llama-3.2-1b-instruct-SDK
Model Endpoint=https://YOUR-ENDPOINT.HERE
Status: DEPLOYING
Health: Progressing — The deployment is progressing. Please wait for the application to be healthy.
Deployment is still in progress.
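The polling loop in the quickstart runs until the status reaches SUCCESS, so a compilation that never succeeds would keep it waiting indefinitely. One way to bound the wait is sketched below. `wait_for` is a hypothetical helper, not part of the SDK, and the 30-minute default timeout is an arbitrary assumption. The `sleep` and `clock` parameters exist only so the helper can be exercised without real delays.

```python
import time


def wait_for(fetch, done, timeout=1800.0, interval=10.0,
             sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until done(value) is true or `timeout` seconds elapse.

    Hypothetical helper for illustration; not part of the SDK.
    Returns the first value accepted by done(); raises TimeoutError otherwise.
    """
    deadline = clock() + timeout
    while True:
        value = fetch()
        if done(value):
            return value
        # Stop if the next poll would land past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(
                f"gave up after {timeout} seconds; last value: {value!r}"
            )
        sleep(interval)


# Possible use with the quickstart objects (client and list_params as above):
# wait_for(
#     lambda: client.list_model_repos(list_params)["results"][0]["status"],
#     lambda status: status == "SUCCESS",
# )
```

The same helper could poll `fetch_deployment_health` until the reported value is "Healthy". If the backend reports a terminal failure status, the `done` predicate should accept it as well so the caller can stop early; those status names are not listed here, so they are left out of the sketch.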
Error Handling
The SDK raises SimplismartError for all API errors.
from simplismart import Simplismart, SimplismartError

client = Simplismart()  # pg_token falls back to SIMPLISMART_PG_TOKEN
try:
    deployment = client.get_deployment(deployment_id="00000000-0000-0000-0000-000000000000")
except SimplismartError as e:
    print("Caught SimplismartError:")
    print("status_code:", e.status_code)
    print("message:", e)
    print("payload:", e.payload)
Expected output (for an invalid or missing deployment):
Caught SimplismartError:
status_code: 404
message: No ModelDeployment matches the given query. (status=404)
payload: {'detail': 'No ModelDeployment matches the given query.'}
SimplismartError Attributes
| Attribute | Type | Description |
|---|---|---|
| status_code | int | HTTP status code |
| payload | dict | Full error response payload |
| message | str | Error message from backend |
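These attributes make it possible to branch on the kind of failure rather than only printing it. The sketch below turns an error into a short log line. `summarize_error` is a hypothetical helper, not part of the SDK. It reads the attributes with `getattr`, so it accepts any exception. The `detail` key is taken from the sample payload above, and the grouping of status codes other than the 404 shown is an assumption based on common HTTP conventions.

```python
def summarize_error(exc):
    """Return a short, log-friendly description of an API error.

    Hypothetical helper for illustration; not part of the SDK.
    Works with any exception, using status_code and payload when present.
    """
    status = getattr(exc, "status_code", None)
    payload = getattr(exc, "payload", None)
    # The sample payload above carries the backend message under "detail".
    detail = payload.get("detail") if isinstance(payload, dict) else None

    if status == 404:
        kind = "not found"
    elif status in (401, 403):
        kind = "authentication or permission problem"
    elif status is not None and status >= 500:
        kind = "server error, may be worth retrying"
    else:
        kind = "request error"

    # Fall back to the exception text when no detail is available.
    return f"{kind} (status={status}): {detail or exc}"
```

Inside an `except SimplismartError as e:` block, `print(summarize_error(e))` would give one line per failure.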