The Simplismart Python SDK provides programmatic access to manage model repositories, deployments, and secrets.

Installation

Install the SDK using pip:
pip install simplismart-sdk

Authentication

The SDK uses Playground (PG) token authentication. You can obtain your token from the Simplismart Playground interface.
  1. Open Simplismart Settings > API Key.
  2. Copy the Playground Token and set it as SIMPLISMART_PG_TOKEN in your .env file or environment.

Environment Variables

Configure authentication using environment variables:
export SIMPLISMART_PG_TOKEN="your_pg_token_here"
export ORG_ID="your_org_uuid"
export SIMPLISMART_BASE_URL="https://api.app.simplismart.ai"  # Optional, default: https://api.app.simplismart.ai
export SIMPLISMART_TIMEOUT="300"  # Optional, default: 300 seconds
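
Before constructing the client, it can help to confirm that the variables not marked optional are actually set. The snippet below is a minimal sketch, assuming SIMPLISMART_PG_TOKEN and ORG_ID are the required ones; `missing_env_vars` is a hypothetical helper and not part of the SDK.

```python
import os

# Assumption: these are the variables listed above without an "Optional" note.
REQUIRED_VARS = ("SIMPLISMART_PG_TOKEN", "ORG_ID")


def missing_env_vars(env=os.environ, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required variables are set.")
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.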

Client Initialization

import os
from dotenv import load_dotenv
load_dotenv()

from simplismart import Simplismart

# Token and optional settings from env: SIMPLISMART_PG_TOKEN, SIMPLISMART_BASE_URL, SIMPLISMART_TIMEOUT
client = Simplismart(
    pg_token=os.getenv("SIMPLISMART_PG_TOKEN"),
    base_url=os.getenv("SIMPLISMART_BASE_URL", "https://api.app.simplismart.ai"),
    timeout=int(os.getenv("SIMPLISMART_TIMEOUT", "300")),
)

Parameter   Type         Description
pg_token    str | None   Playground token. Falls back to the SIMPLISMART_PG_TOKEN env var.
base_url    str          API base URL. Default: https://api.app.simplismart.ai
timeout     float        Request timeout in seconds. Default: 300

Quickstart Example

This end-to-end example covers the full MLOps lifecycle: compiling a model from Hugging Face, polling until it's ready, creating a deployment, and checking its health. Before running it, set SIMPLISMART_PG_TOKEN and ORG_ID in your .env file (see Environment Variables).
import os
from time import sleep

from dotenv import load_dotenv
from simplismart import Simplismart, ModelRepoCompileCreate, ModelRepoListParams, DeploymentCreate

load_dotenv()

client = Simplismart(pg_token=os.getenv("SIMPLISMART_PG_TOKEN"))
org_id = os.getenv("ORG_ID")
MODEL_REPO_NAME = "llama-3.2-1b-instruct-SDK"

# model compile
payload = ModelRepoCompileCreate(
    name=MODEL_REPO_NAME,
    description="llama-model - A model deployed using Simplismart",    
    source_type="huggingface",
    source_url="meta-llama/Llama-3.2-1B-Instruct",
    model_class="LlamaForCausalLM",
    accelerator_type="nvidia-h100",
    use_simplismart_infrastructure=True,
)

data = client.create_model_repo_private_compile(payload)
print(
    f"Model compilation initiated: {data['name']} | "
    f"uuid={data['uuid']} | status={data['status']} | source={data['source_url']}"
)

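# Fetch the compiled model repo and poll every 10 seconds until its status is SUCCESS.
# The loop has no timeout, so interrupt it manually if compilation never succeeds.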
list_params = ModelRepoListParams(org_id=org_id, offset=0, count=1, name=MODEL_REPO_NAME)
model_repo_id = None
prev_status = None

while True:
    repos = client.list_model_repos(list_params)
    result = repos["results"][0]
    model_repo_id = result["uuid"]
    status = result["status"]

    if status != prev_status:
        print(f"Model Repo {model_repo_id}: {status}")
        prev_status = status

    if status == "SUCCESS":
        break
    sleep(10)

# create deployment
deployment = client.create_deployment(
    DeploymentCreate(
        org=org_id,
        model_repo=model_repo_id,
        gpu_id="nvidia-h100",
        name="llama-3.2-1b-instruct-SDK",  # should be unique
        min_pod_replicas=1,
        max_pod_replicas=2,
        autoscale_config={"targets": [{"metric": "gpu", "target": 80}]},
    )
)

deployment_id = deployment["deployment_id"]
model_endpoint = deployment.get("model_endpoint", "")
print(
    f"Deployment created: id={deployment_id} \n Name={deployment.get('name')} \n "
    f"Model Endpoint=https://{model_endpoint}"
)

# Fetch deployment details; set DEPLOYMENT_ID in the environment to inspect a different deployment
deployment_detail = client.get_model_deployment(
    deployment_id=os.getenv("DEPLOYMENT_ID", deployment_id)
)
print(f"Status: {deployment_detail.get('status', 'unknown')}")

health = client.fetch_deployment_health(deployment_id=deployment_id)
health_status = health.get("data", "unknown")
if health.get("messages"):
    msg = health["messages"][0].get("message", "")
    print(f"Health: {health_status} — {msg}")
else:
    print(f"Health: {health_status}")

if health_status == "Healthy":
    print("Deployment is ready.")
else:
    print("Deployment is still in progress.")
Expected Output
Model compilation initiated: llama-3.2-1b-instruct-SDK | uuid=e079d5f9-9fa9-4664-82ce-0218b7d1c220 | status=PENDING | source=meta-llama/Llama-3.2-1B-Instruct
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: PENDING
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: LAUNCHING_RAY_CLUSTER
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: OPTIMISING
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: SUCCESS
Deployment created: id=0ee77f95-a49a-4965-9aa9-311fa9318c47 
 Name=llama-3.2-1b-instruct-SDK 
 Model Endpoint=https://YOUR-ENDPOINT.HERE
Status: DEPLOYING
Health: Progressing — The deployment is progressing. Please wait for the application to be healthy.
Deployment is still in progress.
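
The quickstart's polling loop exits only when the status becomes SUCCESS. A bounded variant can be sketched as follows; `wait_for_status` is a hypothetical helper, not an SDK function, and the status fetcher here is a stand-in for a call such as `client.list_model_repos(...)`. Because the source does not list a failure status, the sketch relies on a timeout alone.

```python
import time


def wait_for_status(fetch_status, target="SUCCESS", timeout=1800.0,
                    interval=10.0, sleep=time.sleep):
    """Call fetch_status() until it returns target or the timeout elapses."""
    deadline = time.monotonic() + timeout
    previous = None
    while True:
        status = fetch_status()
        if status != previous:
            # Log only on transitions, as the quickstart loop does.
            print(f"status: {status}")
            previous = status
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"gave up after {timeout}s; last status was {status!r}"
            )
        sleep(interval)


# Stand-in fetcher that replays statuses seen in the expected output above.
statuses = iter(["PENDING", "LAUNCHING_RAY_CLUSTER", "OPTIMISING", "SUCCESS"])
result = wait_for_status(lambda: next(statuses), interval=0, sleep=lambda s: None)
```

In real use, `fetch_status` would wrap the SDK call and return the `status` field of the first result; the default timeout of 1800 seconds is an arbitrary placeholder.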

Error Handling

The SDK raises SimplismartError for all API errors.
from simplismart import Simplismart, SimplismartError

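# With no arguments, the token falls back to the SIMPLISMART_PG_TOKEN env var
client = Simplismart()
try:
    deployment = client.get_deployment(deployment_id="00000000-0000-0000-0000-000000000000")
except SimplismartError as e:
    print("Caught SimplismartError:")
    print("  status_code:", e.status_code)
    print("  message:", e)
    print("  payload:", e.payload)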
Expected output (for invalid or missing deployment):
Caught SimplismartError:
  status_code: 404
  message: No ModelDeployment matches the given query. (status=404)
  payload: {'detail': 'No ModelDeployment matches the given query.'}

SimplismartError Attributes

Attribute     Type   Description
status_code   int    HTTP status code
payload       dict   Full error response payload
message       str    Error message from backend
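
One way to use `status_code` is to treat a 404 as "not found" while letting every other error propagate. The sketch below illustrates the pattern with `StubApiError`, a hypothetical stand-in that mirrors the attributes listed above so the example runs without the SDK installed; in real code you would presumably pass `SimplismartError` as `error_type`. `fetch_or_none` is likewise an illustrative helper, not an SDK function.

```python
class StubApiError(Exception):
    """Hypothetical stand-in exposing status_code and payload; not part of the SDK."""

    def __init__(self, message, status_code, payload):
        super().__init__(message)
        self.status_code = status_code
        self.payload = payload


def fetch_or_none(fetch, error_type=StubApiError):
    """Return fetch(), or None when it raises error_type with status 404."""
    try:
        return fetch()
    except error_type as exc:
        if exc.status_code == 404:
            return None
        raise  # any other status is unexpected, so let it propagate


def missing_deployment():
    # Simulates the 404 response shown in the expected output above.
    detail = "No ModelDeployment matches the given query."
    raise StubApiError(detail, 404, {"detail": detail})


found = fetch_or_none(lambda: {"status": "DEPLOYING"})
absent = fetch_or_none(missing_deployment)
print(found, absent)
```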