Overview - Simplismart

The Simplismart Python SDK provides programmatic access to manage model repositories, deployments, and secrets.

Installation

Install the SDK using pip:

pip install simplismart-sdk

Authentication

The SDK uses Playground (PG) token authentication. You can obtain your token from the Simplismart Playground interface.

1. Open Simplismart Settings → API Key.
2. Copy the Playground Token and set it as SIMPLISMART_PG_TOKEN in your .env file or environment.

Environment Variables

Configure authentication using environment variables:

export SIMPLISMART_PG_TOKEN="your_pg_token_here"
export ORG_ID="your_org_uuid"
export SIMPLISMART_BASE_URL="https://api.app.simplismart.ai"  # Optional, default: https://api.app.simplismart.ai
export SIMPLISMART_TIMEOUT="300"  # Optional, default: 300 seconds
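
A missing token or organization ID otherwise only surfaces at the first API call. As a minimal sketch, the helper below reads the four variables above, applies the documented defaults, and fails early when a required value is absent. `read_settings` is a hypothetical name and not part of the SDK, and treating both `SIMPLISMART_PG_TOKEN` and `ORG_ID` as required is an assumption based on how this page uses them.

```python
import os
from typing import Mapping

DEFAULT_BASE_URL = "https://api.app.simplismart.ai"
DEFAULT_TIMEOUT = 300.0  # seconds, the documented default


def read_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper (not part of the SDK): collect settings from the environment."""
    # Assumption: both the token and the org ID are needed for the calls on this page.
    missing = [k for k in ("SIMPLISMART_PG_TOKEN", "ORG_ID") if not env.get(k)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "pg_token": env["SIMPLISMART_PG_TOKEN"],
        "org_id": env["ORG_ID"],
        # Empty or unset optional values fall back to the documented defaults.
        "base_url": env.get("SIMPLISMART_BASE_URL") or DEFAULT_BASE_URL,
        "timeout": float(env.get("SIMPLISMART_TIMEOUT") or DEFAULT_TIMEOUT),
    }
```

The `pg_token`, `base_url`, and `timeout` entries correspond to the `Simplismart(...)` constructor parameters, while `org_id` is passed to individual calls.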

Client Initialization

import os
from dotenv import load_dotenv
load_dotenv()

from simplismart import Simplismart

# Token and optional settings from env: SIMPLISMART_PG_TOKEN, SIMPLISMART_BASE_URL, SIMPLISMART_TIMEOUT
client = Simplismart(
    pg_token=os.getenv("SIMPLISMART_PG_TOKEN"),
    base_url=os.getenv("SIMPLISMART_BASE_URL", "https://api.app.simplismart.ai"),
    timeout=int(os.getenv("SIMPLISMART_TIMEOUT", "300")),
)

Parameter	Type	Description
`pg_token`	`str | None`	Playground token. Falls back to the `SIMPLISMART_PG_TOKEN` env var.
`base_url`	`str`	API base URL. Default: `https://api.app.simplismart.ai`
`timeout`	`float`	Request timeout in seconds. Default: `300`

Quickstart Example

This end-to-end example covers the full MLOps lifecycle: compiling a model from Hugging Face, polling until it’s ready, creating a deployment, and checking its health. Before running it, set the required environment variables in your .env file (see Environment Variables).

import os
from time import sleep

from dotenv import load_dotenv
from simplismart import Simplismart, ModelRepoCompileCreate, ModelRepoListParams, DeploymentCreate

load_dotenv()

client = Simplismart(pg_token=os.getenv("SIMPLISMART_PG_TOKEN"))
org_id = os.getenv("ORG_ID")
MODEL_REPO_NAME = "llama-3.2-1b-instruct-SDK"

# Compile the model from Hugging Face
payload = ModelRepoCompileCreate(
    name=MODEL_REPO_NAME,
    description="llama-model - A model deployed using Simplismart",
    source_type="huggingface",
    source_url="meta-llama/Llama-3.2-1B-Instruct",
    model_class="LlamaForCausalLM",
    accelerator_type="nvidia-h100",
    use_simplismart_infrastructure=True,
)

data = client.create_model_repo_private_compile(payload)
print(
    f"Model compilation initiated: {data['name']} | "
    f"uuid={data['uuid']} | status={data['status']} | source={data['source_url']}"
)

# Fetch the compiled model repo and wait until it's ready
list_params = ModelRepoListParams(org_id=org_id, offset=0, count=1, name=MODEL_REPO_NAME)
model_repo_id = None
prev_status = None

while True:
    repos = client.list_model_repos(list_params)
    result = repos["results"][0]
    model_repo_id = result["uuid"]
    status = result["status"]

    if status != prev_status:
        print(f"Model Repo {model_repo_id}: {status}")
        prev_status = status

    if status == "SUCCESS":
        break
    sleep(10)

# create deployment
deployment = client.create_deployment(
    DeploymentCreate(
        org=org_id,
        model_repo=model_repo_id,
        gpu_id="nvidia-h100",
        name="llama-3.2-1b-instruct-SDK",  # should be unique
        min_pod_replicas=1,
        max_pod_replicas=2,
        autoscale_config={"targets": [{"metric": "gpu", "target": 80}]},
    )
)

deployment_id = deployment["deployment_id"]
model_endpoint = deployment.get("model_endpoint", "")
print(
    f"Deployment created: id={deployment_id} \n Name={deployment.get('name')} \n "
    f"Model Endpoint=https://{model_endpoint}"
)

deployment_detail = client.get_model_deployment(
    deployment_id=os.getenv("DEPLOYMENT_ID", deployment_id)
)
print(f"Status: {deployment_detail.get('status', 'unknown')}")

health = client.fetch_deployment_health(deployment_id=deployment_id)
health_status = health.get("data", "unknown")
if health.get("messages"):
    msg = health["messages"][0].get("message", "")
    print(f"Health: {health_status} — {msg}")
else:
    print(f"Health: {health_status}")

if health_status == "Healthy":
    print("Deployment is ready.")
else:
    print("Deployment is still in progress.")

Expected Output

Model compilation initiated: llama-3.2-1b-instruct-SDK | uuid=e079d5f9-9fa9-4664-82ce-0218b7d1c220 | status=PENDING | source=meta-llama/Llama-3.2-1B-Instruct
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: PENDING
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: LAUNCHING_RAY_CLUSTER
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: OPTIMISING
Model Repo e079d5f9-9fa9-4664-82ce-0218b7d1c220: SUCCESS
Deployment created: id=0ee77f95-a49a-4965-9aa9-311fa9318c47 
 Name=llama-3.2-1b-instruct-SDK 
 Model Endpoint=https://YOUR-ENDPOINT.HERE
Status: DEPLOYING
Health: Progressing — The deployment is progressing. Please wait for the application to be healthy.
Deployment is still in progress.
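
The polling loop in the quickstart exits only on `SUCCESS`, so a compile that fails or stalls would keep it running indefinitely. The sketch below shows one way to bound the wait with a deadline. `wait_for_status` is a hypothetical helper, not part of the SDK, and the one-hour default is an arbitrary assumption. Terminal failure statuses are not listed on this page, so the timeout acts as the safeguard.

```python
import time
from typing import Callable, Iterable, Optional


def wait_for_status(
    fetch_status: Callable[[], Optional[str]],
    done: Iterable[str] = ("SUCCESS",),
    timeout_s: float = 3600.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Hypothetical helper: poll fetch_status() until it returns a value in `done`."""
    done = set(done)
    deadline = time.monotonic() + timeout_s
    last = None
    while True:
        status = fetch_status()
        if status != last:  # print only when the status changes
            print(f"status: {status}")
            last = status
        if status in done:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        sleep(interval_s)


# Possible use with the quickstart's objects (client and list_params as defined there):
#
# def repo_status():
#     results = client.list_model_repos(list_params)["results"]
#     return results[0]["status"] if results else None  # tolerate an empty listing
#
# wait_for_status(repo_status)
```

The same helper could wait for the deployment to report `Healthy` by passing a function that reads `client.fetch_deployment_health(...)["data"]` and setting `done=("Healthy",)`.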

Error Handling

The SDK raises SimplismartError for all API errors.

from simplismart import Simplismart, SimplismartError

client = Simplismart()
try:
    deployment = client.get_deployment(deployment_id="00000000-0000-0000-0000-000000000000")
except SimplismartError as e:
    print("Caught SimplismartError:")
    print("  status_code:", e.status_code)
    print("  message:", e)
    print("  payload:", e.payload)

Expected output (for an invalid or missing deployment):

Caught SimplismartError:
  status_code: 404
  message: No ModelDeployment matches the given query. (status=404)
  payload: {'detail': 'No ModelDeployment matches the given query.'}

SimplismartError Attributes

Attribute	Type	Description
`status_code`	`int`	HTTP status code
`payload`	`dict`	Full error response payload
`message`	`str`	Error message from backend
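
Because `status_code` is exposed on the exception, callers can decide which failures are worth retrying. The sketch below retries with exponential backoff when the caught error carries a status code from a configurable set. `call_with_retry` is a hypothetical helper, not part of the SDK, and treating 429, 502, 503, and 504 as transient is an assumption; this page documents only the 404 case.

```python
import time
from typing import Callable, Collection, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    error_type: Type[Exception],
    retry_on: Collection[int] = (429, 502, 503, 504),  # assumed transient codes
    attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: call fn(), retrying errors whose status_code is in retry_on."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except error_type as exc:
            status = getattr(exc, "status_code", None)
            # Re-raise non-retryable errors (such as 404) and the final failed attempt.
            if status not in retry_on or attempt == attempts:
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))  # waits 1s, 2s, 4s, ...
    raise ValueError("attempts must be at least 1")
```

With the SDK installed, a call might look like `call_with_retry(lambda: client.get_deployment(deployment_id=deployment_id), SimplismartError)`.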
