Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.simplismart.ai/llms.txt

Use this file to discover all available pages before exploring further.

Manage model repositories using the client.model_repos attribute or convenience methods.

list_model_repos

Lists model repositories with optional filtering. Run with SIMPLISMART_PG_TOKEN set as an environment variable (e.g. in .env).
# List model repositories with optional filters.
# Requires SIMPLISMART_PG_TOKEN in the environment (e.g. loaded from .env).
from dotenv import load_dotenv
load_dotenv()

from simplismart import ModelRepoListParams, Simplismart

client = Simplismart()
repos = client.list_model_repos(
    ModelRepoListParams(
        offset=0,                           # pagination offset (default: 0)
        count=5,                            # page size (default: 5, max: 20)
        status="SUCCESS",                   # one of SUCCESS, FAILED, DELETED, PROGRESSING
        name="vision",                      # optional filter (contains match)
        model_type="BYOM",                  # optional filter
        created_by="user@example.com",      # optional filter (creator email)
    )
)
print(repos)
Expected output
{
  "limit": 5,
  "offset": 0,
  "count": 10,
  "results": [
    {
      "uuid": "model-repo-uuid",
      "name": "whisper-nemo-diarization",
      "source_type": "docker_hub",
      "source_url": "simplismart/MODEL-NAME:latest",
      "is_byom": true,
      "accelerator": null,
      "runtime_gpus": 1,
      "byom": {
        "image": "simplismart/MODEL-NAME:latest",
        "registry": "simplismart/REGISTRY-NAME",
        "tag": "latest"
      },
      "secrets": {
        "source_secret": {
          "uuid": "secret-uuid",
          "name": "SECRET-NAME"
        }
      },
      "status": "SUCCESS",
      "model_type": "byom",
      "env": {},
      "created_at": "2026-03-02T11:52:16.925151Z",
      "updated_at": "2026-03-02T11:52:16.925162Z",
      "org_id": "org-uuid",
      "healthcheck": {
        "path": "/health",
        "port": 8000,
        "periodSeconds": 10,
        "timeoutSeconds": 5,
        "initialDelaySeconds": 30
      },
      "ports": { "http": { "port": 8000 } },
      "metrics_path": [],
      "deployment_custom_configuration": { "command": [] }
    }
  ]
}

ModelRepoListParams

ParameterTypeDescriptionOptions
offsetintPagination offset (default: 0)≥ 0
countintPage size (default: 5, max: 20)0-20
model_idstr | NoneFilter by specific model repo UUID-
namestr | NoneFilter by name (contains match)-
statusstr | NoneFilter by statusSUCCESS, FAILED, DELETED, PROGRESSING
model_typestr | NoneFilter by model type-
created_bystr | NoneFilter by creator email-

get_model_repo

Gets a specific model repository by ID. Set MODEL_REPO_ID in env, or use a UUID from list_model_repos.
import os
from dotenv import load_dotenv
load_dotenv()

from simplismart import Simplismart

client = Simplismart()

# Fetch one repository by UUID; falls back to a placeholder when
# MODEL_REPO_ID is not set in the environment.
repo_uuid = os.getenv("MODEL_REPO_ID", "model-repo-uuid")
repo = client.get_model_repo(model_id=repo_uuid)
print(repo)
Expected output
{
  "uuid": "f58265ce-cfc4-4d32-8b4f-848f06c5e181",
  "name": "Tp-8-mdhv-llama",
  "source_type": "docker_hub",
  "source_url": "madhavbohra09/llama-3.2-1b:latest",
  "is_byom": true,
  "accelerator": null,
  "runtime_gpus": 8,
  "byom": {
    "image": "madhavbohra09/llama-3.2-1b:latest",
    "registry": "madhavbohra09/llama-3.2-1b",
    "tag": "latest"
  },
  "secrets": {
    "source_secret": {
      "uuid": "0b8af8a8-d149-4262-b5b4-804ee6b98311",
      "name": "docker creds"
    }
  },
  "status": "SUCCESS",
  "model_type": "byom",
  "env": {},
  "created_at": "2026-03-03T13:14:28.714649Z",
  "updated_at": "2026-03-03T13:14:28.714660Z",
  "org_id": "0bf00b43-430a-4ca3-a8b3-b13cc8dc6d4f",
  "is_public": false,
  "healthcheck": {
    "path": "/health",
    "port": 8000,
    "periodSeconds": 10,
    "timeoutSeconds": 5,
    "initialDelaySeconds": 30
  },
  "ports": {
    "http": {
      "port": 8000
    }
  },
  "metrics_path": [],
  "deployment_custom_configuration": {
    "command": []
  }
}

create_model_repo

Bring your own container-based models from Docker Hub, Depot or NVIDIA NGC registry. Use environment variables for credentials (e.g. SOURCE_SECRET_ID); do not hardcode secrets.
import os
from dotenv import load_dotenv
load_dotenv()

from simplismart import ModelRepoCreate, Simplismart

client = Simplismart()

# Registry coordinates come from the environment, with placeholders
# as fallbacks; the secret UUID is never hardcoded.
registry = os.getenv("REGISTRY_PATH", "your-docker-org/your-model")
tag = os.getenv("DOCKER_TAG", "latest")

params = ModelRepoCreate(
    name="vision-container-demo",
    source_type="docker_hub",
    runtime_gpus=1,
    source_secret=os.getenv("SOURCE_SECRET_ID"),
    registry_path=registry,
    docker_tag=tag,
    env={"EXAMPLE_KEY": "value"},
    healthcheck={"path": "/", "port": 8000},
    ports={"http": {"port": 8000}},
    metrics_path=["/v1/chat/completions"],
    deployment_custom_configuration={"command": ["python", "-m", "server"]},
)
repo = client.create_model_repo(params)

ModelRepoCreate

ParameterTypeDescriptionRequired
namestrModel repo name (1-255 chars)Yes
source_typestrRegistry source type. Options: docker_hub, depot, nvidiadockersecretYes
runtime_gpusintNumber of GPUs (≥ 0; typically 0 or 1 for BYOM)Yes
source_secretstr | NoneSecret UUID for registry authenticationConditional*
registry_pathstr | NoneRegistry path/repo name (max 255)Conditional*
docker_tagstr | NoneImage tag (max 255)Conditional*
envdict | NoneEnvironment variablesNo
healthcheckdict | NoneHealth check configurationNo
portsdict | NonePort mappingsNo
metrics_pathlist | NoneList of metrics pathsNo
deployment_custom_configurationdict | list | NoneCustom deployment configurationNo
*Required when source_type is docker_hub, depot, or nvidiadockersecret.

Source Type Options

ValueDescription
docker_hubDocker Hub registry
depotDepot registry
nvidiadockersecretNVIDIA NGC registry
Expected output
{'uuid': 'e20dd190-b0ed-4db5-a471-a580696b99bd', 'name': 'TEST-vision-container-demo', 'source_type': 'docker_hub', 'source_url': 'madhavbohra09/llama-3.2-1b:latest', 'is_byom': True, 'accelerator': None, 'runtime_gpus': 1, 'byom': {'image': 'madhavbohra09/llama-3.2-1b:latest', 'registry': 'madhavbohra09/llama-3.2-1b', 'tag': 'latest'}, 'secrets': {'source_secret': {'uuid': '0b8af8a8-d149-4262-b5b4-804ee6b98311', 'name': 'docker creds'}}, 'status': 'SUCCESS', 'model_type': 'byom', 'env': {'EXAMPLE_KEY': 'value'}, 'created_at': '2026-03-03T15:34:50.901581Z', 'updated_at': '2026-03-03T15:34:50.901592Z', 'org_id': '0bf00b43-430a-4ca3-a8b3-b13cc8dc6d4f', 'is_public': False, 'healthcheck': {'path': '/', 'port': 8000, 'initialDelaySeconds': 30, 'periodSeconds': 10, 'timeoutSeconds': 5}, 'ports': {'http': {'port': 8000}}, 'metrics_path': [], 'deployment_custom_configuration': {'command': []}}

create_model_repo_private_compile

Creates a private compile model repository: the platform compiles the model from a source (e.g. Hugging Face) using your model config, optimisation config, and pipeline config.
import json

from simplismart import ModelRepoCompileCreate, Simplismart

client = Simplismart()

# Load configs (e.g. from examples/private-compile-sample/)

def load_json_file(path):
    """Read *path* and return its parsed JSON content.

    JSON files are UTF-8 by specification, so the encoding is pinned
    explicitly instead of relying on the platform's locale default.
    """
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Read the three config files first (same order as the request arguments).
model_cfg = load_json_file("model_config.json")
optimisation_cfg = load_json_file("optimisation_config.json")
pipeline_cfg = load_json_file("pipeline_config.json")

repo = client.create_model_repo_private_compile(
    ModelRepoCompileCreate(
        name="my-llama-repo",
        description="Llama model - private compile",
        source_type="huggingface",
        source_url="meta-llama/Llama-3.2-1B-Instruct",
        model_class="LlamaForCausalLM",
        accelerator_type="nvidia-h100",
        accelerator_count=0,
        cloud_account="your-cloud-account-uuid",
        model_config_data=model_cfg,
        optimisation_config=optimisation_cfg,
        pipeline_config=pipeline_cfg,
    )
)

ModelRepoCompileCreate

ParameterTypeDescriptionRequired
namestrModel repo nameYes
source_typestrSource type, e.g. huggingfaceYes
source_urlstrSource path/URL (e.g. HF repo id)Yes
modestrCompilation mode (default: public_hf). e.g. public_hf, private_hf, aws, gcp, public_url, simplismartYes
model_classstrModel class (e.g. LlamaForCausalLM)Yes
accelerator_typestrAccelerator type (e.g. nvidia-h100)Yes
org_idstr | NoneOrganization UUID (alias: org); optional if inferred from tokenNo
accelerator_countint | NoneAccelerator count (default: 0)No
cloud_accountstr | NoneCloud account UUIDNo
source_secretstr | NoneSecret UUID for source accessNo
lora_secretstr | NoneLoRA secret UUIDNo
model_config_datadict | NoneModel config JSON (alias: model_config); see belowNo
optimisation_configdict | NoneOptimisation config JSON (see below)No
pipeline_configdict | NonePipeline config JSON (see below)No
envdict | NoneEnvironment variablesNo
output_metadatadict | NoneOutput metadataNo
additional_detailsdict | NoneAdditional detailsNo
tagsdict | NoneTags objectNo
taskslist | NoneList of tasksNo
model_familystr | NoneModel familyNo
descriptionstr | NoneDescriptionNo
short_descriptionstr | NoneShort descriptionNo
dropdown_descriptionstr | NoneDropdown descriptionNo
processing_modestr | NoneOne of: SYNC, ASYNC, REALTIME_ASYNCNo
machine_typestr | NoneMachine typeNo
regionstr | NoneRegionNo
resource_groupstr | NoneResource groupNo
use_simplismart_infrastructurebool | NoneUse Simplismart infrastructureNo

Config files (private compile)

Example configs are in the SDK repo under examples/private-compile-sample/:
  • model_config.json — Model architecture and tokenizer options (e.g. architectures, hidden_size, max_position_embeddings, torch_dtype). Must match the model you are compiling.
Example (Llama-style):
{
  "architectures": ["LlamaForCausalLM"],
  "hidden_size": 2048,
  "intermediate_size": 8192,
  "max_position_embeddings": 131072,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 16,
  "num_key_value_heads": 8,
  "torch_dtype": "bfloat16",
  "vocab_size": 128256
}
  • optimisation_config.json — Backend, warmups, and optimisations (e.g. quantization, tensor_parallel_size, optimisations.dit_optimisation, backend). Example
{
  "model_type": "llm",
  "quantization": "float16",
  "tensor_parallel_size": 1,
  "warmups": { "enabled": true, "iterations": 5, "sample_input_data": [] },
  "backend": { "name": "auto", "version": "latest" },
  "optimisations": {
    "dit_optimisation": {
      "enabled": true,
      "attention_backend": { "type": "auto" },
      "compilation": { "enabled": false, "mode": "auto", "fullgraph": true, "dynamic": true }
    }
  }
}
  • pipeline_config.json — Pipeline type and options (e.g. type, loras, quantized_model_path, enable_model_caching, mode).
{
  "type": "llm",
  "loras": [],
  "lora_repo": { "type": "", "path": "", "ownership": "", "secret": { "type": "" } },
  "quantized_model_path": { "type": "", "path": "", "ownership": "", "secret": { "type": "" } },
  "extra_params": {},
  "enable_model_caching": true,
  "mode": "chat"
}
For a detailed example, check out the full Python example in the SDK repo: simplismart-python/examples/private-compile-sample/.

delete_model_repo

Deletes a model repository.
# Remove the repository identified by MODEL_REPO_ID (placeholder fallback).
result = client.delete_model_repo(
    model_id=os.getenv("MODEL_REPO_ID", "model-repo-uuid"),
)
# Returns: True on success