Track GPU compute consumption and cost across plan types using the client.get_usage_stats() method.

Cost & Usage

get_usage_stats

Fetches time-series usage and cost data for a given plan type and time range.
from datetime import datetime, timedelta, timezone
from simplismart import Simplismart, UsageStatsParams
import json

client = Simplismart()

now = datetime.now(tz=timezone.utc)

stats = client.get_usage_stats(
    UsageStatsParams(
        plan_type="private",
        start_time=(now - timedelta(days=7)).isoformat(),
        end_time=now.isoformat(),
        window_size="DAY",
    )
)

print(json.dumps(stats, indent=2, default=str))
Expected Output:
[
  {
    "feature_id": "feat_01JW63P4YOUR-FEATURE-ID-HERE",
    "source": "pg-llama3p1-8b_369ff438-YOUR-DEPLOYMENT-ID-HERE",
    "event_name": "2-X-YOUR-EVENT-NAME-HERE",
    "total_usage": "120",
    "total_cost": "8.00000000000004",
    "currency": "usd",
    "unit_price": {
      "amount": "0.066666666666667",
      "currency": "usd"
    },
    "event_count": 60,
    "points": [
      {
        "timestamp": "2026-05-05T16:00:00Z",
        "usage": "8",
        "cost": "0.533333333333336",
        "computed_commitment_utilized_amount": "0",
        "computed_overage_amount": "0",
        "computed_true_up_amount": "0",
        "event_count": 4
      },
      {
        "timestamp": "2026-05-05T17:00:00Z",
        "usage": "112",
        "cost": "7.466666666666704",
        "computed_commitment_utilized_amount": "0",
        "computed_overage_amount": "0",
        "computed_true_up_amount": "0",
        "event_count": 56
      }
    ]
  }
]
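The numeric fields in the response (usage, cost, total_cost) arrive as strings, so summing them as floats can introduce rounding noise. The following is a minimal sketch, not part of the SDK, that recomputes per-source totals from the points array using decimal.Decimal. The helper name summarize_usage is hypothetical, and it assumes the response is a list of dict-like objects shaped like the sample above.

```python
from collections import defaultdict
from decimal import Decimal


def summarize_usage(stats):
    """Recompute per-source totals from the time-series points.

    Assumes `stats` is a list of dict-like series shaped like the
    Expected Output above, with numeric values encoded as strings.
    """
    totals = defaultdict(
        lambda: {"usage": Decimal("0"), "cost": Decimal("0"), "events": 0}
    )
    for series in stats:
        bucket = totals[series["source"]]
        for point in series.get("points", []):
            # Decimal parses the string exactly, with no float rounding.
            bucket["usage"] += Decimal(point["usage"])
            bucket["cost"] += Decimal(point["cost"])
            bucket["events"] += point["event_count"]
    return dict(totals)


# Trimmed copy of the sample response shown above.
sample = [
    {
        "source": "pg-llama3p1-8b_369ff438-YOUR-DEPLOYMENT-ID-HERE",
        "total_usage": "120",
        "total_cost": "8.00000000000004",
        "points": [
            {"timestamp": "2026-05-05T16:00:00Z", "usage": "8",
             "cost": "0.533333333333336", "event_count": 4},
            {"timestamp": "2026-05-05T17:00:00Z", "usage": "112",
             "cost": "7.466666666666704", "event_count": 56},
        ],
    }
]

totals = summarize_usage(sample)
for source, t in totals.items():
    print(source, t["usage"], t["cost"], t["events"])
```

Comparing the recomputed totals against total_usage and total_cost is a quick sanity check that no points were dropped while processing the response.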

UsageStatsParams

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| `plan_type` | `PlanType` | Compute plan to query. See Plan Types | Yes |
| `start_time` | `str` | Range start in ISO 8601 format (e.g. 2026-04-01T00:00:00+00:00) | Yes |
| `end_time` | `str` | Range end in ISO 8601 format | Yes |
| `window_size` | `WindowSize` | Aggregation bucket size. Options: MINUTE, 15MIN, 30MIN, HOUR, 3HOUR, 6HOUR, 12HOUR, DAY, WEEK | Yes |
| `workspace_id` | `str \| None` | Restrict to a specific workspace UUID. Uses the org default if omitted | No |
| `deployment_ids` | `list[str] \| None` | Filter by deployment UUID(s). Only valid for private and byoc | No |
| `deployment_slugs` | `list[str] \| None` | Filter by deployment slug(s). Only valid for private and byoc | No |
| `model_names` | `list[str] \| None` | Filter by model name(s) (e.g. DeepSeek-R1). Only valid for shared | No |
| `training_job_ids` | `list[str] \| None` | Filter by training job UUID(s). Only valid for training | No |
| `training_job_names` | `list[str] \| None` | Filter by training job name(s). Only valid for training | No |
| `model_repo_ids` | `list[str] \| None` | Filter by model repo UUID(s). Only valid for compilation | No |
| `model_repo_names` | `list[str] \| None` | Filter by model repo name(s). Only valid for compilation | No |
| `include_all_statuses` | `bool` | Include all deployment statuses (SUCCESS, STOPPED, DELETED, FAILED, etc.). Default: only SUCCESS and STOPPED. Not supported for training and reserved | No |

List parameters accept one or more values (for example ["a", "b"]). Passing a filter that is not valid for the selected plan_type raises a ValidationError.
  1. To find the workspace ID, go to Settings > Workspaces, select your workspace, and copy the ID shown there.
  2. To find a deployment ID or deployment slug, go to Deployments and select the deployment.
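Because each filter is only valid for certain plan types, it can help to check a combination before building the request. The sketch below is a hypothetical pre-flight helper, not part of the SDK: the ALLOWED_FILTERS mapping is transcribed from the "Only valid for" notes in the table above, and reserved is given an empty set only because the table lists no filter for it.

```python
# Mapping transcribed from the "Only valid for" notes in the parameter table.
# Assumption: "reserved" accepts none of these filters, since none is listed.
ALLOWED_FILTERS = {
    "shared": {"model_names"},
    "private": {"deployment_ids", "deployment_slugs"},
    "byoc": {"deployment_ids", "deployment_slugs"},
    "reserved": set(),
    "training": {"training_job_ids", "training_job_names"},
    "compilation": {"model_repo_ids", "model_repo_names"},
}
ALL_FILTERS = set().union(*ALLOWED_FILTERS.values())


def invalid_filters(plan_type, **filters):
    """Return the names of supplied filters that do not apply to plan_type.

    Hypothetical helper: keyword arguments that are not plan-specific
    filters (e.g. workspace_id) and filters set to None are ignored.
    """
    allowed = ALLOWED_FILTERS[plan_type]
    return sorted(
        name
        for name, value in filters.items()
        if name in ALL_FILTERS and value is not None and name not in allowed
    )


print(invalid_filters("shared", model_names=["DeepSeek-R1"]))
print(invalid_filters("shared", deployment_ids=["uuid-1"]))
print(invalid_filters("training", training_job_names=["finetune-llama-v1"],
                      model_repo_names=["my-llama-repo"]))
```

An empty list means every supplied filter matches the plan type. The SDK performs its own validation, so this check is only a convenience for surfacing mistakes earlier, for example in a CLI wrapper.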

Plan Types

| Value | Description |
| --- | --- |
| `shared` | Shared endpoint usage |
| `private` | Private/dedicated deployment usage |
| `byoc` | Bring Your Own Compute deployment usage |
| `reserved` | Reserved capacity usage |
| `training` | Training and fine-tuning job usage |
| `compilation` | Model compilation job usage |

Examples

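The examples below reuse the client and now objects defined in the first snippet.

Dedicated deployment usage (plan_type="private"): daily buckets for the last 14 days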
stats = client.get_usage_stats(
    UsageStatsParams(
        plan_type="private",
        start_time=(now - timedelta(days=14)).isoformat(),
        end_time=now.isoformat(),
        window_size="DAY",
    )
)
Shared endpoint (plan_type="shared"): hourly buckets for the last 48 hours, restricted to one workspace_id
stats = client.get_usage_stats(
    UsageStatsParams(
        plan_type="shared",
        start_time=(now - timedelta(days=2)).isoformat(),
        end_time=now.isoformat(),
        window_size="HOUR",
        workspace_id="your-workspace-uuid",
    )
)
Dedicated/BYOC: daily cost for selected deployments only. Pass deployment_ids, or deployment_slugs if you have slugs rather than UUIDs
stats = client.get_usage_stats(
    UsageStatsParams(
        plan_type="private",
        start_time=(now - timedelta(days=30)).isoformat(),
        end_time=now.isoformat(),
        window_size="DAY",
        deployment_ids=["uuid-1", "uuid-2"],
    )
)
Shared endpoint: daily cost only for listed model_names
stats = client.get_usage_stats(
    UsageStatsParams(
        plan_type="shared",
        start_time=(now - timedelta(days=30)).isoformat(),
        end_time=now.isoformat(),
        window_size="DAY",
        model_names=["DeepSeek-R1", "Llama-3"],
    )
)
Training (plan_type="training"): weekly buckets for the last 90 days, narrowed to specific training_job_names
stats = client.get_usage_stats(
    UsageStatsParams(
        plan_type="training",
        start_time=(now - timedelta(days=90)).isoformat(),
        end_time=now.isoformat(),
        window_size="WEEK",
        training_job_names=["finetune-llama-v1", "finetune-llama-v2"],
    )
)
Compilation (plan_type="compilation"): daily cost for specific model_repo_names
stats = client.get_usage_stats(
    UsageStatsParams(
        plan_type="compilation",
        start_time=(now - timedelta(days=30)).isoformat(),
        end_time=now.isoformat(),
        window_size="DAY",
        model_repo_names=["my-llama-repo", "my-mistral-repo"],
    )
)
BYOC: daily rollup for one deployment slug, including non-success deployment statuses (include_all_statuses=True)
stats = client.get_usage_stats(
    UsageStatsParams(
        plan_type="byoc",
        start_time=(now - timedelta(days=30)).isoformat(),
        end_time=now.isoformat(),
        window_size="DAY",
        deployment_slugs=["my-deploy"],
        include_all_statuses=True,
    )
)
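The examples above pair longer time ranges with coarser window_size values. As a rough guide to that trade-off, the sketch below estimates how many buckets a range produces for each documented window_size and picks the finest one that stays under a limit. The helper names are hypothetical, and the bucket durations are assumed from the option names. The API may align buckets to calendar boundaries, so the actual count can differ slightly.

```python
import math
from datetime import datetime, timedelta, timezone

# Durations assumed from the documented window_size option names,
# ordered from finest to coarsest.
WINDOW_DURATIONS = {
    "MINUTE": timedelta(minutes=1),
    "15MIN": timedelta(minutes=15),
    "30MIN": timedelta(minutes=30),
    "HOUR": timedelta(hours=1),
    "3HOUR": timedelta(hours=3),
    "6HOUR": timedelta(hours=6),
    "12HOUR": timedelta(hours=12),
    "DAY": timedelta(days=1),
    "WEEK": timedelta(weeks=1),
}


def estimated_buckets(start, end, window_size):
    """Approximate number of points per series for the given range."""
    return math.ceil((end - start) / WINDOW_DURATIONS[window_size])


def pick_window(start, end, max_buckets):
    """Return the finest window_size that yields at most max_buckets points."""
    for name in WINDOW_DURATIONS:
        if estimated_buckets(start, end, name) <= max_buckets:
            return name
    return "WEEK"  # coarsest documented option


end = datetime(2026, 5, 1, tzinfo=timezone.utc)
start = end - timedelta(days=30)
print(estimated_buckets(start, end, "DAY"))
print(pick_window(start, end, max_buckets=100))
```

The value returned by pick_window can be passed as window_size, with start.isoformat() and end.isoformat() as start_time and end_time.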

Error Handling

The SDK raises SimplismartError for API errors. Pydantic validates plan_type and window_size before the request is sent, so invalid values are caught locally.
from simplismart import Simplismart, UsageStatsParams
from simplismart.exceptions import SimplismartError

client = Simplismart()

try:
    stats = client.get_usage_stats(
        UsageStatsParams(
            plan_type="private",
            start_time="2026-04-01T00:00:00+00:00",
            end_time="2026-04-30T23:59:59+00:00",
            window_size="DAY",
        )
    )
except SimplismartError as e:
    print("Status:", e.status_code)
    print("Message:", e)
    print("Payload:", e.payload)

SimplismartError Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| `status_code` | `int` | HTTP status code |
| `payload` | `dict` | Full error response payload |
| `message` | `str` | Error message from the backend |
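The three attributes above can be folded into a single structured record for logging. The following is a sketch under stated assumptions: StandInError is a hypothetical class that only mimics the documented attributes so the example runs without the SDK, and summarize_error is not an SDK function. In real code, the object caught by except SimplismartError would be passed in instead.

```python
class StandInError(Exception):
    """Hypothetical stand-in exposing the documented attributes
    (status_code, payload, message) of SimplismartError."""

    def __init__(self, message, status_code, payload):
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.payload = payload


def summarize_error(err):
    """Build a log-friendly dict from an error carrying the documented attributes."""
    status = getattr(err, "status_code", None)
    if status is None:
        kind = "unknown"
    elif 400 <= status < 500:
        kind = "client"   # request problem, e.g. bad parameters or auth
    elif 500 <= status < 600:
        kind = "server"   # backend problem, possibly worth retrying
    else:
        kind = "other"
    return {
        "kind": kind,
        "status_code": status,
        "message": getattr(err, "message", None) or str(err),
        "payload": getattr(err, "payload", None) or {},
    }


try:
    raise StandInError("invalid time range", 400, {"detail": "end before start"})
except StandInError as e:
    record = summarize_error(e)
    print(record)
```

Separating client errors from server errors this way makes it easier to decide whether a failed call should be retried or reported back to the caller.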