A quota is a limit on the amount of a specific resource your organisation can use at a given time. Quotas help ensure fair allocation of infrastructure and prevent usage beyond permitted limits. They help control:
  • Infrastructure usage
  • Concurrent workloads
  • Resource allocation across teams

GPU Quotas

GPU quotas determine the maximum GPU resources your organisation can use simultaneously.

Where Do GPU Quotas Apply?

GPU quotas apply to the following platform operations:
  • Training: GPU usage during model training jobs counts toward the quota.
  • Compilation: Some model compilation processes require GPU resources and will consume quota.
  • Private Deployment: For private deployments, GPU quotas apply to the maximum replicas allowed, not the desired replicas configured in the deployment. This ensures that scaling operations remain within the allowed GPU limits.
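The replica rule for private deployments can be illustrated with a minimal sketch. The `Deployment` class, its field names, and the `gpus_per_replica` value are hypothetical and are not part of any platform API; the only behaviour taken from the text above is that quota is counted against the maximum replicas rather than the desired replicas.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Deployment:
    # Hypothetical record of a private deployment; field names are illustrative.
    name: str
    gpu_type: str
    gpus_per_replica: int
    desired_replicas: int
    max_replicas: int


def gpu_quota_consumed(deployments: List[Deployment], gpu_type: str) -> int:
    """Sum the GPUs counted against the quota for one GPU type.

    Quota is counted on max_replicas, not desired_replicas, so a deployment
    that can scale up always stays within the limit.
    """
    return sum(
        d.gpus_per_replica * d.max_replicas
        for d in deployments
        if d.gpu_type == gpu_type
    )


deployments = [
    Deployment("chat-api", "H100", 1, desired_replicas=1, max_replicas=3),
    Deployment("embedder", "L40", 1, desired_replicas=1, max_replicas=1),
]

h100_consumed = gpu_quota_consumed(deployments, "H100")
l40_consumed = gpu_quota_consumed(deployments, "L40")
```

In this sketch the `chat-api` deployment runs a single replica but still counts as three H100 GPUs, because it is allowed to scale to three replicas.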

Default Quotas

By default, every new organisation receives:
  • 1 × H100 GPU
  • 1 × L40 GPU

Automatic Quota Release

Quotas are automatically released once:
  • A training or compilation job completes successfully, or
  • A job fails and is no longer active.
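The release behaviour above can be modelled as a small ledger. This is a sketch under stated assumptions, not the platform's implementation: the `QuotaLedger` class, its method names, and the `"completed"` and `"failed"` status strings are all hypothetical.

```python
class QuotaLedger:
    """Toy model of GPU quota being held by active jobs and released afterwards."""

    # Hypothetical status names for jobs that are no longer active.
    RELEASING_STATUSES = {"completed", "failed"}

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.held = {}  # job_id -> number of GPUs currently held

    def used(self) -> int:
        return sum(self.held.values())

    def start(self, job_id: str, gpus: int) -> None:
        # Refuse to start a job that would push usage past the limit.
        if self.used() + gpus > self.limit:
            raise RuntimeError(
                f"requested {gpus} GPU(s), only {self.limit - self.used()} available"
            )
        self.held[job_id] = gpus

    def finish(self, job_id: str, status: str) -> None:
        # Quota is released whether the job succeeded or failed.
        if status in self.RELEASING_STATUSES:
            self.held.pop(job_id, None)


ledger = QuotaLedger(limit=1)
ledger.start("train-1", gpus=1)
used_while_running = ledger.used()
ledger.finish("train-1", status="failed")
used_after_failure = ledger.used()
ledger.start("compile-1", gpus=1)  # succeeds because the quota was released
```

A failed job frees its GPUs in the same way as a successful one, so the next job can start without manual cleanup.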

Global and Region-Specific Quotas

By default, quotas are configured at a global level and apply across all regions and workloads. These limits cap the total amount of resources your organisation can use, regardless of where deployments are running. The platform also supports more granular quotas (such as region-specific or job-specific quotas), but these are not enabled by default. In certain cases, custom quotas may be assigned for a specific region and/or job based on requirements.
If you want to deploy models in a specific region, please reach out to us at support@simplismart.ai.
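One way to picture the relationship between global and region-specific quotas is a lookup with a fallback. This is only a sketch: the dictionaries, the `region-a` name, and the rule that a custom regional quota takes precedence over the global one are assumptions made for illustration, since the text above does not specify how the two interact.

```python
from typing import Optional

# Global defaults for a new organisation, as listed under Default Quotas.
GLOBAL_QUOTA = {"H100": 1, "L40": 1}

# Hypothetical custom quota assigned for one region (not enabled by default).
REGION_QUOTA = {("H100", "region-a"): 2}


def effective_limit(gpu_type: str, region: Optional[str] = None) -> int:
    """Return the limit that applies to a GPU type, optionally in a region.

    Assumption: a custom regional quota, when present, takes precedence;
    otherwise the global quota applies.
    """
    if region is not None and (gpu_type, region) in REGION_QUOTA:
        return REGION_QUOTA[(gpu_type, region)]
    return GLOBAL_QUOTA.get(gpu_type, 0)


global_h100 = effective_limit("H100")
regional_h100 = effective_limit("H100", "region-a")
regional_l40 = effective_limit("L40", "region-a")
```

With no custom entry for a region, the lookup falls back to the global limit, which mirrors the default configuration described above.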

Other Quotas

Depending on the platform configuration, additional quotas may apply to:
  • Training jobs
  • Compilation tasks
  • Deployment replicas
  • Other compute resources

Managing Low Quota Issues

If a workload cannot start due to quota limits, you may see errors indicating that the requested resources exceed the available quota. To resolve this:
  1. Review your current resource usage.
  2. Stop unused deployments or workloads.
  3. Reduce the requested compute configuration.
  4. Request a quota increase if additional capacity is required.
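The steps above amount to a simple decision that can be sketched as a pre-flight check. The function name, its parameters, and the returned labels are hypothetical; the platform's actual error messages and usage reporting may differ.

```python
def check_request(limit: int, in_use: int, requested: int) -> str:
    """Classify a GPU request against a quota.

    Returns one of three illustrative labels:
      "ok"               - the request fits in the currently available quota
      "free_capacity"    - it fits within the limit once other usage is stopped
                           or the requested configuration is reduced
      "request_increase" - it exceeds the limit itself, so either reduce the
                           request or ask for a higher quota
    """
    available = limit - in_use
    if requested <= available:
        return "ok"
    if requested <= limit:
        return "free_capacity"
    return "request_increase"


fits = check_request(limit=1, in_use=0, requested=1)
blocked_by_usage = check_request(limit=2, in_use=1, requested=2)
over_limit = check_request(limit=1, in_use=0, requested=2)
```

Running a check like this before submitting a workload shows whether stopping unused deployments is enough or whether a quota increase is needed.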

Requesting a Quota Increase

If your organisation requires higher resource limits, you can request a quota increase. Quota increases are typically requested when:
  • Scaling production deployments.
  • Running larger training jobs.
  • Expanding workloads across multiple regions.
To request a quota increase, contact the Simplismart team with details of the resource type, region, and expected usage requirements.