The Edit Deployment feature allows you to adjust parameters such as scaling, models, or tags without needing to redeploy from scratch.
Edits are applied as rolling updates to minimize downtime.

Accessing the Edit Feature

Any successfully deployed model can be edited, with the exception of warmpool deployments (see Warmpool Deployment Limitations):
1. Navigate to Your Deployment

  1. Go to Deployments from the left sidebar
  2. Select the deployment you want to modify
  3. You’ll be taken to the deployment details view

2. Start Editing

Click the Edit button in the top-right corner of the deployment details page.

3. Apply Changes

Update the parameters, then click Apply Changes to implement them.

What Can Be Edited

Understanding which parameters are editable helps you plan deployment updates effectively.

Editable Parameters

The following parameters can be modified after deployment: scaling configuration, the deployed model, and tags.
All scaling configurations can be updated:
  • Scaling Range: Adjust minimum and maximum instance counts
  • Scaling Metrics: Add, remove, or modify scaling triggers
  • Threshold Values: Change the values that trigger auto-scaling
Update scaling parameters based on observed traffic patterns to optimize performance and costs.
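Before submitting a scaling edit, it can help to sanity-check the values locally. The sketch below is illustrative only: `ScalingConfig`, `validate_scaling`, the metric name, and the specific rules (non-negative minimum, positive thresholds) are assumptions made for this example, not the platform's actual schema or validation logic.

```python
from dataclasses import dataclass, field


@dataclass
class ScalingConfig:
    """Hypothetical container for the scaling settings described above."""
    min_instances: int
    max_instances: int
    # Maps a scaling metric name to the threshold that triggers auto-scaling.
    triggers: dict = field(default_factory=dict)


def validate_scaling(config: ScalingConfig) -> list:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    if config.min_instances < 0:
        problems.append("min_instances must not be negative")
    if config.max_instances < config.min_instances:
        problems.append("max_instances must be >= min_instances")
    for metric, threshold in config.triggers.items():
        # Assumed rule: a threshold of zero or less would never be meaningful.
        if threshold <= 0:
            problems.append(f"threshold for {metric!r} must be positive")
    return problems


# Example: a range of 1 to 4 instances with one (hypothetical) metric.
ok = validate_scaling(ScalingConfig(1, 4, {"gpu_utilization": 80.0}))
bad = validate_scaling(ScalingConfig(5, 2))
```

Here `ok` is an empty list, while `bad` reports that the maximum is below the minimum.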
You can swap the deployed model, subject to important constraints:
  • Can change: Different models of the same type
  • Cannot change: Model type (e.g., LLM to STT)
Example:
  • ✅ Swap Llama 3.1 8B with Llama 3.1 70B (both LLMs)
  • ❌ Swap Llama 3.1 8B with Whisper V3 (different types)
Changing models may require adjustments to your application code if input/output formats differ.
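The same-type rule can be expressed as a simple lookup. This is a minimal sketch: the `MODEL_TYPES` catalog and the `can_swap` helper are hypothetical names for illustration, populated only with the models mentioned in the example above.

```python
# Hypothetical catalog mapping model names to their model type.
MODEL_TYPES = {
    "Llama 3.1 8B": "LLM",
    "Llama 3.1 70B": "LLM",
    "Whisper V3": "STT",
}


def can_swap(current_model: str, new_model: str) -> bool:
    """A swap is allowed only when both models share the same type."""
    return MODEL_TYPES[current_model] == MODEL_TYPES[new_model]


same_type = can_swap("Llama 3.1 8B", "Llama 3.1 70B")   # both LLMs
cross_type = can_swap("Llama 3.1 8B", "Whisper V3")     # LLM vs. STT
```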
Tags can be freely added, modified, or removed:
  • Add new key-value pairs for organization
  • Update existing tag values
  • Remove obsolete tags
Common tag use cases:
  • Environment identification (env: production)
  • Version tracking (version: v2.1)
  • Cost allocation (team: ml-engineering)
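Since tags are key-value pairs, the three operations (add, update, remove) behave like dictionary edits. The sketch below assumes tags can be modeled as a plain Python dict; `edit_tags` is a hypothetical helper, not a platform API.

```python
def edit_tags(tags: dict, updates: dict = None, remove: tuple = ()) -> dict:
    """Return a new tag set with updates applied and obsolete keys removed."""
    result = dict(tags)            # copy so the original tags are untouched
    result.update(updates or {})   # adds new keys and overwrites existing ones
    for key in remove:
        result.pop(key, None)      # ignore keys that are not present
    return result


current = {"env": "staging", "version": "v2.0", "owner": "alice"}
edited = edit_tags(
    current,
    updates={"env": "production", "version": "v2.1", "team": "ml-engineering"},
    remove=("owner",),
)
```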

Non-Editable Parameters

The following parameters are locked after deployment creation and cannot be changed:
  • Deployment Name: The unique identifier for your deployment
  • Cloud / Cluster: The infrastructure where the deployment runs
  • Processing Type: Sync or Async processing mode
If you need to change non-editable parameters, you’ll need to create a new deployment with the desired configuration.
Warmpool Deployment Limitations:
  • Warmpool deployments cannot be edited at all
  • Regular deployments cannot be converted into warmpool deployments
  • To modify a warmpool deployment, create a new one with the desired configuration
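The locked-parameter and warmpool rules above can be checked before an edit is attempted. The following is a sketch under assumptions: the field names (`name`, `cluster`, `processing_type`, `warmpool`) and the `check_edit` function are invented for illustration and do not reflect the platform's real data model.

```python
# Hypothetical field names for the parameters that are locked after creation.
LOCKED_FIELDS = {"name", "cluster", "processing_type"}


def check_edit(current: dict, requested: dict) -> list:
    """Return the reasons an edit would be rejected; empty means allowed."""
    if current.get("warmpool"):
        return ["warmpool deployments cannot be edited"]
    errors = []
    if requested.get("warmpool"):
        errors.append("regular deployments cannot be converted to warmpool")
    for name in sorted(LOCKED_FIELDS):
        # Only flag a locked field if the request actually changes its value.
        if name in requested and requested[name] != current.get(name):
            errors.append(f"{name} is locked after creation")
    return errors


deployment = {
    "name": "chat-prod",
    "cluster": "cluster-a",
    "processing_type": "sync",
    "warmpool": False,
}
allowed = check_edit(deployment, {"max_instances": 4})
rejected = check_edit(deployment, {"processing_type": "async"})
```

When `check_edit` returns any reasons, the edit would require creating a new deployment instead.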

Update Process

Deployment edits are applied using a rolling update strategy to minimize downtime:
1. Validation

The system validates your changes before applying them.

2. Gradual Rollout

The new configuration is deployed incrementally across instances.

3. Health Checks

Each updated instance is health-checked before the rollout proceeds.

4. Completion

Once all instances are updated, the update is complete.
Expected Behavior During Updates:
  • A small number of requests may be dropped during the rolling update
  • Avoid major updates (such as changing the model or GPU configuration) during high-traffic hours to prevent disruption
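To make the rolling update strategy concrete, here is a toy simulation of the gradual rollout and health check steps. It is a sketch, not the platform's implementation: `rolling_update`, the `batch_size` parameter, and the in-memory dict of instances are all assumptions for illustration.

```python
from typing import Callable, Dict, List


def rolling_update(
    instances: Dict[str, str],
    new_config: str,
    is_healthy: Callable[[str], bool],
    batch_size: int = 1,
) -> List[str]:
    """Update instances batch by batch, health-checking each batch.

    Returns the ids of instances that were updated and passed their check.
    Raises RuntimeError as soon as an updated instance is unhealthy, so the
    remaining instances keep their old configuration.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ids = list(instances)
    updated = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        for instance_id in batch:
            instances[instance_id] = new_config   # gradual rollout
        for instance_id in batch:
            if not is_healthy(instance_id):       # health check gate
                raise RuntimeError(f"health check failed on {instance_id}")
        updated.extend(batch)
    return updated


fleet = {"a": "v1", "b": "v1", "c": "v1"}
done = rolling_update(fleet, "v2", is_healthy=lambda instance_id: True)
```

Because only one batch is in flight at a time, the rest of the fleet keeps serving traffic, which is why only a small number of requests are affected.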

Automatic Rollback

The platform includes built-in safety mechanisms:
  • Automatic rollback: If an edit fails, the system automatically reverts to the previous working version
  • Manual rollback: Not currently supported after successful edits
  • Health monitoring: Continuous checks ensure deployment stability
Monitor your deployment’s health metrics during and after edits to quickly identify any issues.
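The automatic rollback behavior follows a snapshot-and-restore pattern, sketched below. This is an illustration under assumptions: `apply_with_rollback` and the dict-based deployment are hypothetical, and the platform's real mechanism may differ.

```python
import copy
from typing import Callable


def apply_with_rollback(
    deployment: dict,
    changes: dict,
    is_healthy: Callable[[dict], bool],
) -> bool:
    """Apply changes in place; restore the previous version if unhealthy.

    Returns True if the edit was kept, False if it was rolled back.
    """
    snapshot = copy.deepcopy(deployment)   # previous working version
    deployment.update(changes)
    if is_healthy(deployment):
        return True
    # Edit failed: revert to the snapshot taken before the change.
    deployment.clear()
    deployment.update(snapshot)
    return False


deployment = {"model": "Llama 3.1 8B", "max_instances": 2}
kept = apply_with_rollback(
    deployment,
    {"max_instances": 0},
    is_healthy=lambda d: d["max_instances"] >= 1,  # toy health rule
)
```

In this example the edit fails the toy health rule, so `kept` is `False` and `deployment` is back to its original values. Note that this models only the automatic path; as stated above, a successful edit cannot be manually rolled back.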

Troubleshooting

Common issues and solutions when editing deployments:
Issue: Edit Failed due to Validation Error
Cause: Invalid configuration or incompatible parameters
Solution:
  • Review error message for specific issues
  • Verify model compatibility
  • Check scaling parameter ranges

Issue: Edit Failed due to Resource Unavailable Error
Cause: Requested resources (GPUs) not available
Solution:
  • Choose a different accelerator type
  • Reduce instance count
  • Try again during off-peak hours
  • Contact support for resource availability

Issue: Deployment Unstable After Edit
Cause: New configuration causing issues
Solution:
  • System should auto-rollback if health checks fail
  • If not, create a new deployment with previous configuration
  • Review deployment logs for error details
  • Contact support if issues persist