Steps for Using the Deployed Model and Monitoring Its Performance

Once the model is successfully deployed, you can follow these steps to begin inference:
  • Go to the API tab of your model
  • Find the Endpoint URL and the pre-generated inference script
  • Copy the script, replace placeholder values, and execute it to call the model
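The steps above can be sketched as a minimal Python script. The endpoint URL, payload shape, and Bearer-token auth scheme here are assumptions for illustration; replace them with the Endpoint URL and pre-generated script shown in your API tab.

```python
import json
import os
import urllib.request

# Hypothetical placeholders -- substitute your deployment's Endpoint URL
# and the API key from Account Settings > API Key.
ENDPOINT_URL = "https://your-deployment.example.com/infer"  # placeholder
API_KEY = os.environ.get("API_KEY", "<your-api-key>")


def build_request(url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Assemble an authenticated POST request for the inference endpoint."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # auth scheme is an assumption
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Payload shape depends on your model; {"inputs": ...} is illustrative.
    req = build_request(ENDPOINT_URL, API_KEY, {"inputs": "sample input"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        print(json.loads(resp.read()))
```

Keeping the API key in an environment variable rather than hard-coding it avoids accidentally committing credentials.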
How do I access my API Key? Go to Account Settings and select API Key.
The Monitor tab provides an overview of your deployment’s performance.
  1. Monitor Real-Time Status:
  • Pod Info: Status and count of active pods
  • Throughput & Latency: Requests per second and processing time
  • Success & Failure Rates: Percentage of successful and failed inferences
  2. Resource Monitoring: Track system-level metrics and system load information, such as CPU/GPU usage and request metrics.
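To make the request metrics above concrete, here is a minimal client-side sketch of how success rate and average latency can be derived from recorded calls. This is purely illustrative; it is not how the Monitor tab computes its figures, and the `InferenceMetrics` class is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class InferenceMetrics:
    """Illustrative client-side tracker for per-request inference metrics."""
    latencies: list = field(default_factory=list)  # seconds per successful call
    failures: int = 0

    def record(self, latency_s: float, ok: bool) -> None:
        """Record one inference call: its latency if it succeeded, else a failure."""
        if ok:
            self.latencies.append(latency_s)
        else:
            self.failures += 1

    @property
    def success_rate(self) -> float:
        """Fraction of calls that succeeded (0.0 when nothing is recorded)."""
        total = len(self.latencies) + self.failures
        return len(self.latencies) / total if total else 0.0

    @property
    def avg_latency(self) -> float:
        """Mean latency over successful calls (0.0 when none succeeded)."""
        return sum(self.latencies) / len(self.latencies) if self.latencies else 0.0
```

Recording each call's outcome this way gives you an independent cross-check against the dashboard's success/failure and latency panels.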