Inference & Monitoring
Steps for Using the Deployed Model and Monitoring its Performance
Once the model is successfully deployed, you can follow these steps to begin inference:

1. Go to the API tab of your model.
2. Find the Endpoint URL and the pre-generated inference script.
3. Copy the script, replace the placeholder values, and execute it to call the model (a minimal sketch of such a script is shown below).
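The exact script is generated for your deployment in the API tab. The following is only a minimal sketch, assuming an OpenAI-compatible chat completions endpoint; the endpoint URL, model name, and API key below are placeholders that you would replace with the values shown for your deployment.

```python
import os
import requests

# Placeholder values: replace with the Endpoint URL and model name shown in
# the API tab of your deployment, and with your own API key.
ENDPOINT_URL = "https://<your-endpoint>/v1/chat/completions"  # hypothetical URL
API_KEY = os.environ.get("API_KEY", "<your-api-key>")
MODEL_NAME = "<your-model-name>"

def run_inference(prompt: str) -> str:
    """Send a single chat completion request to the deployed model."""
    response = requests.post(
        ENDPOINT_URL,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        json={
            "model": MODEL_NAME,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 256,
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(run_inference("Summarise what you can do in one sentence."))
```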
How to access your API Key?

Go to Account Settings and select API Key.
The Monitor tab provides an overview of your deployment’s performance:

1. Monitor Real-Time Status (a rough client-side check of these numbers is sketched after this list):
   - Pod Info: status and count of active pods
   - Throughput & Latency: requests per second and processing time
   - Success & Failure Rates: percentage of successful and failed inferences
2. Resource Monitoring: various system-level metrics, such as CPU/GPU usage and request metrics, can be tracked along with system load information.
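The Monitor tab reports these metrics on the platform side. If you want a rough client-side view of throughput, latency, and success rate for comparison, the following is a minimal sketch that reuses the hypothetical run_inference() helper from the inference-script sketch above.

```python
import time

# Assumes the run_inference() helper from the inference-script sketch above
# is defined in the same file (or imported from wherever you saved it).
def measure(prompts):
    """Print rough client-side throughput, latency, and success-rate figures."""
    latencies, failures = [], 0
    start = time.perf_counter()
    for prompt in prompts:
        t0 = time.perf_counter()
        try:
            run_inference(prompt)
        except Exception:
            failures += 1
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    total = len(prompts)
    print(f"Requests per second: {total / elapsed:.2f}")
    print(f"Average latency:     {sum(latencies) / total:.2f}s")
    print(f"Success rate:        {100 * (total - failures) / total:.1f}%")

measure(["Ping"] * 5)
```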