
Supported Metrics

The Simplismart metrics endpoint exposes metrics in Prometheus format covering Kubernetes infrastructure health, GPU utilization, inference engine performance, and request lifecycle.
When querying a metric, you only need to filter by one of the listed labels.
Deployment namespace and deployment slug refer to the same value. Learn how to find it here.

Kubernetes Deployment Metrics

Track deployment health and replica status.

kube_deployment_spec_replicas

The number of desired replicas for a deployment, as specified in the deployment spec.
Type: gauge
Labels:
  • namespace: Deployment namespace (deployment slug).
  • deployment: Deployment name.
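The desired replica count is most useful when compared against what is actually available. A sketch query, using a placeholder deployment slug:

```promql
# Gap between desired and available replicas (0 when the deployment is fully healthy)
kube_deployment_spec_replicas{namespace="your-namespace"}
  - kube_deployment_status_replicas_available{namespace="your-namespace"}
```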

kube_deployment_status_replicas

The number of observed replicas for a deployment.
Type: gauge
Labels:
  • namespace: Deployment namespace.
  • deployment: Deployment name.

kube_deployment_status_replicas_available

Number of replicas that are available (ready for at least minReadySeconds).
Type: gauge
Labels:
  • namespace: Deployment namespace.
  • deployment: Deployment name.

kube_deployment_status_replicas_ready

Number of replicas that have passed their readiness probes.
Type: gauge
Labels:
  • namespace: Deployment namespace.
  • deployment: Deployment name.
For inference servers, readiness means “model is loaded and accepting requests.” A container running but not ready consumes resources without serving traffic.
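To watch readiness directly, a minimal query sketch (the namespace value is a placeholder):

```promql
# Replicas currently able to serve inference traffic
kube_deployment_status_replicas_ready{namespace="your-namespace"}
```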

kube_deployment_status_replicas_unavailable

Number of replicas that are not yet available.
Type: gauge
Labels:
  • namespace: Deployment namespace.
  • deployment: Deployment name.
A non-zero value means capacity is degraded — pods may be crashing, stuck in image pull, or failing health checks.
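A sustained non-zero value is a natural alerting condition. A sketch, with a placeholder namespace:

```promql
# True while any replica is unavailable; pair with a `for:` clause in an alert rule
kube_deployment_status_replicas_unavailable{namespace="your-namespace"} > 0
```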

Kubernetes Pod Metrics

Pod-level visibility into lifecycle and health.

kube_pod_info

Pod metadata, including the node, IP address, and phase.
Type: gauge
Labels:
  • namespace: Pod namespace.
  • pod: Pod name.
  • node: Node where the pod is scheduled.
  • pod_ip: Pod IP address.

kube_pod_status_phase

Current phase of the pod.
Type: gauge
Labels:
  • namespace: Pod namespace.
  • pod: Pod name.
  • phase: Pod phase (Pending, Running, Succeeded, Failed, Unknown).
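Because the metric emits one series per phase, with value 1 for the active phase, you can list pods stuck outside Running. Sketch, with a placeholder namespace:

```promql
# Pods that are Pending, Failed, or Unknown
kube_pod_status_phase{namespace="your-namespace", phase=~"Pending|Failed|Unknown"} == 1
```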

kube_pod_container_status_ready

Whether the container is ready (1) or not (0).
Type: gauge
Labels:
  • namespace: Pod namespace.
  • pod: Pod name.
  • container: Container name.

kube_pod_container_status_running

Whether the container is running (1) or not (0).
Type: gauge
Labels:
  • namespace: Pod namespace.
  • pod: Pod name.
  • container: Container name.

kube_pod_container_status_restarts_total

Cumulative count of container restarts.
Type: counter
Labels:
  • namespace: Pod namespace.
  • pod: Pod name.
  • container: Container name.
Frequent restarts indicate OOM kills, crash loops, or probe failures. For GPU inference pods, restarts are expensive because model loading can take minutes.
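Since this is a counter, query it with increase() over a window rather than reading the raw value. Sketch, with a placeholder namespace:

```promql
# Containers that restarted in the last hour
increase(kube_pod_container_status_restarts_total{namespace="your-namespace"}[1h]) > 0
```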

kube_pod_container_status_waiting_reason

The reason the container is in the waiting state (e.g., ContainerCreating, CrashLoopBackOff, ErrImagePull).
Type: gauge
Labels:
  • namespace: Pod namespace.
  • pod: Pod name.
  • container: Container name.
  • reason: Reason for the waiting state.
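A common use is surfacing crash loops. Sketch, with a placeholder namespace:

```promql
# Containers currently stuck in CrashLoopBackOff
kube_pod_container_status_waiting_reason{namespace="your-namespace", reason="CrashLoopBackOff"} == 1
```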

Container Resource Metrics

Track actual resource consumption.

container_cpu_usage_seconds_total

Cumulative CPU time consumed by the container, in core-seconds.
Type: counter
Labels:
  • namespace: Pod namespace.
  • pod: Pod name.
  • container: Container name.
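Because the metric counts core-seconds, its rate() gives average core usage. Sketch; the container!="" filter drops the pod-level aggregate series that cAdvisor typically also emits:

```promql
# Average CPU cores consumed per container over the last 5 minutes
rate(container_cpu_usage_seconds_total{namespace="your-namespace", container!=""}[5m])
```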

container_memory_working_set_bytes

Current working set memory of the container, in bytes. This is the value the OOM killer uses for eviction decisions.
Type: gauge
Labels:
  • namespace: Pod namespace.
  • pod: Pod name.
  • container: Container name.
When this approaches the container’s memory limit, OOM kills become imminent.
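To quantify OOM headroom, compare working set against the configured limit. A sketch that joins the two metrics on their shared labels (their full label sets differ, since they come from different exporters):

```promql
# Fraction of the memory limit currently in use; values approaching 1.0 risk OOM kills
container_memory_working_set_bytes{namespace="your-namespace", container!=""}
  / on(namespace, pod, container)
    kube_pod_container_resource_limits{namespace="your-namespace", resource="memory"}
```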

kube_pod_container_resource_requests

Resource requests configured for the container.
Type: gauge
Labels:
  • namespace: Pod namespace.
  • pod: Pod name.
  • container: Container name.
  • resource: Resource type (cpu, memory, nvidia_com_gpu).
  • unit: Unit of measurement (core, byte, etc.).

kube_pod_container_resource_limits

Resource limits configured for the container.
Type: gauge
Labels:
  • namespace: Pod namespace.
  • pod: Pod name.
  • container: Container name.
  • resource: Resource type (cpu, memory, nvidia_com_gpu).
  • unit: Unit of measurement (core, byte, etc.).
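Requests can be summed to see how much of a resource a namespace has reserved. Sketch, with a placeholder namespace:

```promql
# Total GPUs requested by all containers in the namespace
sum(kube_pod_container_resource_requests{namespace="your-namespace", resource="nvidia_com_gpu"})
```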

GPU Metrics

GPU metrics for Simplismart’s GPU-accelerated inference workloads.

DCGM_FI_DEV_GPU_UTIL

GPU utilization as a percentage (0–100). Measures what fraction of time the GPU’s streaming multiprocessors are active.
Type: gauge
Labels:
  • gpu: GPU index.
  • UUID: GPU UUID.
  • device: Device identifier.
  • modelName: GPU model name (e.g., “NVIDIA H100”).
  • Hostname: Host name.
  • exported_namespace: Deployment namespace.
  • pod: Pod name.
Example:
DCGM_FI_DEV_GPU_UTIL{Hostname="simplismart-dell-001",exported_namespace="your-namespace",pod="nvidia-dcgm-exporter-xyz"}
Low utilization often points to inefficient batching or request starvation; sustained high utilization (>90%) indicates the GPU is at capacity.
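A fleet-level view averages utilization across all GPUs. Sketch, with a placeholder namespace:

```promql
# Average GPU utilization across all GPUs serving a deployment
avg(DCGM_FI_DEV_GPU_UTIL{exported_namespace="your-namespace"})
```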

DCGM_FI_DEV_FB_USED

Frame buffer (GPU VRAM) memory used, in MiB.
Type: gauge
Labels:
  • gpu: GPU index.
  • Hostname: Host name.
  • exported_namespace: Deployment namespace.
  • pod: Pod name.
Example:
DCGM_FI_DEV_FB_USED{Hostname="simplismart-dell-001",exported_namespace="your-namespace",pod="nvidia-dcgm-exporter-xyz"}
GPU memory is the primary constraint for model serving. When VRAM is exhausted, inference requests fail with OOM errors.
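VRAM usage can be expressed as a percentage by combining this metric with DCGM_FI_DEV_FB_FREE, assuming both series carry identical label sets so vector matching succeeds:

```promql
# Percentage of GPU memory in use
100 * DCGM_FI_DEV_FB_USED
  / (DCGM_FI_DEV_FB_USED + DCGM_FI_DEV_FB_FREE)
```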

DCGM_FI_DEV_FB_FREE

Frame buffer (GPU VRAM) memory free, in MiB.
Type: gauge
Labels:
  • Hostname: Host name.
  • exported_namespace: Deployment namespace.
  • pod: Pod name.
Example:
DCGM_FI_DEV_FB_FREE{Hostname="simplismart-dell-001",exported_namespace="your-namespace",pod="nvidia-dcgm-exporter-xyz"}

Ingress Metrics

External traffic entering the Simplismart platform.

nginx_ingress_controller_requests

Total number of requests handled by the NGINX ingress controller.
Type: counter
Labels:
  • ingress: Ingress name.
  • exported_namespace: Namespace.
  • service: Service name.
  • status: HTTP status code.
  • method: HTTP method (GET, POST, etc.).
  • path: Request path.
Available metrics:
  • nginx_ingress_controller_requests - Total request count
  • nginx_ingress_controller_request_duration_seconds_bucket - Request duration histogram buckets
  • nginx_ingress_controller_request_duration_seconds_sum - Total request duration sum
  • nginx_ingress_controller_request_duration_seconds_count - Total request count for duration
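The status label on the request counter supports error-rate calculations. A sketch, filtering by a hypothetical service name:

```promql
# Fraction of requests returning 5xx over the last 5 minutes
sum(rate(nginx_ingress_controller_requests{service="your-service", status=~"5.."}[5m]))
  / sum(rate(nginx_ingress_controller_requests{service="your-service"}[5m]))
```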

nginx_ingress_controller_request_duration_seconds

End-to-end request duration as observed by the ingress controller.
Type: histogram (exposes _bucket, _sum, and _count time series)
Labels:
  • ingress: Ingress name.
  • exported_namespace: Namespace.
  • service: Service name.
  • status: HTTP status code.
  • method: HTTP method (GET, POST, etc.).
  • path: Request path.
Example query (p99):
histogram_quantile(0.99, rate(nginx_ingress_controller_request_duration_seconds_bucket[5m]))
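A per-service breakdown stays statistically correct by summing buckets before taking the quantile. Sketch:

```promql
# p95 latency per backend service
histogram_quantile(0.95,
  sum by (le, service) (rate(nginx_ingress_controller_request_duration_seconds_bucket[5m])))
```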