# Supported Metrics

> Reference for all metrics available from the Simplismart metrics endpoint

The Simplismart metrics endpoint exposes metrics in Prometheus format covering Kubernetes infrastructure health, GPU utilization, inference engine performance, and request lifecycle.

<Note>
  For each metric, you only need to provide one of the listed labels in your query.
</Note>

<Note>
  Deployment namespace and deployment slug refer to the same value. Learn how to find it [here](/images/observability/overview/1-find-slug.png).
</Note>

## Kubernetes Deployment Metrics

Track deployment health and replica status.

### `kube_deployment_spec_replicas`

The number of desired replicas for a deployment, as specified in the deployment spec.

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Deployment namespace (deployment slug).
</ParamField>

<ParamField query="deployment" type="label">
  Deployment name.
</ParamField>

***

### `kube_deployment_status_replicas`

The number of replicas currently observed for a deployment, counting non-terminated pods whether or not they are ready.

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Deployment namespace.
</ParamField>

<ParamField query="deployment" type="label">
  Deployment name.
</ParamField>

***

### `kube_deployment_status_replicas_available`

Number of replicas that are available, meaning ready for at least `minReadySeconds`.

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Deployment namespace.
</ParamField>

<ParamField query="deployment" type="label">
  Deployment name.
</ParamField>

***

### `kube_deployment_status_replicas_ready`

Number of replicas that have passed their readiness probes.

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Deployment namespace.
</ParamField>

<ParamField query="deployment" type="label">
  Deployment name.
</ParamField>

<Info>
  For inference servers, readiness means "model is loaded and accepting requests." A container running but not ready consumes resources without serving traffic.
</Info>

***

### `kube_deployment_status_replicas_unavailable`

Number of replicas that are not yet available.

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Deployment namespace.
</ParamField>

<ParamField query="deployment" type="label">
  Deployment name.
</ParamField>

<Warning>
  A non-zero value means capacity is degraded: pods may be crashing, stuck in image pull, or failing health checks.
</Warning>
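
The deployment gauges above can be combined to show how many replicas are still missing. The following is a minimal sketch rather than a recommended alert; `your-namespace` is a placeholder for your deployment slug.

```promql
# Replicas still missing per deployment: desired minus available.
# A healthy deployment yields 0.
kube_deployment_spec_replicas{namespace="your-namespace"}
  - kube_deployment_status_replicas_available{namespace="your-namespace"}
```

Both sides carry the same `namespace` and `deployment` labels, so the subtraction matches series one to one without extra `on(...)` clauses.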

***

## Kubernetes Pod Metrics

Pod-level visibility into lifecycle and health.

### `kube_pod_info`

Pod metadata, including the node the pod is scheduled on and its IP address. The value is always 1; the information is carried in the labels.

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Pod namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

<ParamField query="node" type="label">
  Node name where the pod is scheduled.
</ParamField>

<ParamField query="pod_ip" type="label">
  Pod IP address.
</ParamField>

***

### `kube_pod_status_phase`

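Current phase of the pod. One series is exposed per phase (Pending, Running, Succeeded, Failed, Unknown); the series for the pod's current phase has the value 1 and the others 0.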

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Pod namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

<ParamField query="phase" type="label">
  Pod phase (Pending, Running, Succeeded, Failed, Unknown).
</ParamField>
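
Because each pod reports 1 for its current phase, summing by the `phase` label gives a pod count per phase. A short sketch, assuming `your-namespace` stands in for your deployment slug:

```promql
# Number of pods in each phase within one namespace.
sum by (phase) (kube_pod_status_phase{namespace="your-namespace"})
```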

***

### `kube_pod_container_status_ready`

Whether the container is ready (1) or not (0).

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Pod namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

<ParamField query="container" type="label">
  Container name.
</ParamField>

***

### `kube_pod_container_status_running`

Whether the container is running (1) or not (0).

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Pod namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

<ParamField query="container" type="label">
  Container name.
</ParamField>

***

### `kube_pod_container_status_restarts_total`

Cumulative count of container restarts.

Type: `counter`

Labels:

<ParamField query="namespace" type="label">
  Pod namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

<ParamField query="container" type="label">
  Container name.
</ParamField>

<Tip>
  Frequent restarts indicate OOM kills, crash loops, or probe failures. For GPU inference pods, restarts are expensive because model loading can take minutes.
</Tip>
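
Since this metric is a counter, it is usually read through `increase()` over a time window rather than as a raw value. The sketch below lists containers that restarted at least once in the last hour; the one-hour window and the `your-namespace` placeholder are illustrative choices.

```promql
# Containers with one or more restarts during the past hour.
increase(kube_pod_container_status_restarts_total{namespace="your-namespace"}[1h]) > 0
```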

***

### `kube_pod_container_status_waiting_reason`

The reason the container is in waiting state (e.g., ContainerCreating, CrashLoopBackOff, ErrImagePull).

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Pod namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

<ParamField query="container" type="label">
  Container name.
</ParamField>

<ParamField query="reason" type="label">
  Reason for waiting state.
</ParamField>
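
One way to surface stuck containers is to filter on the `reason` label. The reasons listed here are common examples, not an exhaustive set, and `your-namespace` is a placeholder.

```promql
# Containers currently waiting because of a crash loop or an image pull problem.
kube_pod_container_status_waiting_reason{
  namespace="your-namespace",
  reason=~"CrashLoopBackOff|ErrImagePull|ImagePullBackOff"
} == 1
```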

***

## Container Resource Metrics

Track actual resource consumption.

### `container_cpu_usage_seconds_total`

Cumulative CPU time consumed by the container, in core-seconds.

Type: `counter`

Labels:

<ParamField query="namespace" type="label">
  Pod namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

<ParamField query="container" type="label">
  Container name.
</ParamField>
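
Applying `rate()` to this counter yields the average number of CPU cores in use. The sketch below assumes the usual cAdvisor behavior of also exporting a pod-level series with an empty `container` label, which is excluded to avoid double counting; `your-namespace` is a placeholder.

```promql
# Average CPU cores used per pod over the last 5 minutes.
sum by (pod) (
  rate(container_cpu_usage_seconds_total{namespace="your-namespace", container!=""}[5m])
)
```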

***

### `container_memory_working_set_bytes`

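Current working set memory of the container, in bytes. This is the memory figure Kubernetes uses for eviction decisions, and it closely tracks what counts against the container's memory limit before an OOM kill.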

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Pod namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

<ParamField query="container" type="label">
  Container name.
</ParamField>

<Warning>
  When this approaches the container's memory limit, OOM kills become imminent.
</Warning>
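
To see how close a container is to its limit, the working set can be divided by the configured memory limit from `kube_pod_container_resource_limits`. This is a sketch under the assumption that both metrics share `namespace`, `pod`, and `container` labels; `your-namespace` is a placeholder.

```promql
# Fraction of the memory limit in use per container (1.0 means at the limit).
sum by (namespace, pod, container) (
  container_memory_working_set_bytes{namespace="your-namespace", container!=""}
)
/
sum by (namespace, pod, container) (
  kube_pod_container_resource_limits{namespace="your-namespace", resource="memory"}
)
```

Containers without a memory limit have no series on the right-hand side and therefore drop out of the result.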

***

### `kube_pod_container_resource_requests`

Resource requests configured for the container.

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Pod namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

<ParamField query="container" type="label">
  Container name.
</ParamField>

<ParamField query="resource" type="label">
  Resource type (`cpu`, `memory`, `nvidia_com_gpu`).
</ParamField>

<ParamField query="unit" type="label">
  Unit of measurement (core, byte, etc.).
</ParamField>

***

### `kube_pod_container_resource_limits`

Resource limits configured for the container.

Type: `gauge`

Labels:

<ParamField query="namespace" type="label">
  Pod namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

<ParamField query="container" type="label">
  Container name.
</ParamField>

<ParamField query="resource" type="label">
  Resource type (`cpu`, `memory`, `nvidia_com_gpu`).
</ParamField>

<ParamField query="unit" type="label">
  Unit of measurement (core, byte, etc.).
</ParamField>
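
The `resource` label makes it possible to isolate a single resource type. As an illustration, the following sketch totals the GPUs requested by each pod; `your-namespace` is a placeholder.

```promql
# Number of GPUs requested per pod.
sum by (pod) (
  kube_pod_container_resource_requests{namespace="your-namespace", resource="nvidia_com_gpu"}
)
```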

***

## GPU Metrics

GPU metrics for Simplismart's GPU-accelerated inference workloads.

### `DCGM_FI_DEV_GPU_UTIL`

GPU utilization as a percentage (0–100). Measures the fraction of time during which at least one kernel was executing on the GPU; it does not show how many streaming multiprocessors were busy.

Type: `gauge`

Labels:

<ParamField query="gpu" type="label">
  GPU index.
</ParamField>

<ParamField query="UUID" type="label">
  GPU UUID.
</ParamField>

<ParamField query="device" type="label">
  Device identifier.
</ParamField>

<ParamField query="modelName" type="label">
  GPU model name (e.g., "NVIDIA H100").
</ParamField>

<ParamField query="Hostname" type="label">
  Host name.
</ParamField>

<ParamField query="exported_namespace" type="label">
  Deployment namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

**Example:**

```promql theme={null}
DCGM_FI_DEV_GPU_UTIL{Hostname="simplismart-dell-001",exported_namespace="your-namespace",pod="nvidia-dcgm-exporter-xyz"}
```

<Tip>
  Low utilization while requests are being served often points to inefficient batching or a bottleneck elsewhere; sustained utilization above 90% indicates the GPU is at capacity.
</Tip>
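
Instantaneous utilization is noisy, so it can help to average it over a window before comparing it with a threshold. The 10-minute window and the 90% threshold below are illustrative assumptions, and `your-namespace` is a placeholder.

```promql
# GPUs whose average utilization over the last 10 minutes exceeds 90%.
avg by (Hostname, gpu) (
  avg_over_time(DCGM_FI_DEV_GPU_UTIL{exported_namespace="your-namespace"}[10m])
) > 90
```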

***

### `DCGM_FI_DEV_FB_USED`

Frame buffer (GPU VRAM) memory used, in MiB.

Type: `gauge`

Labels:

<ParamField query="gpu" type="label">
  GPU index.
</ParamField>

<ParamField query="Hostname" type="label">
  Host name.
</ParamField>

<ParamField query="exported_namespace" type="label">
  Deployment namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

**Example:**

```promql theme={null}
DCGM_FI_DEV_FB_USED{Hostname="simplismart-dell-001",exported_namespace="your-namespace",pod="nvidia-dcgm-exporter-xyz"}
```

<Info>
  GPU memory is the primary constraint for model serving. When VRAM is exhausted, inference requests fail with OOM errors.
</Info>

***

### `DCGM_FI_DEV_FB_FREE`

Frame buffer (GPU VRAM) memory free, in MiB.

Type: `gauge`

Labels:

<ParamField query="Hostname" type="label">
  Host name.
</ParamField>

<ParamField query="exported_namespace" type="label">
  Deployment namespace.
</ParamField>

<ParamField query="pod" type="label">
  Pod name.
</ParamField>

**Example:**

```promql theme={null}
DCGM_FI_DEV_FB_FREE{Hostname="simplismart-dell-001",exported_namespace="your-namespace",pod="nvidia-dcgm-exporter-xyz"}
```
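
The used and free gauges can be combined into an approximate VRAM usage percentage. This sketch treats used plus free as the total, which ignores any memory the driver reserves, so read the result as an estimate; `your-namespace` is a placeholder.

```promql
# Approximate percentage of GPU memory in use per GPU.
100 * DCGM_FI_DEV_FB_USED{exported_namespace="your-namespace"}
  / (
    DCGM_FI_DEV_FB_USED{exported_namespace="your-namespace"}
    + DCGM_FI_DEV_FB_FREE{exported_namespace="your-namespace"}
  )
```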

***

## Ingress Metrics

External traffic entering the Simplismart platform.

### `nginx_ingress_controller_requests`

Total number of requests handled by the NGINX ingress controller.

Type: `counter`

Labels:

<ParamField query="ingress" type="label">
  Ingress name.
</ParamField>

<ParamField query="exported_namespace" type="label">
  Namespace.
</ParamField>

<ParamField query="service" type="label">
  Service name.
</ParamField>

<ParamField query="status" type="label">
  HTTP status code.
</ParamField>

<ParamField query="method" type="label">
  HTTP method (GET, POST, etc.).
</ParamField>

<ParamField query="path" type="label">
  Request path.
</ParamField>

**Available metrics:**

* `nginx_ingress_controller_requests` - Total request count
* `nginx_ingress_controller_request_duration_seconds_bucket` - Request duration histogram buckets
* `nginx_ingress_controller_request_duration_seconds_sum` - Sum of observed request durations, in seconds
* `nginx_ingress_controller_request_duration_seconds_count` - Number of requests observed by the duration histogram
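
A common use of the request counter is an error ratio built from the `status` label. The sketch below computes the share of 5xx responses over the last 5 minutes; the window and the `your-namespace` placeholder are illustrative.

```promql
# Fraction of requests that returned a 5xx status in the last 5 minutes.
sum(rate(nginx_ingress_controller_requests{exported_namespace="your-namespace", status=~"5.."}[5m]))
/
sum(rate(nginx_ingress_controller_requests{exported_namespace="your-namespace"}[5m]))
```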

***

### `nginx_ingress_controller_request_duration_seconds`

End-to-end request duration as observed by the ingress controller.

Type: `histogram` (exposes `_bucket`, `_sum`, and `_count` time series)

Labels:

<ParamField query="ingress" type="label">
  Ingress name.
</ParamField>

<ParamField query="exported_namespace" type="label">
  Namespace.
</ParamField>

<ParamField query="service" type="label">
  Service name.
</ParamField>

<ParamField query="status" type="label">
  HTTP status code.
</ParamField>

<ParamField query="method" type="label">
  HTTP method (GET, POST, etc.).
</ParamField>

<ParamField query="path" type="label">
  Request path.
</ParamField>

**Example query (p99):**

```promql theme={null}
histogram_quantile(0.99, sum by (le) (rate(nginx_ingress_controller_request_duration_seconds_bucket[5m])))
```
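
Alongside quantiles, the `_sum` and `_count` series give the mean request duration. A minimal sketch, assuming a 5-minute window and `your-namespace` as a placeholder:

```promql
# Mean request duration in seconds over the last 5 minutes.
sum(rate(nginx_ingress_controller_request_duration_seconds_sum{exported_namespace="your-namespace"}[5m]))
/
sum(rate(nginx_ingress_controller_request_duration_seconds_count{exported_namespace="your-namespace"}[5m]))
```

The mean is cheap to compute but hides tail latency, so it is usually read together with a high quantile such as p99.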

***
