NVIDIA DCGM Exporter Dashboard 615,914615,914
5/2/2020
5/6/2020
1
>=6.7.3
Prometheus
Introduction
This dashboard displays GPU metrics collected from NVIDIA dcgm-exporter via a metric endpoint added to Prometheus. A separate endpoint is added to Prometheus via a Service Monitor.
Quickstart
helm install stable/prometheus-operator --generate-name \
--set "prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false"
kubectl create -f https://raw.githubusercontent.com/NVIDIA/gpu-monitoring-tools/2.0.0-rc.8/dcgm-exporter.yaml
kubectl create -f https://raw.githubusercontent.com/NVIDIA/gpu-monitoring-tools/2.0.0-rc.8/service-monitor.yaml
More information on the dcgm-exporter README.
Get Dashboard✕
Download
Copy to Clipboard
Used Metrics 77
DCGM_FI_DEV_GPU_TEMP
DCGM_FI_DEV_POWER_USAGE
DCGM_FI_DEV_SM_CLOCK
DCGM_FI_DEV_MEM_CLOCK
DCGM_FI_DEV_GPU_UTIL
DCGM_FI_DEV_MEM_COPY_UTIL
DCGM_FI_DEV_FB_USED