K8s Cluster Metrics
68,633

Created 1/29/2020
Updated 1/29/2020
Revision 1
Categories
Docker, Host Metrics
Grafana Version >=5.2.2
Datasources
Prometheus

Description

This dashboard monitors Kubernetes cluster health and workload status, aggregating pod, container, and job metrics across the cluster. It tracks the pod lifecycle and compares resource capacity with usage. Key metrics include kube_pod_status_phase (pods in the Running, Pending, Succeeded, Failed, or Unknown phase), kube_node_status_allocatable_pods against kube_node_status_capacity_pods (pod capacity per node), and kube_pod_container_status_restarts_total (container restarts, a common sign of instability). Together, the panels give a consolidated view of pod activity, deployment and replication controller state, and job progress, which helps identify bottlenecks and failures quickly.
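The dashboard's actual panel queries are not shown on this page, so the following is only a minimal sketch of how the three metrics named in the description might be queried. The helper names (pods_by_phase_query, pod_allocatable_ratio_query, restart_increase_query) and the 15m window are assumptions made for illustration; only the metric names come from the description.

```python
# Sketch: PromQL expressions built from metrics named in the description.
# The function names and the default window are hypothetical, not taken
# from the dashboard's JSON.


def pods_by_phase_query() -> str:
    """Count pods per lifecycle phase (Running, Pending, ...)."""
    # kube_pod_status_phase is 1 for the phase a pod is in and 0 otherwise,
    # so summing by the phase label yields a pod count per phase.
    return "sum by (phase) (kube_pod_status_phase)"


def pod_allocatable_ratio_query() -> str:
    """Cluster-wide ratio of allocatable pod slots to total pod capacity."""
    return (
        "sum(kube_node_status_allocatable_pods)"
        " / sum(kube_node_status_capacity_pods)"
    )


def restart_increase_query(window: str = "15m") -> str:
    """Container restarts per pod over a time window (assumed 15m)."""
    return (
        "sum by (namespace, pod) "
        f"(increase(kube_pod_container_status_restarts_total[{window}]))"
    )


if __name__ == "__main__":
    for query in (
        pods_by_phase_query(),
        pod_allocatable_ratio_query(),
        restart_increase_query(),
    ):
        print(query)
```

Each string can be pasted into a Prometheus data source panel in Grafana; a non-zero result from the restart query over a short window is the kind of signal the description calls instability.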

Source Grafana.com

Used Metrics 32

  • container_cpu_usage_seconds_total

  • container_memory_usage_bytes

  • container_memory_working_set_bytes

  • container_network_receive_bytes_total

  • container_network_transmit_bytes_total

  • kube_deployment_spec_replicas

  • kube_deployment_status_replicas

  • kube_deployment_status_replicas_unavailable

  • kube_job_status_active

  • kube_job_status_failed

  • kube_job_status_succeeded

  • kube_node_info

  • kube_node_spec_unschedulable

  • kube_node_status_allocatable_cpu_cores

  • kube_node_status_allocatable_memory_bytes

  • kube_node_status_allocatable_pods

  • kube_node_status_capacity_cpu_cores

  • kube_node_status_capacity_pods

  • kube_node_status_condition

  • kube_pod_container_resource_limits_cpu_cores

  • kube_pod_container_resource_limits_memory_bytes

  • kube_pod_container_resource_requests_cpu_cores

  • kube_pod_container_resource_requests_memory_bytes

  • kube_pod_container_status_restarts_total

  • kube_pod_container_status_running

  • kube_pod_container_status_terminated

  • kube_pod_container_status_waiting

  • kube_pod_info

  • kube_pod_status_phase

  • kube_replicationcontroller_spec_replicas

  • kube_replicationcontroller_status_replicas

  • machine_cpu_cores
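As a quick sanity check on the list above, the metric names can be grouped by their leading prefix. This is a sketch only: group_by_prefix is a hypothetical helper, and the mapping of prefixes to exporters (kube_ from kube-state-metrics, container_ and machine_ from cAdvisor) is the usual convention rather than something stated on this page.

```python
from collections import defaultdict

# The 32 metric names exactly as listed under "Used Metrics".
USED_METRICS = [
    "container_cpu_usage_seconds_total",
    "container_memory_usage_bytes",
    "container_memory_working_set_bytes",
    "container_network_receive_bytes_total",
    "container_network_transmit_bytes_total",
    "kube_deployment_spec_replicas",
    "kube_deployment_status_replicas",
    "kube_deployment_status_replicas_unavailable",
    "kube_job_status_active",
    "kube_job_status_failed",
    "kube_job_status_succeeded",
    "kube_node_info",
    "kube_node_spec_unschedulable",
    "kube_node_status_allocatable_cpu_cores",
    "kube_node_status_allocatable_memory_bytes",
    "kube_node_status_allocatable_pods",
    "kube_node_status_capacity_cpu_cores",
    "kube_node_status_capacity_pods",
    "kube_node_status_condition",
    "kube_pod_container_resource_limits_cpu_cores",
    "kube_pod_container_resource_limits_memory_bytes",
    "kube_pod_container_resource_requests_cpu_cores",
    "kube_pod_container_resource_requests_memory_bytes",
    "kube_pod_container_status_restarts_total",
    "kube_pod_container_status_running",
    "kube_pod_container_status_terminated",
    "kube_pod_container_status_waiting",
    "kube_pod_info",
    "kube_pod_status_phase",
    "kube_replicationcontroller_spec_replicas",
    "kube_replicationcontroller_status_replicas",
    "machine_cpu_cores",
]


def group_by_prefix(names):
    """Group metric names by the text before the first underscore."""
    groups = defaultdict(list)
    for name in names:
        prefix = name.split("_", 1)[0]
        groups[prefix].append(name)
    return dict(groups)


if __name__ == "__main__":
    for prefix, names in sorted(group_by_prefix(USED_METRICS).items()):
        print(f"{prefix}: {len(names)}")
```

Grouping this way shows which exporters a Prometheus server likely needs to scrape before the dashboard's panels can be populated.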
