Kubernetes / Kubelet

Created 6/1/2022
Updated 6/2/2022
Revision 1
Grafana Version >=8.5.3
Datasources Prometheus

Description

This dashboard monitors the health and performance of kubelet processes across nodes, focusing on resource usage, lifecycle events, and operation timing. It shows per-node activity through kubelet_running_pods and kubelet_running_containers, and tracks volume management through volume_manager_total_volumes. Latency and error panels, built on metrics such as kubelet_runtime_operations_duration_seconds_bucket and kubelet_runtime_operations_errors_total, help identify bottlenecks and reliability issues.
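As an illustration of how latency and error panels of this kind are commonly built, here is a minimal sketch that assembles two PromQL expressions from the metrics named above. It is not taken from the dashboard's JSON. The helper names, the 5m rate window, and the grouping labels instance and operation_type are assumptions that may need adjusting for a given Prometheus setup.

```python
def latency_quantile_query(quantile: float, window: str = "5m") -> str:
    """Build a PromQL expression for a latency quantile of kubelet runtime operations.

    Assumes the histogram carries an `operation_type` label and that the
    scrape target is identified by `instance` (both are assumptions).
    """
    return (
        f"histogram_quantile({quantile}, "
        f"sum by (instance, operation_type, le) "
        f"(rate(kubelet_runtime_operations_duration_seconds_bucket[{window}])))"
    )


def error_ratio_query(window: str = "5m") -> str:
    """Build a PromQL expression for the share of runtime operations that failed."""
    return (
        f"sum by (instance, operation_type) "
        f"(rate(kubelet_runtime_operations_errors_total[{window}])) "
        f"/ sum by (instance, operation_type) "
        f"(rate(kubelet_runtime_operations_total[{window}]))"
    )


if __name__ == "__main__":
    # Print both expressions so they can be pasted into a Grafana panel or the Prometheus UI.
    print(latency_quantile_query(0.99))
    print(error_ratio_query())
```

A high quantile that rises for one operation_type while the error ratio stays flat points toward a slow container runtime rather than a failing one, which is the kind of bottleneck this dashboard is meant to expose.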

Source Grafana.com

Used Metrics 26

  • go_goroutines
  • kubelet_cgroup_manager_duration_seconds_bucket
  • kubelet_cgroup_manager_duration_seconds_count
  • kubelet_node_config_error
  • kubelet_node_name
  • kubelet_pleg_relist_duration_seconds_bucket
  • kubelet_pleg_relist_duration_seconds_count
  • kubelet_pleg_relist_interval_seconds_bucket
  • kubelet_pod_start_duration_seconds_count
  • kubelet_pod_worker_duration_seconds_bucket
  • kubelet_pod_worker_duration_seconds_count
  • kubelet_running_container_count
  • kubelet_running_containers
  • kubelet_running_pod_count
  • kubelet_running_pods
  • kubelet_runtime_operations_duration_seconds_bucket
  • kubelet_runtime_operations_errors_total
  • kubelet_runtime_operations_total
  • process_cpu_seconds_total
  • process_resident_memory_bytes
  • rest_client_request_duration_seconds_bucket
  • rest_client_requests_total
  • storage_operation_duration_seconds_bucket
  • storage_operation_duration_seconds_count
  • storage_operation_errors_total
  • volume_manager_total_volumes
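
The list above contains two pairs that look like older and newer names for the same gauges: kubelet_running_pod_count with kubelet_running_pods, and kubelet_running_container_count with kubelet_running_containers. Assuming that reading is right (the page itself does not say so), a panel can stay compatible with both kubelet generations by joining the two names with PromQL's `or` operator. The sketch below builds such an expression; the helper name and the optional selector argument are hypothetical.

```python
def with_legacy_fallback(current: str, legacy: str, selector: str = "") -> str:
    """Build a PromQL expression that uses `current` and falls back to `legacy`.

    `selector` is an optional label matcher such as '{instance="node-1"}';
    it is applied unchanged to both metric names.
    """
    return f"sum({current}{selector}) or sum({legacy}{selector})"


if __name__ == "__main__":
    # One expression per gauge pair found in the metric list.
    print(with_legacy_fallback("kubelet_running_pods", "kubelet_running_pod_count"))
    print(with_legacy_fallback("kubelet_running_containers", "kubelet_running_container_count"))
```

Because `or` returns the left-hand result when it exists and the right-hand result otherwise, only one of the two series contributes on any given cluster.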
