Cluster overview
Description
This dashboard monitors a Nomad cluster by aggregating host- and datacenter-level metrics to track resource utilization, allocation status, and job health. It charts CPU and memory allocation using metrics such as nomad_client_allocs_cpu_allocated, nomad_client_allocs_memory_allocated, and nomad_client_allocs_oom_killed, and derives cluster uptime and node status from the up and timestamp signals. Operators can use it to spot unallocated resources (nomad_client_unallocated_cpu, nomad_client_unallocated_memory) and to follow job progress (nomad_nomad_job_status_running, nomad_nomad_job_status_pending, and nomad_nomad_job_summary_complete, nomad_nomad_job_summary_failed, and nomad_nomad_job_summary_lost). Other panels cover free disk space and free memory (panels Free Disk Space and Free memmory) and allocation health (panels Allocation status and Logs).
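As a rough illustration of how the allocated and unallocated metrics relate, the sketch below computes the share of capacity that is already allocated. It assumes that total schedulable capacity is the sum of an allocated value (for example nomad_client_allocated_cpu) and its unallocated counterpart (nomad_client_unallocated_cpu); the helper name allocated_fraction and the sample numbers are hypothetical and are not taken from the dashboard itself.

```python
def allocated_fraction(allocated: float, unallocated: float) -> float:
    """Return the share of capacity that is already allocated (0.0 to 1.0).

    Assumes total capacity = allocated + unallocated, which is an
    assumption of this sketch rather than something the dashboard states.
    """
    total = allocated + unallocated
    if total <= 0:
        # No capacity reported: treat as nothing allocated instead of dividing by zero.
        return 0.0
    return allocated / total


# Hypothetical sample values, not read from a real cluster.
cpu_share = allocated_fraction(allocated=6000.0, unallocated=2000.0)
print(f"CPU allocated: {cpu_share:.0%}")
```

The same helper could be applied to nomad_client_allocated_memory and nomad_client_unallocated_memory, provided both values use the same unit.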
Used Metrics (28)
nomad_client_allocated_cpu
nomad_client_allocated_memory
nomad_client_allocations_blocked
nomad_client_allocations_migrating
nomad_client_allocations_pending
nomad_client_allocations_running
nomad_client_allocations_terminal
nomad_client_allocs_cpu_allocated
nomad_client_allocs_memory_allocated
nomad_client_allocs_oom_killed
nomad_client_host_disk_available
nomad_client_host_disk_size
nomad_client_host_memory_free
nomad_client_host_memory_total
nomad_client_unallocated_cpu
nomad_client_unallocated_memory
nomad_client_uptime
nomad_nomad_job_status_dead
nomad_nomad_job_status_pending
nomad_nomad_job_status_running
nomad_nomad_job_summary_complete
nomad_nomad_job_summary_failed
nomad_nomad_job_summary_lost
nomad_nomad_job_summary_queued
nomad_nomad_job_summary_running
nomad_nomad_job_summary_starting
timestamp
up
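
The host-level pairs in this list (nomad_client_host_disk_available with nomad_client_host_disk_size, and nomad_client_host_memory_free with nomad_client_host_memory_total) lend themselves to a "percent free" view. The sketch below builds a PromQL expression string for such a view. It is not the dashboard's actual panel query: the function name free_percent_query is hypothetical, and the grouping label "instance" is an assumption that may differ in a given Prometheus setup.

```python
def free_percent_query(free_metric: str, total_metric: str, by: str = "instance") -> str:
    """Build a PromQL expression for the percentage of a resource that is free.

    The `by` label is an assumption; use whichever label identifies a node
    in your own scrape configuration.
    """
    return (
        f"100 * sum by ({by}) ({free_metric}) "
        f"/ sum by ({by}) ({total_metric})"
    )


# Metric names come from the list above; the queries are illustrative only.
disk_query = free_percent_query(
    "nomad_client_host_disk_available", "nomad_client_host_disk_size"
)
memory_query = free_percent_query(
    "nomad_client_host_memory_free", "nomad_client_host_memory_total"
)
print(disk_query)
print(memory_query)
```

Summing both sides over the same label keeps numerator and denominator aligned when a node reports several series, for example one per disk.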