ClickHouse and Keeper Comprehensive Dashboard

Created: 4/18/2025
Updated: 4/19/2025
Revision: 1
Categories: Databases
Grafana Version: >=9.1.4
Datasources: Prometheus


✅ Overview

  • This dashboard monitors and visualizes ClickHouse server metrics, with Prometheus as the data source.
  • It reads the metrics that ClickHouse exposes on its /metrics endpoint to show system health, resource usage, and storage status.
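
To illustrate what the /metrics endpoint provides, here is a minimal Python sketch that parses a few unlabelled samples in the Prometheus text exposition format and derives a memory ratio from two metrics this dashboard uses. The sample values are made up for illustration, and the `parse_metrics` helper is hypothetical; it is not part of ClickHouse, Prometheus, or the dashboard.

```python
def parse_metrics(text):
    """Parse unlabelled samples from Prometheus text exposition format.

    Hypothetical helper for illustration: comment lines (# HELP, # TYPE)
    and labelled samples are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "{" in line:
            continue
        name, _, rest = line.partition(" ")
        # The value is the first field after the name; an optional
        # timestamp may follow it.
        samples[name] = float(rest.split()[0])
    return samples


# Made-up scrape output; real values come from the ClickHouse /metrics endpoint.
SAMPLE = """\
# TYPE ClickHouseMetrics_MemoryTracking gauge
ClickHouseMetrics_MemoryTracking 2147483648
# TYPE ClickHouseAsyncMetrics_OSMemoryTotal gauge
ClickHouseAsyncMetrics_OSMemoryTotal 8589934592
# TYPE ClickHouseMetrics_ReadonlyReplica gauge
ClickHouseMetrics_ReadonlyReplica 0
"""

samples = parse_metrics(SAMPLE)
memory_ratio = (
    samples["ClickHouseMetrics_MemoryTracking"]
    / samples["ClickHouseAsyncMetrics_OSMemoryTotal"]
)
print(f"ClickHouse tracked memory: {memory_ratio:.0%} of OS memory")
```

In a real deployment Prometheus performs the scraping and the dashboard panels query Prometheus, so a parser like this is only useful for spot-checking what a node exposes.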

✅ Key Features:

  • Cluster Overview

    • Displays node state, version, uptime, number of tables and databases.
    • Shows read-only replica status and the number of Kafka consumers per node.
  • Resource Monitoring

    • Tracks CPU and memory usage at both ClickHouse process and OS levels.
    • Includes charts for system memory buffers, cache, and available memory.
  • Disk Usage Insights

    • Visualizes total, used, available, and unreserved disk space for each mount point.
    • Includes trends in disk usage over time (daily and per-10-minute deltas).
  • Real-Time Visualization

    • Interactive time range and instance filtering.
    • Supports multiple visual types: Table, Gauge, Graph, and Timeseries panels.
  • Highly Extensible

    • Uses template variables such as ${Cluster}, ${Instance}, and ${datasource} to support multi-cluster setups.
    • Modular and portable—easy to deploy across environments.
  • Quick Links

    • External links to the ClickHouse metrics page, documentation, and optional internal wikis for metric explanations.
  • Compatibility

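    • Built with Grafana 9.1.4; works with Prometheus scraping ClickHouse /metrics endpoints. Several panels also use node_* metrics (for example node_load1), which are typically exposed by node_exporter.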
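
The disk usage figures described under Disk Usage Insights can be sketched in Python as follows. This assumes the per-disk metric names carry the disk name as a suffix (for example `ClickHouseAsyncMetrics_DiskUsed_default`); the sample values and the helper names are hypothetical, and the dashboard's actual panel queries are not shown in this listing.

```python
USED_PREFIX = "ClickHouseAsyncMetrics_DiskUsed_"
TOTAL_PREFIX = "ClickHouseAsyncMetrics_DiskTotal_"
GIB = 1024 ** 3


def disk_used_percent(samples):
    """Return {disk_name: percent used} for every disk with a known total."""
    result = {}
    for name, used in samples.items():
        if not name.startswith(USED_PREFIX):
            continue
        disk = name[len(USED_PREFIX):]
        total = samples.get(TOTAL_PREFIX + disk)
        if total:  # skip disks with a missing or zero total
            result[disk] = 100.0 * used / total
    return result


def usage_delta(earlier, later, disk):
    """Change in used bytes for one disk between two scrapes."""
    key = USED_PREFIX + disk
    return later[key] - earlier[key]


# Made-up samples taken ten minutes apart (assumed disk name: "default").
earlier = {USED_PREFIX + "default": 20 * GIB, TOTAL_PREFIX + "default": 100 * GIB}
later = {USED_PREFIX + "default": 25 * GIB, TOTAL_PREFIX + "default": 100 * GIB}

percent = disk_used_percent(later)
delta_bytes = usage_delta(earlier, later, "default")
print(percent, delta_bytes // GIB, "GiB growth in 10 minutes")
```

A Grafana panel would normally compute the same quantities in PromQL, for instance by dividing the used gauge by the total gauge and applying a function such as `delta` over a 10-minute range.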

Used Metrics (41)

  • ClickHouseMetrics_VersionInteger

  • ClickHouseMetrics_Revision

  • ClickHouseAsyncMetrics_NumberOfTables

  • ClickHouseAsyncMetrics_Uptime

  • ClickHouseAsyncMetrics_NumberOfDatabases

  • ClickHouseMetrics_ReadonlyReplica

  • ClickHouseMetrics_KafkaConsumers

  • ClickHouseAsyncMetrics_OSMemoryTotal

  • ClickHouseAsyncMetrics_OSMemoryFreePlusCached

  • ClickHouseAsyncMetrics_OSIdleTimeNormalized

  • ClickHouseAsyncMetrics_OSUptime

  • disk

  • ClickHouseAsyncMetrics_DiskAvailable_

  • ClickHouseAsyncMetrics_DiskUsed_

  • ClickHouseAsyncMetrics_DiskTotal_

  • ClickHouseAsyncMetrics_DiskUnreserved_

  • ClickHouseMetrics_MemoryTracking

  • ClickHouseAsyncMetrics_OSMemoryAvailable

  • ClickHouseAsyncMetrics_OSMemoryBuffers

  • node_load1

  • instance:node_cpu:ratio

  • ClickHouseAsyncMetrics_OSIOWaitTimeNormalized

  • node_network_transmit_bytes_total

  • node_network_receive_bytes_total

  • ClickHouseAsyncMetrics_OSUserTimeNormalized

  • ClickHouseAsyncMetrics_OSMemoryCached

  • ClickHouseAsyncMetrics_OSMemoryFreeWithoutCached

  • ClickHouseAsyncMetrics_OSSystemTimeNormalized

  • node_load5

  • node_load15

  • ClickHouseMetrics_TCPConnection

  • ClickHouseMetrics_HTTPConnection

  • node_filefd_allocated

  • ClickHouseAsyncMetrics_OSOpenFiles

  • ClickHouseProfileEvents_SlowRead

  • __rate_interval:

  • ClickHouseProfileEvents_ReadBackoff

  • ClickHouseErrorMetric_ACCESS_STORAGE_READONLY

  • ClickHouseAsyncMetrics_TotalRowsOfMergeTreeTables

  • ClickHouseAsyncMetrics_TotalPartsOfMergeTreeTables

  • ClickHouseAsyncMetrics_TotalBytesOfMergeTreeTables