ZooKeeper by Prometheus
139,841 3.5 (2 reviews)

Created 6/30/2019
Updated 6/9/2021
Revision 4
Grafana Version >=6.2.2

Description

This dashboard monitors the health and performance of a ZooKeeper deployment using metrics scraped by Prometheus. Data-store panels track znode_count and ephemerals_count, client-connectivity panels track global_sessions and local_sessions, and write_per_namespace_sum and read_per_namespace_sum show throughput and per-namespace activity. Timing panels cover startup_snap_load_time, startup_txns_loaded, and fsynctime, which help pinpoint slow startups and disk I/O bottlenecks. Error indicators such as unrecoverable_error_count and digest_mismatches_count give a quick signal of server health.
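Many of the metrics listed below come in `_sum` and `_count` pairs (for example fsynctime_sum and fsynctime_count), which follows the usual Prometheus summary convention. As a rough sketch of how an average can be derived from such a pair, the hypothetical helper below divides the change in `_sum` by the change in `_count` between two scrapes. The function name and the sample values are illustrative assumptions, not taken from the dashboard, and the unit of the result is whatever unit the server reports for that metric.

```python
from typing import Optional


def mean_over_interval(
    sum_start: float,
    sum_end: float,
    count_start: float,
    count_end: float,
) -> Optional[float]:
    """Average observed value between two scrapes of a _sum/_count pair.

    Hypothetical helper: returns None when no new observations were
    recorded in the window (or when the counters were reset).
    """
    delta_count = count_end - count_start
    if delta_count <= 0:
        return None
    return (sum_end - sum_start) / delta_count


# Illustrative values only: fsynctime_sum grew by 300 while
# fsynctime_count grew by 60 between the two scrapes.
average = mean_over_interval(1200.0, 1500.0, 100, 160)
print(average)
```

Working on deltas rather than raw totals keeps the result tied to a recent window instead of the whole lifetime of the server process.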

Source Grafana.com

Used Metrics 86

  • ack_latency_sum

  • approximate_data_size

  • avg_latency

  • close_session_prep_time_sum

  • commit_commit_proc_req_queued_sum

  • commit_process_time_sum

  • commit_propagation_latency_sum

  • concurrent_request_processing_in_commit_processor_sum

  • dbinittime

  • dbinittime_count

  • dbinittime_sum

  • digest_mismatches_count

  • ensemble_auth_fail

  • ensemble_auth_skip

  • ensemble_auth_success

  • ephemerals_count

  • fsynctime

  • fsynctime_count

  • fsynctime_sum

  • global_sessions

  • jvm_classes_loaded

  • jvm_gc_collection_seconds_sum

  • jvm_memory_pool_bytes_used

  • jvm_pause_time_ms_sum

  • jvm_threads_current

  • jvm_threads_deadlocked

  • jvm_threads_state

  • local_sessions

  • local_write_committed_time_ms_sum

  • max_latency

  • min_latency

  • open_file_descriptor_count

  • outstanding_changes_queued

  • outstanding_changes_removed

  • outstanding_tls_handshake

  • packets_received

  • packets_sent

  • pending_session_queue_size_sum

  • prep_process_time_sum

  • prep_processor_queue_size_sum

  • prep_processor_queue_time_ms_sum

  • prep_processor_request_queued

  • propagation_latency_sum

  • proposal_ack_creation_latency

  • proposal_latency_sum

  • quorum_ack_latency_sum

  • read_commit_proc_issued_sum

  • read_commit_proc_req_queued_sum

  • read_commitproc_time_ms_sum

  • read_final_proc_time_ms_sum

  • read_per_namespace_sum

  • readlatency_sum

  • reads_after_write_in_session_queue_sum

  • reads_issued_from_session_queue_sum

  • request_commit_queued

  • requests_in_session_queue_sum

  • response_packet_cache_hits

  • response_packet_cache_misses

  • response_packet_get_children_cache_hits

  • response_packet_get_children_cache_misses

  • server_write_committed_time_ms_sum

  • session_queues_drained_sum

  • snapshottime_count

  • snapshottime_sum

  • startup_snap_load_time

  • startup_snap_load_time_count

  • startup_snap_load_time_sum

  • startup_txns_loaded

  • startup_txns_loaded_count

  • startup_txns_loaded_sum

  • sync_process_time_sum

  • sync_processor_batch_size_sum

  • sync_processor_queue_flush_time_ms_sum

  • sync_processor_queue_size_sum

  • sync_processor_request_queued

  • time_waiting_empty_pool_in_commit_processor_read_ms_sum

  • tls_handshake_exceeded

  • unrecoverable_error_count

  • updatelatency_sum

  • write_batch_time_in_commit_processor_sum

  • write_commit_proc_issued_sum

  • write_commit_proc_req_queued_sum

  • write_commitproc_time_ms_sum

  • write_final_proc_time_ms_sum

  • write_per_namespace_sum

  • znode_count
