Loki stack monitoring (Promtail, Loki)

367,913

Created 10/16/2023

Updated 10/16/2023

Revision 1

Grafana Version >=10.0.3

Datasources

Prometheus, Loki

Description

This dashboard monitors a Loki and Promtail deployment, focusing on log ingestion health, resource usage, and alerting status. It tracks log flow through loki_distributor_lines_received_total and loki_distributor_bytes_received_total, and counts warning and error messages through loki_log_messages_total, filtered on its level label (warn, error). It shows memory and CPU usage as percentages for both Loki and Promtail, and surfaces ingestion failures through loki_distributor_ingester_append_failures_total and dropped entries through promtail_dropped_entries_total to help pinpoint bottlenecks in the pipeline.
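The listing does not include the dashboard's panel queries, so the following is only a sketch of the kind of PromQL such panels might use. The helper names (`rate_sum`, `ratio_percent`), the 5m rate window, and the `container="loki"` selector are assumptions made for illustration; only the metric names come from the description.

```python
def rate_sum(metric: str, selector: str = "", window: str = "5m") -> str:
    """Build a PromQL expression for the summed per-second rate of a counter.

    `selector` is an optional label matcher list such as 'container="loki"'.
    """
    sel = f"{{{selector}}}" if selector else ""
    return f"sum(rate({metric}{sel}[{window}]))"


def ratio_percent(numerator: str, denominator: str) -> str:
    """Build a PromQL expression giving numerator/denominator as a percentage."""
    return f"100 * ({numerator}) / ({denominator})"


# Log flow: lines and bytes received by the distributor.
lines_per_second = rate_sum("loki_distributor_lines_received_total")
bytes_per_second = rate_sum("loki_distributor_bytes_received_total")

# Ingestion problems: failed appends to ingesters and entries dropped by Promtail.
append_failures = rate_sum("loki_distributor_ingester_append_failures_total")
dropped_entries = rate_sum("promtail_dropped_entries_total")

# Error-level log messages, using the `level` label named in the description.
error_messages = rate_sum("loki_log_messages_total", 'level="error"')

# Memory usage as a percentage of the configured limit.
# The container="loki" selector is hypothetical; real label names depend on the cluster.
memory_percent = ratio_percent(
    'sum(container_memory_working_set_bytes{container="loki"})',
    'sum(kube_pod_container_resource_limits_memory_bytes{container="loki"})',
)

if __name__ == "__main__":
    for name, expr in [
        ("lines/s", lines_per_second),
        ("bytes/s", bytes_per_second),
        ("append failures/s", append_failures),
        ("dropped entries/s", dropped_entries),
        ("error messages/s", error_messages),
        ("memory % of limit", memory_percent),
    ]:
        print(f"{name}: {expr}")
```

Each resulting string can be pasted into a Prometheus query field; selectors and windows would need adjusting to the labels actually present in a given cluster.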

Used Metrics 15

container_cpu_usage_seconds_total
container_memory_working_set_bytes
error
kube_pod_container_resource_limits_cpu_cores
kube_pod_container_resource_limits_memory_bytes
kube_pod_container_resource_requests_cpu_cores
kube_pod_container_resource_requests_memory_bytes
level
loki_distributor_bytes_received_total
loki_distributor_ingester_append_failures_total
loki_distributor_lines_received_total
loki_ingester_memory_streams
loki_log_messages_total
promtail_dropped_entries_total
warn
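As a worked illustration of how the resource metrics above could combine into the percentage figures the description mentions, here is a minimal sketch. It assumes usage is expressed relative to the container limit, which the listing does not confirm. The function names are hypothetical, and counter resets are ignored for simplicity.

```python
def cpu_percent_of_limit(
    t0: float, v0: float, t1: float, v1: float, limit_cores: float
) -> float:
    """CPU usage as a percentage of the limit, from two counter samples.

    (t0, v0) and (t1, v1) are (timestamp in seconds, value) samples of
    container_cpu_usage_seconds_total. The increase in CPU seconds divided
    by elapsed wall-clock seconds is the average number of cores in use.
    Counter resets are not handled in this sketch.
    """
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    if limit_cores <= 0:
        raise ValueError("limit_cores must be positive")
    cores_used = (v1 - v0) / (t1 - t0)
    return 100.0 * cores_used / limit_cores


def memory_percent_of_limit(working_set_bytes: float, limit_bytes: float) -> float:
    """Memory usage as a percentage of the limit.

    working_set_bytes corresponds to container_memory_working_set_bytes and
    limit_bytes to kube_pod_container_resource_limits_memory_bytes.
    """
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    return 100.0 * working_set_bytes / limit_bytes


if __name__ == "__main__":
    # 30 CPU seconds consumed over 60 s of wall time, against a 1-core limit.
    print(cpu_percent_of_limit(0, 100.0, 60, 130.0, 1.0))
    # 512 MiB working set against a 1 GiB limit.
    print(memory_percent_of_limit(512 * 1024**2, 1024**3))
```

The same arithmetic applies with the resource request metrics in place of the limits if usage relative to requests is wanted.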