Thanos Rule Grpc Error Rate warning
Thanos Rule {{$labels.job}} is failing to handle {{$value | humanize}}% of requests.
>>>
(sum by (job, instance) (
rate
(
{grpc_code=~"Unknown|ResourceExhausted|Internal|Unavailable|DataLoss|DeadlineExceeded", job=~".*thanos-rule.*"}[5m]))/ sum by (job, instance) (
rate
(
{job=~".*thanos-rule.*"}[5m])) * 100 > 5)
The rule calculates, for each Thanos‑rule job/instance, the percentage of gRPC requests that returned error codes (Unknown, ResourceExhausted, Internal, Unavailable, DataLoss, DeadlineExceeded) over the last 5 minutes, and triggers the alert when this error‑rate exceeds 5 %.
Get Alert✕
Download
Copy to Clipboard