Promtail request errors critical
The {{ $labels.job }} {{ $labels.route }} is experiencing {{ printf "%.2f" $value }}% errors.
>>>
100 *
sum
(
rate
(
{status_code=~"5..|failed"}[1m])) by (namespace, job, route, instance) /
sum
(
rate
(
[1m])) by (namespace, job, route, instance) > 10
The rule calculates the percentage of Promtail requests that resulted in a 5xx status code or a "failed" status over the past minute (using rate to get per‑second request counts), groups this by namespace, job, route, and instance, and triggers when that error rate exceeds 10 %. In other words, if more than 10 % of a given job/route’s requests are errors in the last minute, the alert fires.
Get Alert✕
Download
Copy to Clipboard