Prometheus target missing with warmup time (severity: critical)
Allow a job time to start up (10 minutes) before alerting that it's down.
sum by (instance, job) ((up == 0) * on (instance) group_left(__name__) (node_time_seconds - node_boot_time_seconds > 600))
This alert triggers when a Prometheus scrape target is reported as down (up == 0) and the underlying node hosting that target has been up for at least 10 minutes (node_time_seconds - node_boot_time_seconds > 600). This prevents alerts from firing immediately after a node (and its services) starts up, allowing for a warm-up period.
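The expression above could be wrapped in a Prometheus alerting rule along the following lines. This is a minimal sketch, not a definitive rule file: the group name, the alert name, the `for: 0m` duration, and the annotation wording are assumptions chosen for illustration. Only the expression, the 10-minute warm-up, and the critical severity come from the text above.

```yaml
groups:
  - name: prometheus-target-health   # hypothetical group name
    rules:
      - alert: PrometheusTargetMissingWithWarmupTime   # hypothetical alert name
        # Fires for targets reporting up == 0 whose node uptime
        # (node_time_seconds - node_boot_time_seconds) exceeds 600 seconds.
        expr: sum by (instance, job) ((up == 0) * on (instance) group_left(__name__) (node_time_seconds - node_boot_time_seconds > 600))
        for: 0m   # assumed: no extra delay beyond the 10-minute warm-up
        labels:
          severity: critical
        annotations:
          summary: "Prometheus target missing with warmup time (instance {{ $labels.instance }})"
          description: "Allow a job time to start up (10 minutes) before alerting that it's down. VALUE = {{ $value }}"
```

Because the two sides are joined with `on (instance)`, the rule only matches targets whose `instance` label is identical to the `instance` label on the node uptime metrics. Targets scraped under a different `instance` value (for example, a different port on the same host) would need relabeling before this join produces a result.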