Thanos Rule Alertmanager High D N S Failures
warning

Description Thanos Rule {{$labels.instance}} has {{$value | humanize}}% of failing DNS queries for Alertmanager endpoints.
Query for 15m
>>>
	
				
					(sum by (job, instance) (
				
			
				
					
					
						rate
					
				
			
				
					(
				
			
				
					
				
			
				
					{job=~".*thanos-rule.*"}[5m])) / sum by (job, instance) (
				
			
				
					
					
						rate
					
				
			
				
					(
				
			
				
					
				
			
				
					{job=~".*thanos-rule.*"}[5m])) * 100 > 1)
				
			
    
Query Explanation

The rule computes the 5‑minute rate of thanos_rule_alertmanagers_dns_failures_total divided by the 5‑minute rate of thanos_rule_alertmanagers_dns_lookups_total for each Thanos‑rule job/instance, multiplies by 100 to get a failure‑percentage, and fires when that percentage exceeds 1 %. In short, if more than one percent of the DNS queries made by a Thanos‑rule instance to Alertmanager fail in the last five minutes, the alert triggers.

Get Alert
Download
Copy to Clipboard