1 - Deployment Performance & Health
A parameterized dashboard for common workload types (Deployment, DaemonSet, StatefulSet). Its charts pull Prometheus metrics from Kubernetes, Istio, and node-exporter and group them into the following panels:
- At a Glance - A quick view of the health of your Kubernetes-based app (assumes it's a web service, so it's mostly Istio metrics such as success rate and latency)
- RED (Requests, Errors, Duration) - SRE "Golden Signals" that come from Istio
- USE (Utilization, Saturation, Errors) - SRE "Golden Signals" that come from Kubernetes
- Infra Resources - Pod distribution by host and AZ, HPA metrics, image tag, OOM kills, CPU throttling, total CPU and memory allocated to the deployment, and more
Select your account, cluster, namespace, and then your workload name, and all charts will render.
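The dashboard's own queries are not shown here, so the following is only a minimal sketch of how the RED panel could be derived from the two Istio metrics listed below for a selected namespace and workload. It assumes Istio's standard telemetry labels (`reporter`, `destination_workload_namespace`, `destination_workload`, `response_code`); the function names, the 5-minute window, and the choice of the 95th percentile are illustrative assumptions, and account and cluster filters are left out because they depend on the backend.

```python
def selector(namespace: str, workload: str, extra: str = "") -> str:
    """Build a PromQL label selector for one Istio destination workload."""
    labels = [
        'reporter="destination"',  # assumption: use server-side (destination) telemetry
        f'destination_workload_namespace="{namespace}"',
        f'destination_workload="{workload}"',
    ]
    if extra:
        labels.append(extra)
    return "{" + ",".join(labels) + "}"


def red_queries(namespace: str, workload: str, window: str = "5m") -> dict[str, str]:
    """Return illustrative PromQL strings for Requests, Errors, and Duration."""
    sel = selector(namespace, workload)
    err = selector(namespace, workload, 'response_code=~"5.."')
    return {
        # Requests: per-second request rate over the window.
        "requests": f"sum(rate(istio_requests_total{sel}[{window}]))",
        # Errors: share of requests answered with a 5xx response code.
        "errors": (
            f"sum(rate(istio_requests_total{err}[{window}]))"
            f" / sum(rate(istio_requests_total{sel}[{window}]))"
        ),
        # Duration: 95th percentile latency in milliseconds from histogram buckets.
        "duration_p95_ms": (
            "histogram_quantile(0.95, sum by (le) "
            f"(rate(istio_request_duration_milliseconds_bucket{sel}[{window}])))"
        ),
    }


if __name__ == "__main__":
    # Hypothetical namespace and workload names, for illustration only.
    for name, query in red_queries("shop", "checkout").items():
        print(f"{name}: {query}")
```

Building the selector in one place mirrors the dashboard's parameterization: changing the namespace or workload re-targets every chart at once.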
Used Metrics 24
istio_requests_total
istio_request_duration_milliseconds_bucket
kube_deployment_status_replicas_available
container_cpu_cfs_throttled_seconds_total
kube_pod_container_info
container_cpu_usage_seconds_total
kube_pod_container_resource_requests
kube_pod_container_resource_limits
container_memory_working_set_bytes
kube_horizontalpodautoscaler_spec_target_metric
container_fs_reads_bytes_total
container_fs_writes_bytes_total
container_fs_reads_total
container_fs_writes_total
container_network_receive_bytes_total
container_network_transmit_bytes_total
envoy_server_total_connections
kube_horizontalpodautoscaler_status_current_replicas
kube_horizontalpodautoscaler_status_desired_replicas
kube_horizontalpodautoscaler_spec_min_replicas
kube_horizontalpodautoscaler_spec_max_replicas
kube_deployment_status_replicas_unavailable
kube_pod_container_status_restarts_total
node_vmstat_oom_kill
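
As a rough illustration of how several of the metrics above could feed the USE panel, the sketch below builds PromQL strings for CPU utilization against requests, memory usage against limits, CPU throttling, and container restarts. These are not the dashboard's actual queries. It assumes pods are named `<workload>-<suffix>` (a regex that can also match similarly prefixed workloads), that cAdvisor series carry `namespace`, `pod`, and `container` labels, and that kube-state-metrics exposes a `resource` label on the request and limit metrics; the function name and the time windows are hypothetical.

```python
def use_queries(namespace: str, workload: str, window: str = "5m") -> dict[str, str]:
    """Return illustrative PromQL strings for Utilization, Saturation, and Errors."""
    # Assumption: pods belonging to the workload are named "<workload>-<suffix>".
    base = f'namespace="{namespace}",pod=~"{workload}-.*"'
    cadvisor = "{" + base + ',container!=""}'  # skip pod-level aggregate series
    cpu_req = "{" + base + ',resource="cpu"}'
    mem_lim = "{" + base + ',resource="memory"}'
    ksm = "{" + base + "}"
    return {
        # Utilization: CPU cores in use divided by CPU cores requested.
        "cpu_utilization": (
            f"sum(rate(container_cpu_usage_seconds_total{cadvisor}[{window}]))"
            f" / sum(kube_pod_container_resource_requests{cpu_req})"
        ),
        # Saturation: working-set memory divided by the memory limit.
        "memory_saturation": (
            f"sum(container_memory_working_set_bytes{cadvisor})"
            f" / sum(kube_pod_container_resource_limits{mem_lim})"
        ),
        # Saturation: seconds per second the containers spent CPU-throttled.
        "cpu_throttled_seconds": (
            f"sum(rate(container_cpu_cfs_throttled_seconds_total{cadvisor}[{window}]))"
        ),
        # Errors: container restarts over the last hour.
        "restarts": f"sum(increase(kube_pod_container_status_restarts_total{ksm}[1h]))",
    }


if __name__ == "__main__":
    # Hypothetical namespace and workload names, for illustration only.
    for name, query in use_queries("shop", "checkout").items():
        print(f"{name}: {query}")
```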