Kubernetes PDBs That Actually Budget
A PDB (PodDisruptionBudget) caps voluntary disruptions: node drains, rolling node-group updates, scaled-down ASGs. It does nothing about involuntary disruption (a node dying).
The footgun
minAvailable: 1 on a Deployment with 2 replicas means the cluster can evict one pod at a time. That's correct for stateless APIs. It's catastrophic for a consumer Deployment whose pods are pinned 1-to-1 to partitions: you'll lose in-flight work for whichever partition's pod gets evicted.
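For reference, the budget described above would look roughly like the manifest below. This is a sketch only: the name orders-consumer and its app label are hypothetical placeholders, not taken from any real workload.

```yaml
# Hypothetical PDB for a 2-replica Deployment labeled app=orders-consumer.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: orders-consumer-pdb      # placeholder name
spec:
  minAvailable: 1                # with 2 replicas, one voluntary eviction at a time is allowed
  selector:
    matchLabels:
      app: orders-consumer       # placeholder label; must match the Deployment's pod labels
```

Nothing in this manifest knows that each pod owns a partition. The API server will approve the eviction as long as one pod stays available.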
Write PDBs backwards from the rollout strategy
- RollingUpdate with maxUnavailable: 25% on 8 replicas → PDB minAvailable: 75%.
- StatefulSet with a leader election → PDB maxUnavailable: 1.
- Leaderless queue consumers pinned 1:1 with partitions → PDB minAvailable: 100% (yes really) and accept that node upgrades need a planned maintenance window.
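The three cases above might translate into manifests along these lines. Treat it as a sketch under assumptions: every name and label (api, ledger, partition-consumer) is a made-up placeholder, and each selector has to match the pod labels of your own workload.

```yaml
# Case 1: Deployment with RollingUpdate maxUnavailable: 25% on 8 replicas.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb                  # placeholder name
spec:
  minAvailable: "75%"            # 6 of 8 pods must stay available
  selector:
    matchLabels:
      app: api                   # placeholder label
---
# Case 2: StatefulSet with leader election.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ledger-pdb               # placeholder name
spec:
  maxUnavailable: 1              # at most one member disrupted at a time
  selector:
    matchLabels:
      app: ledger                # placeholder label
---
# Case 3: leaderless consumers pinned 1:1 with partitions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: partition-consumer-pdb   # placeholder name
spec:
  minAvailable: "100%"           # no voluntary evictions are allowed
  selector:
    matchLabels:
      app: partition-consumer    # placeholder label
```

A PDB takes either minAvailable or maxUnavailable, not both. With the third budget in place, a drain on a node that hosts one of these pods will block until someone relaxes the budget or moves the pod by hand, which is what the maintenance window is for.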
The dashboard that catches the silent failures
Alert when kube_poddisruptionbudget_status_current_healthy < kube_poddisruptionbudget_status_desired_healthy holds continuously for more than 10 minutes. A PDB in that state allows zero disruptions, and that is why a drain is hanging on your cluster upgrade.
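As a Prometheus alerting rule, that condition could be written roughly as follows. This sketch assumes kube-state-metrics is being scraped and that its PDB metrics carry the namespace and poddisruptionbudget labels. The group name, alert name, and severity are arbitrary choices for illustration.

```yaml
groups:
  - name: pdb-health                         # arbitrary group name
    rules:
      - alert: PodDisruptionBudgetUnhealthy  # arbitrary alert name
        # Fires when a PDB has fewer healthy pods than it requires.
        expr: |
          kube_poddisruptionbudget_status_current_healthy
            < kube_poddisruptionbudget_status_desired_healthy
        for: 10m                             # condition must hold continuously for 10 minutes
        labels:
          severity: warning                  # assumed severity; adjust to your routing
        annotations:
          summary: "PDB {{ $labels.namespace }}/{{ $labels.poddisruptionbudget }} has fewer healthy pods than it requires"
```

The for: 10m clause keeps the alert quiet during ordinary rollouts, where healthy counts dip for a short time and then recover.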