Why I Run Traefik With Static Config in Production
Dynamic config looks flexible on the slide deck, but every outage I've traced in the last year came from a CRD reconcile loop. So I'm defaulting to static config now — and this post walks through why.
The pitch, and its quiet cost
Dynamic providers let Traefik discover services the moment they appear: label a deployment, and the IngressRoute materialises. Beautiful in staging; brittle in prod. Every controller you wire up adds:
- A reconcile loop that cannot be dry-run from the operator's laptop.
- A fan-in of changes from every namespace a serviceaccount can see.
- A surface for "the operator typed the wrong label" to become "every request 502'd for 90 seconds."
The pattern I've landed on
A static `traefik.yml` plus a file provider watching `/etc/traefik/dynamic/*.yml`. Every route for every service is in a YAML file that lives next to the systemd unit of the service that owns it. Making a change means:
1. Edit the YAML file in the repo that owns the service.
2. Open a PR. The diff is reviewable: each file is scoped to one host.
3. `make deploy` rsyncs the YAML onto the file-provider directory. Traefik hot-reloads.
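The deploy step stays deliberately small. A sketch of what such a Make target might look like, assuming the route files live in a `routes/` directory in the service repo and the host path from above (both names are illustrative, not from the original setup):

```make
# Sketch only: "routes/" and "traefik-host" are assumed names.
# --delete keeps the provider directory in sync with the repo,
# so removing a file in git removes the route on deploy.
deploy:
	rsync -av --delete routes/ traefik-host:/etc/traefik/dynamic/
```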
```yaml
http:
  routers:
    api:
      rule: Host(`api.example.com`) && PathPrefix(`/api/v1`)
      service: api
      tls:
        certResolver: letsencrypt
  services:
    api:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:8200"
```
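The static side that loads these files is only a few lines. A minimal sketch of the `traefik.yml`, assuming the directory from above; the entrypoint name, ACME email, and storage path are illustrative assumptions, not taken from the original setup:

```yaml
# traefik.yml (static). Entrypoint name, email, and storage path are assumptions.
entryPoints:
  websecure:
    address: ":443"

providers:
  file:
    directory: /etc/traefik/dynamic
    watch: true   # hot-reload when a route file changes

certificatesResolvers:
  letsencrypt:
    acme:
      email: ops@example.com
      storage: /etc/traefik/acme.json
      tlsChallenge: {}
```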
What I keep dynamic
Two things earn dynamic config: cert rotation (ACME via Let's Encrypt) and rate-limit config for the auth endpoint. Everything else is declarative YAML under git.
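Even the rate-limit piece can live in the same file-provider directory as middleware config. A sketch using Traefik's `rateLimit` middleware; the middleware name, host, and numbers are illustrative assumptions:

```yaml
# dynamic/auth.yml (sketch). Names and limits here are assumptions, not the real values.
http:
  middlewares:
    auth-ratelimit:
      rateLimit:
        average: 10   # sustained requests per second
        burst: 20     # short spikes allowed above the average
  routers:
    auth:
      rule: Host(`auth.example.com`)
      middlewares:
        - auth-ratelimit
      service: auth
```

Because the file provider watches the directory, tightening the limit is the same workflow as any other route change: edit, PR, deploy.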
The concrete wins
- A route review is a file diff, not a controller-state diff.
- The on-call pager went quiet for ingress-shaped incidents.
- Rolling back a bad route is `git revert && make deploy`, no `kubectl rollout undo` required.
Static config isn't sexy. It is, however, readable at 3am.