# Infra roadmap

Prioritized backlog for the cnx-network clan. See `docs/` for how the current pieces work.

## 1. Alerting (done, pending deploy)

Rules evaluated by vmalert against VictoriaMetrics on control, declared in `modules/monitoring/alerts.nix`:

- [x] SOA serial divergence between ns1 and ns2 (secondary out of sync)
- [x] Zone-expiry countdown on the secondary approaching zero (transfers failing)
- [x] Any scrape target down (`up == 0`)
- [x] Root filesystem nearly full

Delivery stays minimal for now (`notifier.blackhole`): vmalert remote-writes alert state back to VM, so firing alerts show up as the `ALERTS` series in Grafana. Wiring a real notifier (Matrix) is a later step: drop `blackhole` and set `settings."notifier.url"` to an Alertmanager.

## 2. Backups of critical state

- [ ] DNSSEC key material on ns1 (KSK/ZSK in Knot's KASP store); losing it forces an emergency DS rollover at the registrar
- [ ] VictoriaMetrics TSDB on control (optional, retention is 180d)

## 3. Blackbox DNS probing

- [ ] `blackbox_exporter` on control doing real DNS + DNSSEC-validation queries against ns1/ns2; catches outside-in resolution failures the Knot stats miss

## 4. Third secondary off Hetzner (resilience)

- [ ] A secondary nameserver on a different provider/network so a single-provider outage doesn't take all authoritative DNS down (architectural, new machine)

## 5. Centralized logs

- [ ] VictoriaLogs on control to grep journald across all three hosts, pairing with the existing VictoriaMetrics setup
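
For orientation, the two generic rules from section 1 (scrape target down, root filesystem nearly full) and the blackhole delivery could look roughly like the sketch below. This is not the contents of `modules/monitoring/alerts.nix`. It assumes the non-instanced `services.vmalert` NixOS options (the layout may differ between nixpkgs versions), VictoriaMetrics on `127.0.0.1:8428`, and node_exporter metrics for the filesystem rule. Group names, alert names, thresholds, and `for` durations are illustrative. The SOA-serial and zone-expiry rules are left out because their expressions depend on the metric names the Knot exporter exposes.

```nix
# Hypothetical sketch; not the contents of modules/monitoring/alerts.nix.
{ ... }:
{
  services.vmalert = {
    enable = true;

    settings = {
      # Assumed address: VictoriaMetrics on its default port on the same host.
      "datasource.url" = "http://127.0.0.1:8428";
      # Write alert state back so firing alerts appear as the ALERTS series.
      "remoteWrite.url" = "http://127.0.0.1:8428";
      # No real notifier yet; notifications are discarded.
      "notifier.blackhole" = true;
    };

    rules.groups = [
      {
        name = "hosts"; # hypothetical group name
        rules = [
          {
            alert = "TargetDown";
            expr = "up == 0";
            for = "5m"; # assumed grace period
            labels.severity = "critical";
            annotations.summary = "{{ $labels.instance }} is not being scraped";
          }
          {
            alert = "RootFsNearlyFull";
            # Assumes node_exporter metrics; the 10% threshold is illustrative.
            expr = ''
              node_filesystem_avail_bytes{mountpoint="/"}
                / node_filesystem_size_bytes{mountpoint="/"} < 0.10
            '';
            for = "15m";
            labels.severity = "warning";
            annotations.summary = "Root filesystem on {{ $labels.instance }} is below 10% free";
          }
        ];
      }
    ];
  };
}
```

Moving to a real notifier later would mean removing the `"notifier.blackhole"` line and setting `"notifier.url"` to the Alertmanager address, as described in section 1.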