Add mdBook infra runbook served by Caddy on control

Docs live in docs/ (DNS, ZeroTier mesh, monitoring), built at Nix-build time and
served as static files over the ZeroTier mesh on control:8080. Commit-to-edit:
change the markdown and redeploy to publish.
Berwn
2026-06-17 14:26:21 +07:00
parent 3a8fe660a5
commit a7d4c0e567
8 changed files with 221 additions and 0 deletions
@@ -0,0 +1,59 @@
# DNS
Authoritative DNS for three zones, served by Knot:
- `cnx.network`
- `buildfor.life`
- `cnx.email`

Add a zone in `modules/dns/domains.nix` **and** drop a matching `<domain>.zone`
file in `modules/dns/zones/`.
## Primary / secondary
- **`ns1` = primary (master).** Loads each zone from its file, signs it, and
notifies `ns2`. Config in `machines/ns1/configuration.nix`.
- **`ns2` = secondary (slave).** Pulls every zone from `ns1` (AXFR/IXFR) and
accepts its NOTIFY. Config in `machines/ns2/configuration.nix`.

Zone transfers run **over the ZeroTier mesh**, authenticated with a shared TSIG
key (`dns-tsig`, a clan var copied to both machines).
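
A quick way to confirm that a change on `ns1` has reached `ns2` is to compare the SOA serial each server reports. The sketch below is a hypothetical helper, not part of this repo; it assumes `dig` is installed and that you pass the two servers' mesh addresses yourself.

```shell
# soa_serial: read `dig +short SOA <zone>` output on stdin and print the
# serial, which is the third field of the SOA record.
soa_serial() {
    awk 'NR == 1 { print $3 }'
}

# compare_serials ZONE NS1_ADDR NS2_ADDR (hypothetical helper)
# Prints both serials and succeeds only when they are non-empty and equal.
compare_serials() {
    zone=$1
    s1=$(dig +short SOA "$zone" @"$2" | soa_serial)
    s2=$(dig +short SOA "$zone" @"$3" | soa_serial)
    echo "ns1=$s1 ns2=$s2"
    [ -n "$s1" ] && [ "$s1" = "$s2" ]
}
```

For example, `compare_serials cnx.network <ns1-address> <ns2-address>` exits non-zero while `ns2` is still behind.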
## Serial handling
`ns1` uses `zonefile-load = difference-no-serial` with `serial-policy = unixtime`,
so records can be edited without touching the SOA serial: Knot diffs the file,
assigns a strictly monotonic unixtime serial, signs, and transfers.
`journal-content = all` keeps the live signed zone in the journal (required by
`difference-no-serial`).
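
To illustrate why the serial stays monotonic even when two edits land within the same second, here is a minimal sketch of the selection rule as described in Knot's documentation for `serial-policy: unixtime`: use the current time, or increment the old serial when the clock would not move it forward. This is an illustration only, not Knot's code, and it ignores serial-number wraparound.

```shell
# next_serial OLD NOW (illustrative helper, not part of Knot)
# Prints the serial a unixtime-style policy would choose.
next_serial() {
    old=$1
    now=$2
    if [ "$now" -gt "$old" ]; then
        # The clock is ahead of the zone: use the current unix time.
        echo "$now"
    else
        # Same second, or the zone is ahead of the clock: bump by one.
        echo $((old + 1))
    fi
}
```

Calling `next_serial <current-serial> "$(date +%s)"` shows what the next edit would be stamped with under this rule.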
## DNSSEC
Automatic signing on `ns1` only, policy `cnx`: ECDSA P-256/SHA-256. The ZSK
auto-rolls; the KSK is kept stable, so the DS at the registrar only changes on a
manual KSK rollover.
> **Pending (manual):** submit DS records for `buildfor.life` and `cnx.email`
> once they're at a DNSSEC-capable registrar.
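
A small check, sketched here as a hypothetical helper, can confirm that the published KSK uses the expected algorithm. ECDSA P-256/SHA-256 is DNSSEC algorithm number 13, and KSKs carry the flags value 257 in the DNSKEY record.

```shell
# ksk_algorithms: read `dig +short DNSKEY <zone>` output on stdin and print
# the algorithm number of every key whose flags field is 257 (KSK).
# Each input line has the form: <flags> <protocol> <algorithm> <key data>
ksk_algorithms() {
    awk '$1 == 257 { print $3 }'
}
```

Running `dig +short DNSKEY cnx.network @<ns1-address> | ksk_algorithms` should print `13`. To see whether a parent zone already publishes a DS, `dig +short DS buildfor.life` returns nothing until the registrar has it. Knot's `keymgr` has a `ds` subcommand for generating the records to submit; check its man page for the invocation that matches this deployment.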
## ACME DNS-01
ACME clients use a dedicated TSIG key (`acme_ddns`), which `acl_acme` restricts
to `TXT` updates at or under `_acme-challenge.<zone>`, and only on `ns1`. Knot
signs the record and transfers it to `ns2`, which never needs this key. Retrieve
the client config with:
```
clan vars get ns1 dns-acme-tsig/acme.conf
```
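
ACME clients normally send the dynamic update themselves, but for manual testing it can help to see what such an update looks like. The sketch below prints an `nsupdate` script for a challenge record. The helper name, the server address, and the token are placeholders, and whether `acme.conf` can be passed directly to `nsupdate -k` depends on its format, which is an assumption here.

```shell
# acme_txt_update SERVER ZONE TOKEN [TTL] (hypothetical helper)
# Prints an nsupdate script that adds a DNS-01 challenge TXT record.
acme_txt_update() {
    server=$1
    zone=${2%.}          # accept the zone with or without a trailing dot
    token=$3
    ttl=${4:-60}
    printf 'server %s\n' "$server"
    printf 'zone %s.\n' "$zone"
    printf 'update add _acme-challenge.%s. %s TXT "%s"\n' "$zone" "$ttl" "$token"
    printf 'send\n'
}
```

The output would be piped to something like `nsupdate -k <keyfile>`. An update for any other name or record type should be refused by the ACL.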
## Runbook: stale secondary
If `ns2` serves stale records while SOA serials match (e.g. after a manual zone
edit that didn't bump the serial as expected), force a fresh transfer on `ns2`:
```
knotc zone-retransfer <zone>
```
Watch the **CNX DNS** Grafana dashboard: the per-nameserver SOA serial table
should agree across `ns1`/`ns2`, and "seconds until zone expiry" on the secondary
should reset on each successful transfer rather than counting toward zero.
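
Because this failure mode leaves the serials equal, comparing serials will not reveal it. One way to confirm the problem before the retransfer, and the fix afterwards, is to compare an actual record set on both servers. A minimal sketch, assuming `dig` is available and that the helper names and server addresses are your own:

```shell
# same_rrset A B: succeed when two `dig +short` outputs contain the same
# records, ignoring the order in which they were returned.
same_rrset() {
    [ "$(printf '%s\n' "$1" | sort)" = "$(printf '%s\n' "$2" | sort)" ]
}

# check_record NAME TYPE NS1_ADDR NS2_ADDR (hypothetical helper)
# Queries both servers directly and reports whether their answers agree.
check_record() {
    a=$(dig +short "$1" "$2" @"$3")
    b=$(dig +short "$1" "$2" @"$4")
    if same_rrset "$a" "$b"; then
        echo "in sync: $1 $2"
    else
        echo "MISMATCH: $1 $2"
        return 1
    fi
}
```

Pick a record you know changed recently, for example `check_record <name> A <ns1-address> <ns2-address>`.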