Surfaces vmalert's firing ALERTS series as a table at the top of the dashboard,
so the minimal-delivery alerts are visible at a glance. Existing panels shift
down by one row.
vmalert on control evaluates rules (declared in git) against VictoriaMetrics and
remote-writes alert state back, so firing alerts show as the ALERTS series in
Grafana. Covers SOA divergence between ns1/ns2, secondary zone expiry, scrape
target down, and root disk full. No notifier yet (notifier.blackhole). Also adds
TODO.md roadmap.
Docs live in docs/ (DNS, ZeroTier mesh, monitoring), built at Nix-build time and
served as static files over the ZeroTier mesh on control:8080. Commit-to-edit:
change the markdown and redeploy to publish.
The control/ns1/ns2 mesh IPs and the /88 subnet were duplicated literals in
mesh-hosts.nix. clan-core's zerotier generator already writes each machine's IP
as a public var (vars/per-machine/<m>/zerotier/zerotier-ip), so read from there
and derive the subnet from zerotier-network-id. Pure refactor: the rendered
values are identical and the system derivation hash is unchanged.
control runs VictoriaMetrics (loopback) and Grafana; every machine exports
node metrics and the nameservers export Knot stats (mod-stats + knot-exporter).
Scraping and the Grafana UI ride the ZeroTier mesh only, scoped by nftables to
the mesh /88; the public side stays closed by the Hetzner cloud firewall. The
provisioned DNS dashboard includes a per-zone SOA serial table to catch
primary/secondary drift. ZeroTier ULAs are centralised in mesh-hosts.nix.
ns1 (primary) now signs every zone with an ECDSA P-256/SHA-256 policy and
manages the SOA serial itself: zonefile-load = difference-no-serial (with
journal-content = all) plus serial-policy = dateserial let records be edited
without bumping the serial by hand. ns2 needs no change; it transfers the
already-signed zone.
Also point the ns1/ns2 AAAA glue at the public Hetzner IPv6 addresses; they
previously pointed at unroutable ZeroTier mesh ULAs.
Extract the per-firewall rule data out of control's configuration into
modules/hetzner-firewall-rules.nix, imported like the DNS domains list.
The evaluated rules are unchanged.
control runs a oneshot on each deploy that creates each firewall if
missing and replaces its rules via the Hetzner API set_rules action,
using a Read/Write token stored as a clan secret. Public SSH is not
exposed; admin access rides the ZeroTier mesh, with emergency-access as
the console fallback.
knotd runs as the "knot" user, so the shared TSIG key file needs
owner/group knot — it was root-only and knot couldn't read it.
systemd-resolved's stub listener was holding port 53, so knot's
0.0.0.0@53 / ::@53 TCP bind failed. Disable the stub (resolution
still works via nss-resolve) to free the port.
- Knot authoritative DNS: ns1 primary, ns2 secondary serving cnx.network,
buildfor.life and cnx.email over TSIG-secured zone transfer (modules/dns)
- Knot listens publicly + over ZeroTier; firewall opens port 53
- Complete clan inventory: name/domain, admin SSH key, control as the
zerotier controller, tor on all nixos machines
- Enable age yubikey/fido2-hmac secret plugins