Commit Graph

22 Commits

Author SHA1 Message Date
Berwn 4c7c74836d Add vmalert alerting rules for DNS and host health
vmalert on control evaluates rules (declared in git) against VictoriaMetrics and
remote-writes alert state back, so firing alerts show as the ALERTS series in
Grafana. Covers SOA divergence between ns1/ns2, secondary zone expiry, scrape
target down, and root disk full. No notifier yet (notifier.blackhole). Also adds
TODO.md roadmap.
2026-06-17 14:49:32 +07:00
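The SOA divergence alert described above comes down to comparing, per zone, the serial each nameserver reports. The sketch below shows only that comparison in Python. It is not the vmalert rule from the repository (that rule is declared in git and evaluated against VictoriaMetrics); the function name and the `(zone, server, serial)` sample shape are assumptions made for illustration.

```python
from collections import defaultdict


def diverged_zones(samples):
    """Return the sorted zones whose servers report different SOA serials.

    `samples` is an iterable of (zone, server, serial) tuples; this shape
    is hypothetical and stands in for the scraped per-zone serial metric.
    """
    serials_by_zone = defaultdict(dict)
    for zone, server, serial in samples:
        serials_by_zone[zone][server] = serial
    # A zone has diverged when its servers do not all agree on one serial.
    return sorted(
        zone
        for zone, per_server in serials_by_zone.items()
        if len(set(per_server.values())) > 1
    )


samples = [
    ("cnx.network", "ns1", 100),
    ("cnx.network", "ns2", 100),
    ("cnx.email", "ns1", 200),
    ("cnx.email", "ns2", 199),
]
print(diverged_zones(samples))
```

A real rule would also need a hold time so that the brief window between a reload on ns1 and the transfer to ns2 does not fire the alert.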
Berwn a7d4c0e567 Add mdBook infra runbook served by Caddy on control
Docs live in docs/ (DNS, ZeroTier mesh, monitoring), built at Nix-build time and
served as static files over the ZeroTier mesh on control:8080. Commit-to-edit:
change the markdown and redeploy to publish.
2026-06-17 14:26:21 +07:00
Berwn 33ac7e106b Add VictoriaMetrics + Grafana DNS monitoring over the mesh
control runs VictoriaMetrics (loopback) and Grafana; every machine exports
node metrics and the nameservers export Knot stats (mod-stats + knot-exporter).
Scraping and the Grafana UI ride the ZeroTier mesh only, scoped by nftables to
the mesh /88; the public side stays closed by the Hetzner cloud firewall. The
provisioned DNS dashboard includes a per-zone SOA serial table to catch
primary/secondary drift. ZeroTier ULAs are centralised in mesh-hosts.nix.
2026-06-17 10:17:27 +07:00
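Scoping access to the mesh /88 means accepting a connection only when its source address falls inside that prefix. The check itself can be sketched with Python's `ipaddress` module. The prefix below is a made-up placeholder, not the real ZeroTier ULA range from mesh-hosts.nix, and `from_mesh` is a hypothetical helper; the actual enforcement is done by nftables.

```python
import ipaddress

# Placeholder /88 prefix for illustration only; the real mesh prefix
# lives in mesh-hosts.nix and is not reproduced here.
MESH_PREFIX = ipaddress.ip_network("fd12:3456:789a:bcde:f099:9300::/88")


def from_mesh(source):
    """Return True when `source` (an IP address string) is inside the mesh prefix."""
    return ipaddress.ip_address(source) in MESH_PREFIX


print(from_mesh("fd12:3456:789a:bcde:f099:93aa:bbcc:ddee"))
print(from_mesh("2001:db8::1"))
```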
Berwn aa604bda9a Switch ns1 zone serial-policy to unixtime
dateserial (YYYYMMDDnn) only has a 2-digit same-day counter held in Knot's
journal; a journal reset restarted the counter and let ns1 mint a serial ns2
had already seen with older content, so ns2 never retransferred. unixtime is
strictly monotonic per reload, eliminating the shared-serial collision.
2026-06-16 18:59:45 +07:00
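The collision described above can be reproduced with a few lines of arithmetic. This is a simplified model, not Knot's implementation: the helper names are hypothetical, and the transfer check ignores RFC 1982 serial wraparound.

```python
import datetime


def dateserial(day, counter):
    """Build a YYYYMMDDnn serial from a date and a 2-digit same-day counter."""
    return int(day.strftime("%Y%m%d")) * 100 + counter


def unixtime_serial(moment):
    """Build a serial from the Unix timestamp of a timezone-aware datetime."""
    return int(moment.timestamp())


def secondary_will_transfer(primary_serial, secondary_serial):
    """A secondary only pulls the zone when the primary's serial is newer.

    Simplified: plain integer comparison, no serial-number wraparound.
    """
    return primary_serial > secondary_serial


day = datetime.date(2026, 6, 16)

# ns2 has already transferred the second change of the day.
seen_by_ns2 = dateserial(day, 2)

# The journal is reset, the counter restarts, and new content goes out
# under a serial that is not newer than what ns2 holds.
after_reset = dateserial(day, 1)
print(secondary_will_transfer(after_reset, seen_by_ns2))

# With unixtime, a later reload always yields a larger serial.
utc = datetime.timezone.utc
first_reload = unixtime_serial(datetime.datetime(2026, 6, 16, 12, 0, 0, tzinfo=utc))
later_reload = unixtime_serial(datetime.datetime(2026, 6, 16, 12, 5, 0, tzinfo=utc))
print(secondary_will_transfer(later_reload, first_reload))
```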
Berwn e795960dcf Configure static public IPv6 on control, ns1, ns2 2026-06-16 18:04:33 +07:00
Berwn de7d950596 Format tree with treefmt 2026-06-16 16:53:00 +07:00
kurogeek 3302b70485 clan.core.sops.defaultGroups to all machines 2026-06-16 16:46:55 +07:00
Berwn a3482face5 Allow ACME DNS-01 dynamic updates on ns1
Add a dedicated acme_ddns TSIG key (scoped to ns1 only) and an acl_acme rule
that limits it to TXT updates at or under _acme-challenge.<zone>. An external
ACME client can now write challenge records via RFC 2136; Knot signs them and
transfers to ns2, which never holds the key.
2026-06-14 17:12:17 +07:00
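The acl_acme restriction amounts to two conditions: the record type is TXT, and the owner name is at or under _acme-challenge.<zone>. The sketch below models that decision in Python, following the wording of the commit message literally. It is not Knot's ACL code, and `acme_update_allowed` is a hypothetical name.

```python
def acme_update_allowed(owner, rrtype, zone):
    """Return True when a dynamic update would pass the described rule.

    Only TXT records whose owner equals, or is a subdomain of,
    _acme-challenge.<zone> are accepted.
    """
    owner = owner.rstrip(".").lower()
    base = "_acme-challenge." + zone.rstrip(".").lower()
    at_or_under = owner == base or owner.endswith("." + base)
    return rrtype.upper() == "TXT" and at_or_under


print(acme_update_allowed("_acme-challenge.cnx.network.", "TXT", "cnx.network"))
print(acme_update_allowed("cnx.network.", "TXT", "cnx.network"))
print(acme_update_allowed("_acme-challenge.cnx.network.", "A", "cnx.network"))
```

Under this reading, a name such as _acme-challenge.www.cnx.network is rejected because it sits beside, not under, _acme-challenge.cnx.network.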
Berwn dc51cfbdb5 Enable DNSSEC and automatic SOA serials on the DNS zones
ns1 (primary) now signs every zone with an ECDSA P-256/SHA-256 policy and
manages the SOA serial itself: zonefile-load = difference-no-serial (with
journal-content = all) plus serial-policy = dateserial let records be edited
without bumping the serial by hand. ns2 needs no change; it transfers the
already-signed zone.

Also point the ns1/ns2 AAAA glue at the public Hetzner IPv6 addresses; they
previously pointed at unroutable ZeroTier mesh ULAs.
2026-06-14 16:27:30 +07:00
Berwn 5864054b00 Move Hetzner firewall rules into a separate data file
Extract the per-firewall rule data out of control's configuration into
modules/hetzner-firewall-rules.nix, imported like the DNS domains list.
The evaluated rules are unchanged.
2026-06-14 15:49:00 +07:00
Berwn 344f432640 Add Hetzner Cloud firewall auto-sync from clan config
control runs a oneshot on each deploy that creates each firewall if
missing and replaces its rules via the Hetzner API set_rules action,
using a Read/Write token stored as a clan secret. Public SSH is not
exposed; admin access rides the ZeroTier mesh, with emergency-access as
the console fallback.
2026-06-14 15:40:05 +07:00
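The create-if-missing, then replace-rules flow can be sketched as a pure planning step that makes no API calls. The action tuples, the function name, and the rule dictionaries below are illustrative assumptions, not the oneshot's actual code or the exact Hetzner request format.

```python
def plan_firewall_sync(existing_names, desired):
    """Plan the sync as an ordered list of actions.

    `existing_names` is the set of firewall names already present;
    `desired` maps each firewall name to its full list of rules.
    Every firewall gets a set_rules action, so running the plan again
    yields the same end state.
    """
    actions = []
    for name, rules in desired.items():
        if name not in existing_names:
            actions.append(("create", name))
        # Replace the whole rule set instead of diffing individual rules.
        actions.append(("set_rules", name, rules))
    return actions


# Hypothetical rule data for illustration only.
desired = {
    "dns": [{"direction": "in", "protocol": "udp", "port": "53"}],
    "control": [],
}
for action in plan_firewall_sync({"control"}, desired):
    print(action)
```

Replacing the full rule set on every deploy keeps the declared configuration as the single source of truth, at the cost of overwriting any rules edited by hand in the Hetzner console.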
Berwn 306a2cf61e Set per-machine timezones and enable NTP
control and ns2 use UTC+3 (Etc/GMT-3), ns1 uses UTC+1 (Etc/GMT-1). Both are
fixed offsets with no DST. Make systemd-timesyncd explicit on all three.
2026-06-14 15:02:34 +07:00
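The Etc/GMT zone names use the POSIX sign convention, which is inverted relative to the usual UTC notation: Etc/GMT-3 is three hours ahead of UTC. A small hypothetical helper makes the mapping explicit. It only parses the name and does not consult the system time zone database.

```python
import re


def etc_gmt_utc_offset_hours(zone):
    """Return the UTC offset in hours for a signed Etc/GMT zone name.

    The sign in the name is inverted: Etc/GMT-3 means UTC+3.
    """
    match = re.fullmatch(r"Etc/GMT([+-])(\d{1,2})", zone)
    if match is None:
        raise ValueError(f"not a signed Etc/GMT zone name: {zone!r}")
    hours = int(match.group(2))
    return -hours if match.group(1) == "+" else hours


print(etc_gmt_utc_offset_hours("Etc/GMT-3"))
print(etc_gmt_utc_offset_hours("Etc/GMT-1"))
```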
Berwn 807785cdab Add authoritative DNS on ns1/ns2 and finalize clan config
- Knot authoritative DNS: ns1 primary, ns2 secondary serving cnx.network,
  buildfor.life and cnx.email over TSIG-secured zone transfer (modules/dns)
- Knot listens publicly + over ZeroTier; firewall opens port 53
- Complete clan inventory: name/domain, admin SSH key, control as the
  zerotier controller, tor on all nixos machines
- Enable age yubikey/fido2-hmac secret plugins
2026-06-14 13:24:23 +07:00
Berwn a40c4d1800 Set disk schema of machine: ns2 to single-disk 2026-06-14 13:19:56 +07:00
Berwn 2a0bdc4a4b Set disk schema of machine: ns1 to single-disk 2026-06-14 13:19:44 +07:00
Berwn 840b3ca407 machines/ns2/facter.json: update hardware configuration 2026-06-14 13:18:41 +07:00
Berwn d757dc3c52 machines/ns1/facter.json: update hardware configuration 2026-06-14 13:16:11 +07:00
Berwn bf65146a62 Set disk schema of machine: control to single-disk 2026-06-14 12:29:39 +07:00
Berwn 8938637c28 machines/control/facter.json: update hardware configuration 2026-06-14 12:27:20 +07:00
Berwn 7d02499c0e Add machine ns2 2026-06-14 12:14:12 +07:00
Berwn bda1854376 Add machine ns1 2026-06-14 12:14:10 +07:00
Berwn a86525d37c Add machine control 2026-06-14 12:14:07 +07:00