5 Commits

Author SHA1 Message Date
Berwn 9c8a2abf3f Bind VictoriaLogs on IPv6 so the mesh can ship journald to it
VictoriaLogs, like the VM scraper, is IPv4-only by default: ":9428" binds
0.0.0.0 only, so ns1/ns2 pushing journald over the IPv6 mesh got "connection
refused" while control's own loopback (v4) upload worked. Add -enableTCP6 so it
binds [::] (dual-stack), matching the flag already used for the scraper.

Also simplify the systemd-journal-upload override to just startLimitIntervalSec=0
(retry forever / self-heal) and drop the SuccessExitStatus masking: a persistent
sink failure should stay loud rather than be hidden behind a green deploy.
2026-06-17 17:27:56 +07:00
Berwn 0eb883061b Keep systemd-journal-upload retrying instead of failing a deploy
The uploader exits when VictoriaLogs is unreachable. Upstream already sets
Restart=always/RestartSec=3sec, but the default start-rate limit lets the unit
give up permanently and trip switch-to-configuration when the sink is briefly
down. Disable the limit (startLimitIntervalSec=0) so logging stays best-effort
and never wedges a host or a deploy.
2026-06-17 17:09:30 +07:00
Berwn d4a171640b Add VictoriaLogs for centralized journald across all hosts
control runs VictoriaLogs (:9428, 30d, mesh-scoped) with a matching
Grafana datasource. Each host ships journald via systemd's own
journald.upload to the /insert/journald endpoint -- no extra agent.
control uploads over loopback so its logs survive a mesh outage; ns1
and ns2 push over the mesh.
2026-06-17 16:53:52 +07:00
Berwn 848c4ec47d Read mesh host map from clan zerotier vars instead of hardcoding
The control/ns1/ns2 mesh IPs and the /88 subnet were duplicated literals in
mesh-hosts.nix. clan-core's zerotier generator already writes each machine's IP
as a public var (vars/per-machine/<m>/zerotier/zerotier-ip), so read from there
and derive the subnet from zerotier-network-id. Pure refactor: the rendered
values are identical and the system derivation hash is unchanged.
2026-06-17 11:53:56 +07:00
Berwn 33ac7e106b Add VictoriaMetrics + Grafana DNS monitoring over the mesh
control runs VictoriaMetrics (loopback) and Grafana; every machine exports
node metrics and the nameservers export Knot stats (mod-stats + knot-exporter).
Scraping and the Grafana UI ride the ZeroTier mesh only, scoped by nftables to
the mesh /88; the public side stays closed by the Hetzner cloud firewall. The
provisioned DNS dashboard includes a per-zone SOA serial table to catch
primary/secondary drift. ZeroTier ULAs are centralised in mesh-hosts.nix.
2026-06-17 10:17:27 +07:00