10 Commits

Author SHA1 Message Date
Berwn 60db8c60b0 Add parsedmarc DMARC report analyzer on control
Deliver cnx.email DMARC aggregate/forensic reports to a dedicated dmarc@cnx.email
mailbox on mx1 and analyze them with parsedmarc on control, storing parsed
reports in a local loopback Elasticsearch and visualizing via the auto-provisioned
Grafana dashboard. parsedmarc fetches the mailbox over IMAPS across the mesh
(mx1.cnx.email pinned to its mesh address so TLS still validates), using a shared
mail-dmarc-cred clan var so mx1's mailserver and control see the same password.
2026-06-21 03:27:23 +07:00
Berwn b8bea27a9c Update runbook docs for web01 reverse proxy and per-host ACME keys
Reflect web01 in the machines table and monitoring scrape list, note Grafana is
now also published publicly via web01's reverse proxy, add the CNX Uptime
dashboard, and document the dedicated acme_mx1/acme_web01 DNS-01 keys.
2026-06-21 03:17:51 +07:00
Berwn dc21348727 Format drifted files to satisfy the treefmt flake-check gate
Pure formatting (nixfmt/prettier/yamlfmt); no behavior change. These
files predate the current treefmt config and were failing nix flake
check; reformatting them makes the gate green again.
2026-06-18 14:49:48 +07:00
Berwn a4fe2a7b3a Document how to pull registrar DS records from Knot on ns1
Explain that key material is auto-managed in the KASP keystore under
/var/lib/knot, and that the registrar DS is generated per zone with
`sudo -u knot keymgr <zone> ds`.
2026-06-18 12:12:10 +07:00
Berwn 6e4178df04 Onboard mx1 mail host and factor out per-host public IPs
- Register mx1 in the inventory and as a direct-SSH `internet` host; give it
  a static public IPv6 (2a01:4ff:2f0:1963::1).
- Point the cnx.email MX (plus SPF/DMARC) at mx1 and add its A record.
- Bring mx1 into monitoring: import exporters, add it to the mesh map and the
  node scrape job so its host metrics and journald reach control.
- Add a clan-mx1 Hetzner firewall: inbound SMTP + ZeroTier + ICMP, no public
  SSH (admin rides the mesh like the other hosts). 587/465/993 held for now.
- Extract per-host public IPv4/IPv6 into modules/hosts.nix, consumed by
  clan.nix's internet hosts and each machine's cnx.staticIPv6, so each address
  is declared once instead of being duplicated across configs.
- docs: add mx1 to the machines table.
2026-06-18 11:53:14 +07:00
Berwn 9c8a2abf3f Bind VictoriaLogs on IPv6 so the mesh can ship journald to it
VictoriaLogs, like the VM scraper, is IPv4-only by default: ":9428" binds
0.0.0.0 only, so ns1/ns2 pushing journald over the IPv6 mesh got "connection
refused" while control's own loopback (v4) upload worked. Add -enableTCP6 so it
binds [::] (dual-stack), matching the flag already used for the scraper.

Also simplify the systemd-journal-upload override to just startLimitIntervalSec=0
(retry forever / self-heal) and drop the SuccessExitStatus masking: a persistent
sink failure should stay loud rather than be hidden behind a green deploy.
2026-06-17 17:27:56 +07:00
Berwn d4a171640b Add VictoriaLogs for centralized journald across all hosts
control runs VictoriaLogs (:9428, 30d, mesh-scoped) with a matching
Grafana datasource. Each host ships journald via systemd's own
journald.upload to the /insert/journald endpoint -- no extra agent.
control uploads over loopback so its logs survive a mesh outage; ns1
and ns2 push over the mesh.
2026-06-17 16:53:52 +07:00
Berwn 54f607d063 Add blackbox exporter for outside-in DNS probes
control runs blackbox_exporter on loopback, probing each nameserver's
public v4+v6 address for every zone: SOA (zone served) and DNSKEY (still
signed, since blackbox has no DO-bit option). Probe definitions are
shared between the exporter config and the VictoriaMetrics scrape jobs
so they can't drift. Verified live against ns1/ns2 over v4 and v6.
2026-06-17 15:37:45 +07:00
Berwn 1ea5bda23f Add CNX Backups dashboard and document the backup setup
Grafana dashboard (auto-provisioned from the dashboards dir) tracks
borgbackup job health, time since last run, and per-job systemd state
from the node_exporter systemd collector on the client. New docs page
covers the ns1 -> control topology, secrets flow, and restore commands.
2026-06-17 15:13:47 +07:00
Berwn a7d4c0e567 Add mdBook infra runbook served by Caddy on control
Docs live in docs/ (DNS, ZeroTier mesh, monitoring), built at Nix-build time and
served as static files over the ZeroTier mesh on control:8080. Commit-to-edit:
change the markdown and redeploy to publish.
2026-06-17 14:26:21 +07:00