cnx-network-clan

Author	SHA1	Message	Date
Berwn	86a2928825	update(inventory.json): Installed web01	2026-06-21 02:28:43 +07:00
Berwn	f6da01ba18	Add web01 to secret vars/shared/dns-acme-web01-secret/secret	2026-06-21 02:26:44 +07:00
Berwn	eeed40bcb5	Update vars via generator dns-acme-web01-rfc2136 for machine web01	2026-06-21 02:26:44 +07:00
Berwn	aac8f9d8e6	Update vars via generator dns-acme-web01-knot for machine ns1	2026-06-21 02:26:43 +07:00
Berwn	f5874bc337	Update vars via generator zerotier for machine web01	2026-06-21 02:26:33 +07:00
Berwn	2481d4bf92	Update vars via generator tor_tor for machine web01	2026-06-21 02:26:32 +07:00
Berwn	2d8096ee57	Update vars via generator state-version for machine web01	2026-06-21 02:26:30 +07:00
Berwn	1a4a749d78	Update vars via generator root-password for machine web01	2026-06-21 02:26:30 +07:00
Berwn	1c779d8013	Update vars via generator openssh for machine web01	2026-06-21 02:26:30 +07:00
Berwn	9c4e036b09	Update vars via generator emergency-access for machine web01	2026-06-21 02:26:30 +07:00
Berwn	8139b91fbc	Add machine web01 to secrets	2026-06-21 02:26:30 +07:00
Berwn	c436389619	Update secret web01-age.key	2026-06-21 02:26:29 +07:00
Berwn	9fc97e65b2	Update vars via generator dns-acme-web01-secret for machine ns1	2026-06-21 02:26:29 +07:00
Berwn	bd84bf7c85	Set disk schema of machine: web01 to single-disk	2026-06-21 02:25:24 +07:00
Berwn	848dc0dff7	machines/web01/facter.json: update hardware configuration	2026-06-21 02:23:00 +07:00
Berwn	95aff44f86	Add machine web01	2026-06-21 01:58:59 +07:00
Berwn	f42569e992	Add provisioned Grafana uptime dashboard for all hosts	2026-06-21 01:57:08 +07:00
Berwn	1dd3aadb97	Add mail.cnx.email client alias as a cert SAN. A mail.cnx.email CNAME (-> mx1.cnx.email) lets clients (Thunderbird etc.) use a friendly hostname for submission/IMAP. To avoid a TLS name mismatch the cert now carries mail.cnx.email as a SAN, so the acme_mx1 key is authorized to write _acme-challenge.mail too. The MX still points at mx1.cnx.email and --reuse-key keeps the DANE TLSA digest valid across the re-issue.	2026-06-18 15:01:03 +07:00
Berwn	dc21348727	Format drifted files to satisfy the treefmt flake-check gate. Pure formatting (nixfmt/prettier/yamlfmt); no behavior change. These files predate the current treefmt config and were failing nix flake check; reformatting them makes the gate green again.	2026-06-18 14:49:48 +07:00
Berwn	1cb6f39ea2	Add declarative SNM mail stack on mx1 with DNS-01, DANE, MTA-STS. mx1 runs Simple NixOS Mailserver (Postfix/Dovecot/Rspamd/OpenDKIM) for cnx.email. The TLS cert is obtained via ACME DNS-01 using a dedicated, scoped TSIG key (acme_mx1) that ns1 authorizes for only _acme-challenge.mx1 and _acme-challenge.mta-sts on the cnx.email zone, so the credential can write nothing else. Mailbox passwords are auto-minted by a clan vars generator (four-word passphrase + number). DANE TLSA (3 1 1) is published for _25._tcp.mx1; --reuse-key keeps the key digest stable across renewals. MTA-STS is enforced via a Caddy vhost serving the policy on :443 from the same cert (mta-sts SAN). Firewall opens 25/587/465/143/993/443; 80 stays closed.	2026-06-18 14:47:20 +07:00
Berwn	026a26dd53	Add ns1 to secret vars/shared/dns-acme-mx1-secret/secret	2026-06-18 14:11:40 +07:00
Berwn	7e5d50b260	Update vars via generator dns-acme-mx1-knot for machine ns1	2026-06-18 14:11:40 +07:00
Berwn	312de984c1	Update vars via generator dns-acme-rfc2136 for machine mx1	2026-06-18 14:11:40 +07:00
Berwn	d76aa8cc8d	Update vars via generator mail-passwd-postmaster-at-cnx-email for machine mx1	2026-06-18 14:11:36 +07:00
Berwn	0a78cad06e	Update vars via generator dns-acme-mx1-secret for machine mx1	2026-06-18 14:11:36 +07:00
Berwn	d1b24017aa	Use no-store for docs: epoch mtimes make revalidation serve stale	2026-06-18 12:24:38 +07:00
Berwn	77a18df257	Stop browsers serving stale docs by forcing revalidation	2026-06-18 12:19:42 +07:00
Berwn	a4fe2a7b3a	Document how to pull registrar DS records from Knot on ns1. Explain that key material is auto-managed in the KASP keystore under /var/lib/knot, and that the registrar DS is generated per zone with `sudo -u knot keymgr <zone> ds`.	2026-06-18 12:12:10 +07:00
Berwn	6e4178df04	Onboard mx1 mail host and factor out per-host public IPs. - Register mx1 in the inventory and as a direct-SSH `internet` host; give it a static public IPv6 (2a01:4ff:2f0:1963::1). - Point the cnx.email MX (plus SPF/DMARC) at mx1 and add its A record. - Bring mx1 into monitoring: import exporters, add it to the mesh map and the node scrape job so its host metrics and journald reach control. - Add a clan-mx1 Hetzner firewall: inbound SMTP + ZeroTier + ICMP, no public SSH (admin rides the mesh like the other hosts). 587/465/993 held for now. - Extract per-host public IPv4/IPv6 into modules/hosts.nix, consumed by clan.nix's internet hosts and each machine's cnx.staticIPv6, so each address is declared once instead of being duplicated across configs. - docs: add mx1 to the machines table.	2026-06-18 11:53:14 +07:00
Berwn	2c89ab913c	update(inventory.json): Installed mx1	2026-06-18 11:35:22 +07:00
Berwn	84c3eece58	Update vars via generator zerotier for machine mx1	2026-06-18 11:33:06 +07:00
Berwn	7f5227d2e2	Update vars via generator tor_tor for machine mx1	2026-06-18 11:33:06 +07:00
Berwn	ebf4efe5c9	Update vars via generator state-version for machine mx1	2026-06-18 11:33:04 +07:00
Berwn	64b7eb1934	Update vars via generator root-password for machine mx1	2026-06-18 11:33:04 +07:00
Berwn	e763d76ae9	Update vars via generator openssh for machine mx1	2026-06-18 11:33:03 +07:00
Berwn	b65f526ea2	Update vars via generator emergency-access for machine mx1	2026-06-18 11:33:03 +07:00
Berwn	3a0bc2dba4	Add machine mx1 to secrets	2026-06-18 11:33:03 +07:00
Berwn	6098fe9a3b	Update secret mx1-age.key	2026-06-18 11:33:03 +07:00
Berwn	8d9981ee5a	Set disk schema of machine: mx1 to single-disk	2026-06-18 11:32:33 +07:00
Berwn	afc2e997c0	machines/mx1/facter.json: update hardware configuration	2026-06-18 11:32:22 +07:00
Berwn	faaa7b66c0	Add machine mx1	2026-06-18 11:21:27 +07:00
Berwn	9c8a2abf3f	Bind VictoriaLogs on IPv6 so the mesh can ship journald to it. VictoriaLogs, like the VM scraper, is IPv4-only by default: ":9428" binds 0.0.0.0 only, so ns1/ns2 pushing journald over the IPv6 mesh got "connection refused" while control's own loopback (v4) upload worked. Add -enableTCP6 so it binds [::] (dual-stack), matching the flag already used for the scraper. Also simplify the systemd-journal-upload override to just startLimitIntervalSec=0 (retry forever / self-heal) and drop the SuccessExitStatus masking: a persistent sink failure should stay loud rather than be hidden behind a green deploy.	2026-06-17 17:27:56 +07:00
Berwn	0eb883061b	Keep systemd-journal-upload retrying instead of failing a deploy. The uploader exits when VictoriaLogs is unreachable. Upstream already sets Restart=always/RestartSec=3sec, but the default start-rate limit lets the unit give up permanently and trip switch-to-configuration when the sink is briefly down. Disable the limit (startLimitIntervalSec=0) so logging stays best-effort and never wedges a host or a deploy.	2026-06-17 17:09:30 +07:00
Berwn	d4a171640b	Add VictoriaLogs for centralized journald across all hosts. control runs VictoriaLogs (:9428, 30d, mesh-scoped) with a matching Grafana datasource. Each host ships journald via systemd's own journald.upload to the /insert/journald endpoint -- no extra agent. control uploads over loopback so its logs survive a mesh outage; ns1 and ns2 push over the mesh.	2026-06-17 16:53:52 +07:00
Berwn	c7b0f206c8	Alert on and chart blackbox DNS probe failures. DNSResolutionProbeFailed and DNSSECProbeFailed fire when an SOA or DNSKEY probe to a public nameserver address stays down for 5m. The CNX DNS dashboard gains a "DNS probes (outside-in)" row: per-zone/server status table, probe success, and probe latency.	2026-06-17 15:42:13 +07:00
Berwn	54f607d063	Add blackbox exporter for outside-in DNS probes. control runs blackbox_exporter on loopback, probing each nameserver's public v4+v6 address for every zone: SOA (zone served) and DNSKEY (still signed, since blackbox has no DO-bit option). Probe definitions are shared between the exporter config and the VictoriaMetrics scrape jobs so they can't drift. Verified live against ns1/ns2 over v4 and v6.	2026-06-17 15:37:45 +07:00
Berwn	0544bf95e5	Add vmalert rules for failed and stale backups. BackupJobFailed fires when a borgbackup job enters the systemd failed state; BackupStale fires when the daily timer has not run in over 26h (or has never run). Both read the node_exporter systemd collector on the backup client, matching the CNX Backups dashboard.	2026-06-17 15:17:12 +07:00
Berwn	1ea5bda23f	Add CNX Backups dashboard and document the backup setup. Grafana dashboard (auto-provisioned from the dashboards dir) tracks borgbackup job health, time since last run, and per-job systemd state from the node_exporter systemd collector on the client. New docs page covers the ns1 -> control topology, secrets flow, and restore commands.	2026-06-17 15:13:47 +07:00
Berwn	ed746b58c3	Update vars via generator borgbackup for machine ns1	2026-06-17 15:07:13 +07:00
Berwn	044891927b	Back up Knot DNSSEC keystore from ns1 to control via borgbackup. clan borgbackup instance: control serves repos, ns1 backs up its clan.core.state (the KASP keystore at /var/lib/knot) nightly over the mesh with repokey encryption. ns1 maps the control machine name to its ZeroTier address so the borg@control repo resolves. Run `clan vars generate ns1` before deploy to mint the borg keypair.	2026-06-17 15:06:58 +07:00
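
Note on commits 1cb6f39ea2 and 1dd3aadb97: both rely on --reuse-key keeping the DANE TLSA (3 1 1) record valid across certificate re-issues. A minimal sketch of why that holds, assuming the standard meaning of the three fields (usage 3 = DANE-EE, selector 1 = SubjectPublicKeyInfo, matching type 1 = SHA-256): the record data is a hash of the public key only, so a new certificate over the same key yields the same digest. The function name `tlsa_3_1_1` and the placeholder byte strings are hypothetical and are not taken from the repository; real input would be the DER-encoded SubjectPublicKeyInfo extracted from the certificate.

```python
import hashlib


def tlsa_3_1_1(spki_der: bytes) -> str:
    """Build TLSA record data for usage 3, selector 1, matching type 1.

    spki_der is the DER-encoded SubjectPublicKeyInfo of the server key.
    The certificate's serial, validity dates and SANs are not hashed.
    """
    digest = hashlib.sha256(spki_der).hexdigest()
    return f"3 1 1 {digest}"


# Hypothetical placeholder bytes standing in for two different public keys.
current_key = b"placeholder-spki-current"
rotated_key = b"placeholder-spki-rotated"

# Re-issue with the same key (what --reuse-key does): record is unchanged.
same_after_reissue = tlsa_3_1_1(current_key) == tlsa_3_1_1(current_key)

# Re-issue with a fresh key: record changes and DNS must be updated first.
same_after_rotation = tlsa_3_1_1(current_key) == tlsa_3_1_1(rotated_key)

print(same_after_reissue, same_after_rotation)
```

This is also why adding mail.cnx.email as a SAN did not require touching the _25._tcp.mx1 TLSA record: the SAN list lives in the certificate, outside the hashed SubjectPublicKeyInfo.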


123 Commits