Update runbook docs for web01 reverse proxy and per-host ACME keys

Reflect web01 in the machines table and monitoring scrape list, note Grafana is
now also published publicly via web01's reverse proxy, add the CNX Uptime
dashboard, and document the dedicated acme_mx1/acme_web01 DNS-01 keys.
Berwn
2026-06-21 03:17:51 +07:00
parent 415a050f6a
commit b8bea27a9c
3 changed files with 38 additions and 19 deletions
+18 -6
@@ -61,13 +61,25 @@ requires re-submitting the DS.
## ACME DNS-01
A dedicated TSIG key (`acme_ddns`), scoped by `acl_acme` to `TXT` updates at or
under `_acme-challenge.<zone>` on `ns1` only. Knot signs the record and transfers
it to `ns2`, which never needs this key. Retrieve the client config with:
Certificates are validated by `_acme-challenge` TXT updates that `ns1` accepts over
TSIG, signs, and transfers to `ns2` (which never needs these keys). Each consumer
gets its **own** key, scoped by an ACL to exactly the owner names it needs and
attached only to the zone it lives in — so a leaked key can write nothing but its
own challenges.
```
clan vars get ns1 dns-acme-tsig/acme.conf
```
- **`acme_ddns`** (`acl_acme`) — the general key, scoped to `TXT` at or under
`_acme-challenge.<zone>` and attached to every zone. Client config:
```
clan vars get ns1 dns-acme-tsig/acme.conf
```
- **`acme_mx1`** (`acl_acme_mx1`) — held only by `mx1`, scoped to
`_acme-challenge.{mx1,mta-sts,mail}` and attached only to `cnx.email` (the mail
cert plus its MTA-STS and client-alias SANs). Secret shared via the
`dns-acme-mx1-secret` generator.
- **`acme_web01`** (`acl_acme_web01`) — held only by `web01`, scoped to
`_acme-challenge` and attached only to `cnx.network` (where the wildcard
`*.cnx.network` challenge lands, at the apex). Secret shared via the
`dns-acme-web01-secret` generator.
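
The scoping of the two dedicated keys can be sketched as a lookup table plus a check. This is an illustrative model only, not Knot's ACL implementation: the `KeyScope` type and `may_update` function are hypothetical names, and it assumes that the dedicated keys are limited to `TXT` like the general key, that their owner names are matched exactly, and that the `acme_mx1` names are relative to `cnx.email`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyScope:
    """Hypothetical model of one TSIG key's scope: one zone, fixed owner names."""
    zone: str
    owners: frozenset


# Assumption: owner names are written fully qualified, without the trailing dot.
SCOPES = {
    "acme_mx1": KeyScope(
        zone="cnx.email",
        owners=frozenset({
            "_acme-challenge.mx1.cnx.email",
            "_acme-challenge.mta-sts.cnx.email",
            "_acme-challenge.mail.cnx.email",
        }),
    ),
    "acme_web01": KeyScope(
        zone="cnx.network",
        owners=frozenset({"_acme-challenge.cnx.network"}),
    ),
}


def may_update(key: str, zone: str, owner: str, rtype: str) -> bool:
    """Return True only if this key may write this record in this zone."""
    scope = SCOPES.get(key)
    if scope is None:
        return False  # unknown key
    if rtype.upper() != "TXT":
        return False  # assumed: challenge keys may only touch TXT records
    if zone.rstrip(".").lower() != scope.zone:
        return False  # key is not attached to this zone
    # Exact match on the normalised owner name (an assumption of this sketch).
    return owner.rstrip(".").lower() in scope.owners
```

Under this model a leaked `acme_web01` key is rejected for anything in `cnx.email`, and a leaked `acme_mx1` key is rejected for any owner other than its three challenge names.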
## Runbook: stale secondary
+13 -7
@@ -1,6 +1,7 @@
# Monitoring
Metrics and dashboards live on `control`, reachable only over the ZeroTier mesh.
Metrics and logs live on `control` over the ZeroTier mesh; the Grafana dashboards
are also published publicly through `web01` (see [Dashboards](#dashboards)).
## Collection
@@ -18,8 +19,8 @@ Metrics and dashboards live on `control`, reachable only over the ZeroTier mesh.
## Storage & scraping
**VictoriaMetrics** on `control`, bound to `127.0.0.1:8428`, 180-day retention
(`modules/monitoring/server.nix`). It scrapes `control` over loopback and `ns1`/
`ns2` over the mesh.
(`modules/monitoring/server.nix`). It scrapes `control` over loopback and
`ns1`/`ns2`/`mx1`/`web01` over the mesh.
> The scraper dials IPv4-only by default, so mesh (IPv6) targets need
> `extraOptions = [ "-enableTCP6" ]`. Without it, ns1/ns2 are dropped with
@@ -31,8 +32,10 @@ Metrics and dashboards live on `control`, reachable only over the ZeroTier mesh.
## Dashboards
**Grafana** on `control` (`:3000`), mesh-only, anonymous access disabled. The
admin password is a clan var:
**Grafana** on `control` (`:3000`), anonymous access disabled. Reachable directly
over the mesh, and publicly at `https://grafana.cnx.network` via `web01`'s reverse
proxy (TLS termination — see [Overview](./overview.md)). The admin password is a
clan var:
```
clan vars get control grafana-admin/password
@@ -46,6 +49,9 @@ there is picked up):
outside-in DNS probes.
- **CNX Backups** (`backups.json`) — borgbackup job health, time since the last
run, and per-job state. See [Backups](./backups.md).
- **CNX Uptime** (`uptime.json`) — per-host up/down status, current uptime,
availability over the selected window, and up/down history. Label-driven, so
every scraped host appears automatically.
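
As a rough illustration of what "availability over the selected window" and "label-driven" mean here, the sketch below groups up/down samples by a host label and reports the fraction that were up. The dashboard's actual query is not shown in this document, so the calculation, the function name `availability_by_host`, and the `instance` label name are all assumptions.

```python
from collections import defaultdict


def availability_by_host(samples, label="instance"):
    """Fraction of samples with value 1, grouped by the given label.

    `samples` is an iterable of (labels, value) pairs, where `labels` is a
    dict and `value` is 1 for up and 0 for down. The label name is a
    hypothetical placeholder for whatever label identifies a host.
    """
    counts = defaultdict(lambda: [0, 0])  # host -> [up samples, total samples]
    for labels, value in samples:
        host = labels[label]
        if value == 1:
            counts[host][0] += 1
        counts[host][1] += 1
    return {host: up / total for host, (up, total) in counts.items()}


# Example window: ns1 missed one of two scrapes, web01 missed none.
window = [
    ({"instance": "ns1"}, 1),
    ({"instance": "ns1"}, 0),
    ({"instance": "web01"}, 1),
]
result = availability_by_host(window)
```

Because hosts are discovered from the labels in the data, a newly scraped host shows up in the result without any change to the function, which is the property the dashboard description relies on.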
## Logs
@@ -53,8 +59,8 @@ there is picked up):
(`modules/monitoring/server.nix`). All three hosts ship journald to it via
systemd's own `services.journald.upload` → the `/insert/journald` endpoint
(`modules/monitoring/exporters.nix`); no extra agent. `control` uploads over
loopback so its logs survive a mesh outage, `ns1`/`ns2` push over the mesh, and
9428 is firewall-scoped to the mesh like everything else.
loopback so its logs survive a mesh outage, the other hosts push over the mesh,
and 9428 is firewall-scoped to the mesh like everything else.
> Same IPv4-only default as the scraper: VictoriaLogs binds `0.0.0.0:9428` for a
> bare `:9428`, so mesh (IPv6) pushes from ns1/ns2 are refused until you pass
+7 -6
@@ -6,12 +6,13 @@ this book is built from `docs/` and served on `control` over the ZeroTier mesh.
## Machines
| Machine | Role | Public IPv4 | Public IPv6 |
| --------- | ------------------------------------- | ---------------- | ----------------------- |
| `control` | ZeroTier controller, monitoring, docs | `77.42.68.181` | `2a01:4f9:c013:e6d0::1` |
| `ns1` | Knot DNS **primary** (master) | `46.224.170.206` | `2a01:4f8:c014:b5c5::1` |
| `ns2` | Knot DNS **secondary** (slave) | `157.180.70.82` | `2a01:4f9:c014:6d87::1` |
| `mx1` | Mail server (**MX** for cnx.email) | `5.223.65.38` | `2a01:4ff:2f0:1963::1` |
| Machine | Role | Public IPv4 | Public IPv6 |
| --------- | -------------------------------------- | ---------------- | ----------------------- |
| `control` | ZeroTier controller, monitoring, docs | `77.42.68.181` | `2a01:4f9:c013:e6d0::1` |
| `ns1` | Knot DNS **primary** (master) | `46.224.170.206` | `2a01:4f8:c014:b5c5::1` |
| `ns2` | Knot DNS **secondary** (slave) | `157.180.70.82` | `2a01:4f9:c014:6d87::1` |
| `mx1` | Mail server (**MX** for cnx.email) | `5.223.65.38` | `2a01:4ff:2f0:1963::1` |
| `web01` | Public reverse proxy (TLS termination) | `5.223.55.246` | `2a01:4ff:2f0:2d8f::1` |
## Access