Update runbook docs for web01 reverse proxy and per-host ACME keys
Reflect web01 in the machines table and monitoring scrape list, note Grafana is now also published publicly via web01's reverse proxy, add the CNX Uptime dashboard, and document the dedicated acme_mx1/acme_web01 DNS-01 keys.
+18
-6
@@ -61,13 +61,25 @@ requires re-submitting the DS.
 
 ## ACME DNS-01
 
-A dedicated TSIG key (`acme_ddns`), scoped by `acl_acme` to `TXT` updates at or
-under `_acme-challenge.<zone>` on `ns1` only. Knot signs the record and transfers
-it to `ns2`, which never needs this key. Retrieve the client config with:
+Certificates are issued by `_acme-challenge` TXT updates that `ns1` accepts over
+TSIG, signs, and transfers to `ns2` (which never needs these keys). Each consumer
+gets its **own** key, scoped by an ACL to exactly the owner names it needs and
+attached only to the zone it lives in — so a leaked key can write nothing but its
+own challenges.
 
-```
-clan vars get ns1 dns-acme-tsig/acme.conf
-```
+- **`acme_ddns`** (`acl_acme`) — the general key, scoped to `TXT` at or under
+  `_acme-challenge.<zone>` and attached to every zone. Client config:
+  ```
+  clan vars get ns1 dns-acme-tsig/acme.conf
+  ```
+- **`acme_mx1`** (`acl_acme_mx1`) — held only by `mx1`, scoped to
+  `_acme-challenge.{mx1,mta-sts,mail}` and attached only to `cnx.email` (the mail
+  cert plus its MTA-STS and client-alias SANs). Secret shared via the
+  `dns-acme-mx1-secret` generator.
+- **`acme_web01`** (`acl_acme_web01`) — held only by `web01`, scoped to
+  `_acme-challenge` and attached only to `cnx.network` (where the wildcard
+  `*.cnx.network` challenge lands, at the apex). Secret shared via the
+  `dns-acme-web01-secret` generator.
 
 ## Runbook: stale secondary
 
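The per-key scoping described in the ACME DNS-01 hunk can be modelled in a few lines of Python. This is only an illustrative sketch, not Knot's ACL implementation or config schema: the `KeyScope` class, the `SCOPES` table, and `may_update` are hypothetical names, and it assumes the per-host keys match owner names exactly and that all three keys are limited to `TXT` records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KeyScope:
    """Illustrative model of one TSIG key's update scope (not Knot's schema)."""
    zones: Optional[frozenset]  # zones the ACL is attached to; None = every zone
    owners: tuple               # owner names relative to the zone apex
    subtree: bool               # True = "at or under" the owner; False = exact match


# Scopes as described in the runbook text. Exact matching for the
# per-host keys is an assumption of this sketch.
SCOPES = {
    "acme_ddns": KeyScope(None, ("_acme-challenge",), subtree=True),
    "acme_mx1": KeyScope(
        frozenset({"cnx.email"}),
        ("_acme-challenge.mx1", "_acme-challenge.mta-sts", "_acme-challenge.mail"),
        subtree=False,
    ),
    "acme_web01": KeyScope(
        frozenset({"cnx.network"}), ("_acme-challenge",), subtree=False
    ),
}


def may_update(key: str, zone: str, owner: str, rtype: str = "TXT") -> bool:
    """Return True if `key` would be allowed to update `owner` in `zone`."""
    scope = SCOPES[key]
    if rtype != "TXT":
        return False  # assumed: only TXT updates are ever permitted
    if scope.zones is not None and zone not in scope.zones:
        return False  # the ACL is not attached to this zone
    for rel in scope.owners:
        allowed = f"{rel}.{zone}"
        if owner == allowed or (scope.subtree and owner.endswith("." + allowed)):
            return True
    return False
```

Under this model, `may_update("acme_web01", "cnx.email", "_acme-challenge.cnx.email")` is false because that key is attached only to `cnx.network`, which is the "a leaked key can write nothing but its own challenges" property the text describes.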
+13
-7
@@ -1,6 +1,7 @@
 # Monitoring
 
-Metrics and dashboards live on `control`, reachable only over the ZeroTier mesh.
+Metrics and logs live on `control` over the ZeroTier mesh; the Grafana dashboards
+are also published publicly through `web01` (see [Dashboards](#dashboards)).
 
 ## Collection
 
@@ -18,8 +19,8 @@ Metrics and dashboards live on `control`, reachable only over the ZeroTier mesh.
 ## Storage & scraping
 
 **VictoriaMetrics** on `control`, bound to `127.0.0.1:8428`, 180-day retention
-(`modules/monitoring/server.nix`). It scrapes `control` over loopback and `ns1`/
-`ns2` over the mesh.
+(`modules/monitoring/server.nix`). It scrapes `control` over loopback and
+`ns1`/`ns2`/`mx1`/`web01` over the mesh.
 
 > The scraper dials IPv4-only by default, so mesh (IPv6) targets need
 > `extraOptions = [ "-enableTCP6" ]`. Without it, ns1/ns2 are dropped with
@@ -31,8 +32,10 @@ Metrics and dashboards live on `control`, reachable only over the ZeroTier mesh.
 
 ## Dashboards
 
-**Grafana** on `control` (`:3000`), mesh-only, anonymous access disabled. The
-admin password is a clan var:
+**Grafana** on `control` (`:3000`), anonymous access disabled. Reachable directly
+over the mesh, and publicly at `https://grafana.cnx.network` via `web01`'s reverse
+proxy (TLS termination — see [Overview](./overview.md)). The admin password is a
+clan var:
 
 ```
 clan vars get control grafana-admin/password
@@ -46,6 +49,9 @@ there is picked up):
   outside-in DNS probes.
 - **CNX Backups** (`backups.json`) — borgbackup job health, time since the last
   run, and per-job state. See [Backups](./backups.md).
+- **CNX Uptime** (`uptime.json`) — per-host up/down status, current uptime,
+  availability over the selected window, and up/down history. Label-driven, so
+  every scraped host appears automatically.
 
 ## Logs
 
@@ -53,8 +59,8 @@ there is picked up):
 (`modules/monitoring/server.nix`). All three hosts ship journald to it via
 systemd's own `services.journald.upload` → the `/insert/journald` endpoint
 (`modules/monitoring/exporters.nix`); no extra agent. `control` uploads over
-loopback so its logs survive a mesh outage, `ns1`/`ns2` push over the mesh, and
-9428 is firewall-scoped to the mesh like everything else.
+loopback so its logs survive a mesh outage, the other hosts push over the mesh,
+and 9428 is firewall-scoped to the mesh like everything else.
 
 > Same IPv4-only default as the scraper: VictoriaLogs binds `0.0.0.0:9428` for a
 > bare `:9428`, so mesh (IPv6) pushes from ns1/ns2 are refused until you pass
@@ -6,12 +6,13 @@ this book is built from `docs/` and served on `control` over the ZeroTier mesh.
 
 ## Machines
 
-| Machine   | Role                                  | Public IPv4      | Public IPv6             |
-| --------- | ------------------------------------- | ---------------- | ----------------------- |
-| `control` | ZeroTier controller, monitoring, docs | `77.42.68.181`   | `2a01:4f9:c013:e6d0::1` |
-| `ns1`     | Knot DNS **primary** (master)         | `46.224.170.206` | `2a01:4f8:c014:b5c5::1` |
-| `ns2`     | Knot DNS **secondary** (slave)        | `157.180.70.82`  | `2a01:4f9:c014:6d87::1` |
-| `mx1`     | Mail server (**MX** for cnx.email)    | `5.223.65.38`    | `2a01:4ff:2f0:1963::1`  |
+| Machine   | Role                                   | Public IPv4      | Public IPv6             |
+| --------- | -------------------------------------- | ---------------- | ----------------------- |
+| `control` | ZeroTier controller, monitoring, docs  | `77.42.68.181`   | `2a01:4f9:c013:e6d0::1` |
+| `ns1`     | Knot DNS **primary** (master)          | `46.224.170.206` | `2a01:4f8:c014:b5c5::1` |
+| `ns2`     | Knot DNS **secondary** (slave)         | `157.180.70.82`  | `2a01:4f9:c014:6d87::1` |
+| `mx1`     | Mail server (**MX** for cnx.email)     | `5.223.65.38`    | `2a01:4ff:2f0:1963::1`  |
+| `web01`   | Public reverse proxy (TLS termination) | `5.223.55.246`   | `2a01:4ff:2f0:2d8f::1`  |
 
 ## Access
 