Bind VictoriaLogs on IPv6 so the mesh can ship journald to it

VictoriaLogs, like the VM scraper, is IPv4-only by default: ":9428" binds 0.0.0.0 only, so ns1/ns2 pushing journald over the IPv6 mesh got "connection refused" while control's own loopback (v4) upload worked. Add -enableTCP6 so it binds [::] (dual-stack), matching the flag already used for the scraper.

Also simplify the systemd-journal-upload override to just startLimitIntervalSec=0 (retry forever / self-heal) and drop the SuccessExitStatus masking: a persistent sink failure should stay loud rather than be hidden behind a green deploy.
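
The bind check described above can be automated. The following is a minimal sketch, not part of this commit: `classify_bind` and its return labels are hypothetical names, and it assumes the usual `ss -tln` column layout (state, two queue columns, then the local address). It also treats a local address of `*` as an IPv6 wildcard, on the assumption that some `ss` versions print a dual-stack listener that way instead of `[::]`.

```python
def classify_bind(ss_output: str, port: int) -> str:
    """Classify how `port` is bound, given the text output of `ss -tln`.

    Hypothetical helper for illustration; the labels are not from any tool.
    """
    local_addrs = set()
    for line in ss_output.splitlines():
        fields = line.split()
        # Listening rows look like: LISTEN <recv-q> <send-q> <local> <peer> ...
        # The header row and non-listening rows are skipped.
        if len(fields) < 5 or fields[0] != "LISTEN":
            continue
        # Split "addr:port" on the last colon so "[::]:9428" parses correctly.
        addr, _, row_port = fields[3].rpartition(":")
        if row_port == str(port):
            local_addrs.add(addr)

    if not local_addrs:
        return "not-listening"
    # Assumption: "*" is how some ss versions show a dual-stack wildcard.
    if local_addrs & {"[::]", "*"}:
        return "ipv6-wildcard"
    if "0.0.0.0" in local_addrs:
        return "ipv4-only"
    return "specific-address"
```

On `control`, the input would be the captured output of `ss -tln`; a result of `ipv4-only` for port 9428 corresponds to the failure mode this commit fixes.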
@@ -56,6 +56,14 @@ systemd's own `services.journald.upload` → the `/insert/journald` endpoint
 loopback so its logs survive a mesh outage, `ns1`/`ns2` push over the mesh, and
 9428 is firewall-scoped to the mesh like everything else.
 
+> Same IPv4-only default as the scraper: VictoriaLogs binds `0.0.0.0:9428` for a
+> bare `:9428`, so mesh (IPv6) pushes from ns1/ns2 are refused until you pass
+> `extraOptions = [ "-enableTCP6" ]` (binds `[::]`). Verify the bind on `control`:
+>
+> ```
+> ss -tlnp | grep 9428 # want [::]:9428, not 0.0.0.0:9428
+> ```
+
 Query logs from Grafana via the provisioned **VictoriaLogs** datasource (Explore
 view, LogsQL), or directly in the built-in UI at `http://[control]:9428/select/vmui`.
 Logs are tagged with `_HOSTNAME` and `_SYSTEMD_UNIT`, so to follow one service
@@ -103,11 +103,12 @@ in
       "http://${dest}/insert/journald";
   };
 
-  # systemd-journal-upload exits if the sink is unreachable. The upstream module
-  # already sets Restart=always/RestartSec=3sec, but the default start-rate limit
-  # (5 tries / 10s) still lets the unit give up permanently and fail a deploy when
-  # VictoriaLogs is briefly down. Logging is best-effort: disable the limit so it
-  # retries forever instead of wedging the host (or switch-to-configuration).
+  # systemd-journal-upload exits if the sink is unreachable. Upstream already
+  # restarts it (Restart=always/RestartSec=3sec), but the default start-rate limit
+  # (5 tries / 10s) lets it give up permanently — so a transient VictoriaLogs
+  # outage leaves the uploader dead until the next deploy. Disable the limit so it
+  # retries forever and self-heals once the sink returns. (A persistent failure
+  # still surfaces loudly in a deploy, which is what we want.)
   systemd.services.systemd-journal-upload.startLimitIntervalSec = 0;
 
   # Scrape ports reachable only from the ZeroTier mesh.
@@ -69,7 +69,14 @@ in
   services.victorialogs = {
     enable = true;
     listenAddress = ":${toString logsPort}";
-    extraOptions = [ "-retentionPeriod=30d" ];
+    # -enableTCP6: like the scraper above, VictoriaLogs is IPv4-only by default
+    # for *listening* too — ":9428" binds 0.0.0.0 only, so ns1/ns2 pushing over
+    # the IPv6 mesh get "connection refused". This makes it bind [::] (dual-stack)
+    # so the mesh can reach it. Retention has no dedicated NixOS option.
+    extraOptions = [
+      "-retentionPeriod=30d"
+      "-enableTCP6"
+    ];
   };
 
   # Admin password generated once and stored as a clan secret. Retrieve with: