# Grafana Matrix Alerting Setup

## Overview
Configure Grafana to send water level alerts directly to Matrix channels when thresholds are exceeded.

## Prerequisites
- Grafana instance with your PostgreSQL data source
- Matrix account and access token
- Matrix room for alerts

## Step 1: Configure Matrix Contact Point

1. **In Grafana, go to Alerting → Contact Points**
2. **Add new contact point:**
   ```
   Name: matrix-water-alerts
   Integration: Webhook
   URL: https://matrix.org/_matrix/client/v3/rooms/!ROOM_ID:matrix.org/send/m.room.message
   HTTP Method: POST
   ```

3. **Add Headers:**
   ```
   Authorization: Bearer YOUR_MATRIX_ACCESS_TOKEN
   Content-Type: application/json
   ```

4. **Message Template:**
   ```json
   {
     "msgtype": "m.text",
     "body": "🌊 WATER ALERT: {{ .CommonLabels.alertname }}\n\nStation: {{ .CommonLabels.station_code }}\nLevel: {{ .CommonAnnotations.water_level }}m\nStatus: {{ .CommonLabels.severity }}\n\nTime: {{ .CommonAnnotations.time }}"
   }
   ```

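For comparison with the webhook settings above, the same message can be sent from a script. The Matrix Client-Server specification defines message sending as a `PUT` to `/send/{eventType}/{txnId}` with a client-chosen transaction ID, so this sketch builds that request using only the standard library. The function name `build_matrix_request`, the placeholder room ID, and the placeholder token are assumptions for illustration and are not taken from this project's code.

```python
import json
import urllib.parse
import urllib.request
import uuid


def build_matrix_request(homeserver, room_id, access_token, body, txn_id=None):
    """Build (but do not send) a PUT request that posts an m.text message."""
    # A unique transaction ID lets the homeserver de-duplicate retries.
    txn_id = txn_id or uuid.uuid4().hex
    # Room IDs contain '!' and ':', so percent-encode them for the URL path.
    room = urllib.parse.quote(room_id, safe="")
    url = f"{homeserver}/_matrix/client/v3/rooms/{room}/send/m.room.message/{txn_id}"
    payload = json.dumps({"msgtype": "m.text", "body": body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; replace with your own room ID and token.
req = build_matrix_request(
    "https://matrix.org",
    "!ROOM_ID:matrix.org",
    "YOUR_MATRIX_ACCESS_TOKEN",
    "WATER ALERT: test message",
    txn_id="test-1",
)
print(req.get_method(), req.full_url)
# Sending is one more call: urllib.request.urlopen(req, timeout=10)
```

Building the request separately from sending it makes the URL and payload easy to inspect before any message reaches the room.
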
## Step 2: Create Alert Rules

### High Water Level Alert
```yaml
Rule Name: high-water-level
Query: water_level > 6.0
Condition: IS ABOVE 6.0 FOR 5m
Labels:
  - severity: critical
  - station_code: "{{ $labels.station_code }}"
Annotations:
  - water_level: "{{ $values.A.Value }}"  # assumes the query's RefID is A
  - summary: "Critical water level at {{ $labels.station_code }}"
```

### Low Water Level Alert
```yaml
Rule Name: low-water-level
Query: water_level < 1.0
Condition: IS BELOW 1.0 FOR 10m
Labels:
  - severity: warning
  - station_code: "{{ $labels.station_code }}"
```

### Data Gap Alert
```yaml
Rule Name: data-gap
Query: increase(measurements_total[1h]) == 0
Condition: IS EQUAL TO 0 FOR 30m
Labels:
  - severity: warning
  - issue: data-gap
```

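The `FOR 5m` and `FOR 10m` parts of the conditions above mean a single reading past the threshold is not enough: the breach has to persist for the whole period. The sketch below shows roughly that logic for a custom check. It is an illustration under stated assumptions, not Grafana's evaluator; the function name `breached_for` and the sample readings are hypothetical.

```python
from datetime import datetime, timedelta


def breached_for(samples, threshold, duration, above=True):
    """Return True if every sample in the trailing `duration` breaches the threshold.

    `samples` is a list of (timestamp, value) pairs sorted oldest first.
    """
    if not samples:
        return False
    latest = samples[-1][0]
    # The history must reach back at least `duration`, otherwise we cannot know.
    if latest - samples[0][0] < duration:
        return False
    window = [value for ts, value in samples if ts >= latest - duration]
    if above:
        return all(value > threshold for value in window)
    return all(value < threshold for value in window)


# Hypothetical one-minute readings in metres.
start = datetime(2025, 9, 26, 14, 0)
levels = [5.8, 5.9, 6.1, 6.2, 6.2, 6.3, 6.4, 6.5]
samples = [(start + timedelta(minutes=i), level) for i, level in enumerate(levels)]

print(breached_for(samples, 6.0, timedelta(minutes=5)))  # last 5 minutes all above 6.0
print(breached_for(samples, 6.0, timedelta(minutes=7)))  # 7-minute window includes 5.8
```
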
## Step 3: Matrix Setup

### Get Matrix Access Token
```bash
curl -X POST https://matrix.org/_matrix/client/v3/login \
  -H "Content-Type: application/json" \
  -d '{
    "type": "m.login.password",
    "user": "your_username",
    "password": "your_password"
  }'
```

### Create Alert Room
```bash
curl -X POST "https://matrix.org/_matrix/client/v3/createRoom" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Water Level Alerts - Northern Thailand",
    "topic": "Automated alerts for Ping River water monitoring",
    "preset": "trusted_private_chat"
  }'
```

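Both calls return JSON: the login response carries an `access_token` field and the room creation response carries a `room_id` field, which are the two values Step 1 needs. A minimal sketch of pulling them out follows. The response strings are made-up examples, and the helper name `extract` and the `MATRIX_ACCESS_TOKEN` / `MATRIX_ROOM_ID` variable names are assumptions rather than names confirmed by this project.

```python
import json

# Hypothetical example responses; real values come from your homeserver.
login_response = (
    '{"user_id": "@alerts:matrix.org", "access_token": "syt_example", '
    '"device_id": "ABCDEF"}'
)
create_room_response = '{"room_id": "!abc123:matrix.org"}'


def extract(field, raw_json):
    """Return one field from a JSON response, raising a clear error if absent."""
    data = json.loads(raw_json)
    if field not in data:
        # Matrix error responses carry "errcode" and "error" instead.
        raise KeyError(f"{field} missing; server said: {data.get('error', data)}")
    return data[field]


access_token = extract("access_token", login_response)
room_id = extract("room_id", create_room_response)
print(f"MATRIX_ACCESS_TOKEN={access_token}")
print(f"MATRIX_ROOM_ID={room_id}")
```
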
## Example Alert Queries

### Critical Water Levels
```promql
# High water alert (dots escaped so they match literally)
water_level{station_code=~"P\\.1|P\\.4A|P\\.20"} > 6.0

# Dangerous discharge
discharge{station_code=~".*"} > 500

# Rapid level change (water_level is a gauge, so use delta() rather than increase())
delta(water_level[15m]) > 0.5
```

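The rapid level change query compares the newest reading with the value at the start of a 15 minute window. For a custom check outside Grafana, the same idea can be sketched as below. The function name `level_change` and the readings are hypothetical, and taking the oldest sample inside the window as the baseline is a simplifying assumption.

```python
from datetime import datetime, timedelta


def level_change(samples, window):
    """Difference between the newest value and the oldest value inside `window`.

    `samples` is a list of (timestamp, value) pairs sorted oldest first.
    """
    if not samples:
        return 0.0
    latest_ts, latest_value = samples[-1]
    in_window = [value for ts, value in samples if ts >= latest_ts - window]
    return latest_value - in_window[0]


# Hypothetical readings in metres, ten minutes apart.
t0 = datetime(2025, 9, 26, 14, 0)
readings = [
    (t0, 5.0),
    (t0 + timedelta(minutes=10), 5.25),
    (t0 + timedelta(minutes=20), 6.0),
]
change = level_change(readings, timedelta(minutes=15))
print(change > 0.5)
```
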
### System Health
```promql
# No data received
up{job="water-monitor"} == 0

# Old data
(time() - timestamp) > 7200
```

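The old data check treats any measurement older than 7200 seconds (two hours) as stale. A minimal sketch of the same freshness test in a script follows. The function name `is_stale` and the fixed clock value are assumptions for illustration.

```python
import time


def is_stale(last_measurement_ts, max_age_seconds=7200, now=None):
    """True when the newest measurement is older than `max_age_seconds`."""
    now = time.time() if now is None else now
    return (now - last_measurement_ts) > max_age_seconds


now = 1_758_897_000  # fixed Unix timestamp so the example is reproducible
print(is_stale(now - 3 * 3600, now=now))  # three hours old
print(is_stale(now - 600, now=now))       # ten minutes old
```

Passing `now` explicitly keeps the check testable; in production the default of `time.time()` would be used.
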
## Alert Notification Format

Your Matrix messages will look like this (the Discharge, Trend, and Location lines need annotations beyond those used in the Step 1 template):
```
🌊 WATER ALERT: High Water Level

Station: P.1 (Chiang Mai)
Level: 6.2m (CRITICAL)
Discharge: 450 cms
Status: DANGER

Time: 2025-09-26 14:30:00
Trend: Rising (+0.3m in 30min)

📍 Location: 18.7883°N, 98.9853°E
```

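When the message body is assembled in a script rather than a Grafana template, optional lines such as discharge can be included only when the data exists. The sketch below shows one way to do that. The function name `format_alert` and the dictionary keys are hypothetical, and it covers only a subset of the lines shown above.

```python
def format_alert(alert):
    """Render an alert dict as the plain-text body of a Matrix message."""
    lines = [
        f"🌊 WATER ALERT: {alert['alertname']}",
        "",
        f"Station: {alert['station']}",
        f"Level: {alert['level_m']:.1f}m",
        f"Status: {alert['status']}",
    ]
    # Optional line, placed between Level and Status when discharge is known.
    if "discharge_cms" in alert:
        lines.insert(4, f"Discharge: {alert['discharge_cms']} cms")
    lines += ["", f"Time: {alert['time']}"]
    return "\n".join(lines)


body = format_alert({
    "alertname": "High Water Level",
    "station": "P.1 (Chiang Mai)",
    "level_m": 6.2,
    "discharge_cms": 450,
    "status": "DANGER",
    "time": "2025-09-26 14:30:00",
})
print(body)
```
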
## Advanced Features

### Escalation Rules
```yaml
# Send to different rooms based on severity
- if: severity == "critical"
  receiver: matrix-emergency
- if: severity == "warning"
  receiver: matrix-alerts
- if: time_of_day() outside "08:00-20:00"
  receiver: matrix-night-duty
```

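The escalation rules above can be read as a first-match routing table. The sketch below expresses them in that form. It assumes the rules are evaluated in the order listed, so the night-duty rule only applies to alerts that are neither critical nor warning; the function name `pick_receiver` is hypothetical.

```python
from datetime import time


def pick_receiver(severity, local_time):
    """Return the receiver for an alert, or None to fall back to the default."""
    if severity == "critical":
        return "matrix-emergency"
    if severity == "warning":
        return "matrix-alerts"
    # Outside 08:00-20:00 local time counts as night duty.
    if not (time(8, 0) <= local_time < time(20, 0)):
        return "matrix-night-duty"
    return None


print(pick_receiver("critical", time(3, 0)))
print(pick_receiver("info", time(23, 0)))
```

If critical alerts at night should also reach the night-duty room, the time check would need to run first or the function would need to return several receivers.
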
### Rate Limiting
```yaml
group_wait: 5m
group_interval: 10m
repeat_interval: 30m
```

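A custom alerting script needs its own equivalent of `repeat_interval`, otherwise an alert that stays active is re-sent on every check. The sketch below shows a minimal in-memory version. The class name `RepeatLimiter` and the alert key format are assumptions, and state is lost on restart.

```python
from datetime import datetime, timedelta


class RepeatLimiter:
    """Suppress repeat notifications for the same alert key within `repeat_interval`."""

    def __init__(self, repeat_interval=timedelta(minutes=30)):
        self.repeat_interval = repeat_interval
        self._last_sent = {}

    def should_send(self, key, now):
        last = self._last_sent.get(key)
        if last is not None and now - last < self.repeat_interval:
            return False
        # Record the send time only when a notification actually goes out.
        self._last_sent[key] = now
        return True


limiter = RepeatLimiter()
t0 = datetime(2025, 9, 26, 14, 30)
print(limiter.should_send("high-water-level:P.1", t0))
print(limiter.should_send("high-water-level:P.1", t0 + timedelta(minutes=10)))
print(limiter.should_send("high-water-level:P.1", t0 + timedelta(minutes=31)))
```
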
## Testing Alerts

1. **Test Contact Point** - Use Grafana's test button
2. **Simulate Alert** - Manually trigger with test data
3. **Verify Matrix** - Check message formatting and delivery

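## Troubleshooting

### Common Issues
- **401 Unauthorized or 403 Forbidden**: Check the Matrix access token (401), and confirm the account has joined the room and may post in it (403)
- **Room not found**: Verify the room ID format; use the internal room ID starting with `!`, not the `#alias`
- **No alerts**: Check query syntax and thresholds
- **Spam**: Tune the grouping and repeat intervals described under Rate Limiting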
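
Matrix error responses are JSON objects with an `errcode` field, which makes the issues above easy to tell apart in a script. The sketch below maps some standard error codes to hints. The error codes are defined by the Matrix specification, while the hint wording and the function name `explain` are assumptions for illustration.

```python
import json

# Hints keyed by standard Matrix error codes; the wording of the hints is ours.
HINTS = {
    "M_UNKNOWN_TOKEN": "Access token is invalid or expired; log in again.",
    "M_FORBIDDEN": "Account may not post here; check that it has joined the room.",
    "M_NOT_FOUND": "Room or endpoint not found; verify the room ID.",
    "M_LIMIT_EXCEEDED": "Rate limited; increase grouping and repeat intervals.",
}


def explain(response_body):
    """Map a Matrix JSON error body to a troubleshooting hint."""
    try:
        errcode = json.loads(response_body).get("errcode", "")
    except (ValueError, AttributeError):
        return "Response was not a Matrix JSON error; check the URL."
    return HINTS.get(errcode, f"Unrecognised error code: {errcode or 'none'}")


print(explain('{"errcode": "M_FORBIDDEN", "error": "User not in room"}'))
```
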