Initial commit: Northern Thailand Ping River Monitor v3.1.0
Some checks failed
Security & Dependency Updates / Dependency Security Scan (push) Successful in 29s
Security & Dependency Updates / Docker Security Scan (push) Failing after 53s
Security & Dependency Updates / License Compliance (push) Successful in 13s
Security & Dependency Updates / Check for Dependency Updates (push) Successful in 19s
Security & Dependency Updates / Code Quality Metrics (push) Successful in 11s
Security & Dependency Updates / Security Summary (push) Successful in 7s
Features:
- Real-time water level monitoring for Ping River Basin (16 stations)
- Coverage from Chiang Dao to Nakhon Sawan in Northern Thailand
- FastAPI web interface with interactive dashboard and station management
- Multi-database support (SQLite, MySQL, PostgreSQL, InfluxDB, VictoriaMetrics)
- Comprehensive monitoring with health checks and metrics collection
- Docker deployment with Grafana integration
- Production-ready architecture with enterprise-grade observability

CI/CD & Automation:
- Complete Gitea Actions workflows for CI/CD, security, and releases
- Multi-Python version testing (3.9-3.12)
- Multi-architecture Docker builds (amd64, arm64)
- Daily security scanning and dependency monitoring
- Automated documentation generation
- Performance testing and validation

Production Ready:
- Type safety with Pydantic models and comprehensive type hints
- Data validation layer with range checking and error handling
- Rate limiting and request tracking for API protection
- Enhanced logging with rotation, colors, and performance metrics
- Station management API for dynamic CRUD operations
- Comprehensive documentation and deployment guides

Technical Stack:
- Python 3.9+ with FastAPI and Pydantic
- Multi-database architecture with adapter pattern
- Docker containerization with multi-stage builds
- Grafana dashboards for visualization
- Gitea Actions for CI/CD automation
- Enterprise monitoring and alerting

Ready for deployment to B4L infrastructure!
docs/DATABASE_DEPLOYMENT_GUIDE.md (new file, 447 lines)
@@ -0,0 +1,447 @@
# Database Deployment Guide for Thailand Water Monitor

This guide covers deployment options for storing water monitoring data in production environments.

## 🏆 Recommendation Summary

| Database | Best For | Performance | Complexity | Cost |
|----------|----------|-------------|------------|------|
| **InfluxDB** | Time-series data, dashboards | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ |
| **VictoriaMetrics** | High-performance metrics | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| **PostgreSQL** | Complex queries, reliability | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
| **MySQL** | Familiar, existing infrastructure | ⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
## 1. InfluxDB Deployment (Recommended for Time-Series)

### Why InfluxDB?
- **Purpose-built** for time-series data
- **Excellent compression** (10:1 typical ratio)
- **Built-in retention policies** and downsampling
- **Great Grafana integration** for dashboards
- **High write throughput** (100k+ points/second)

### Docker Deployment

```yaml
# docker-compose.yml
version: '3.8'

services:
  influxdb:
    image: influxdb:1.8
    container_name: water_influxdb
    ports:
      - "8086:8086"
    volumes:
      - influxdb_data:/var/lib/influxdb
      - ./influxdb.conf:/etc/influxdb/influxdb.conf:ro
    environment:
      - INFLUXDB_DB=water_monitoring
      - INFLUXDB_ADMIN_USER=admin
      - INFLUXDB_ADMIN_PASSWORD=your_secure_password
      - INFLUXDB_USER=water_user
      - INFLUXDB_USER_PASSWORD=water_password
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: water_grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin_password
    restart: unless-stopped

volumes:
  influxdb_data:
  grafana_data:
```
### Environment Variables
```bash
# .env file
DB_TYPE=influxdb
INFLUX_HOST=localhost
INFLUX_PORT=8086
INFLUX_DATABASE=water_monitoring
INFLUX_USERNAME=water_user
INFLUX_PASSWORD=water_password
```

### InfluxDB Configuration
```toml
# influxdb.conf
[meta]
  dir = "/var/lib/influxdb/meta"

[data]
  dir = "/var/lib/influxdb/data"
  wal-dir = "/var/lib/influxdb/wal"

  # Optimize for time-series data
  cache-max-memory-size = "1g"
  cache-snapshot-memory-size = "25m"
  cache-snapshot-write-cold-duration = "10m"

  # Retention and compression
  compact-full-write-cold-duration = "4h"
  max-series-per-database = 1000000
  max-values-per-tag = 100000

[coordinator]
  write-timeout = "10s"
  max-concurrent-queries = 0
  query-timeout = "0s"

[retention]
  enabled = true
  check-interval = "30m"

[http]
  enabled = true
  bind-address = ":8086"
  auth-enabled = true
  max-body-size = 25000000
  max-concurrent-requests = 0
  max-enqueued-requests = 0
```
### Production Setup Commands
```bash
# Start services
docker-compose up -d

# Create retention policies
docker exec -it water_influxdb influx -username admin -password your_secure_password -execute "
CREATE RETENTION POLICY \"raw_data\" ON \"water_monitoring\" DURATION 90d REPLICATION 1 DEFAULT;
CREATE RETENTION POLICY \"downsampled\" ON \"water_monitoring\" DURATION 730d REPLICATION 1;
"

# Create continuous queries for downsampling
docker exec -it water_influxdb influx -username admin -password your_secure_password -execute "
CREATE CONTINUOUS QUERY \"downsample_hourly\" ON \"water_monitoring\"
BEGIN
  SELECT mean(water_level) AS water_level, mean(discharge) AS discharge, mean(discharge_percent) AS discharge_percent
  INTO \"downsampled\".\"water_data_hourly\"
  FROM \"water_data\"
  GROUP BY time(1h), station_code, station_name_en, station_name_th
END
"
```
## 2. VictoriaMetrics Deployment (High Performance)

### Why VictoriaMetrics?
- **Extremely fast** and resource-efficient
- **Better compression** than InfluxDB
- **Prometheus-compatible** API
- **Lower memory usage**
- **Built-in clustering**

### Docker Deployment
```yaml
# docker-compose.yml
version: '3.8'

services:
  victoriametrics:
    image: victoriametrics/victoria-metrics:latest
    container_name: water_victoriametrics
    ports:
      - "8428:8428"
    volumes:
      - vm_data:/victoria-metrics-data
    command:
      - '--storageDataPath=/victoria-metrics-data'
      - '--retentionPeriod=2y'
      - '--httpListenAddr=:8428'
      - '--maxConcurrentInserts=16'
    restart: unless-stopped

volumes:
  vm_data:
```

### Environment Variables
```bash
# .env file
DB_TYPE=victoriametrics
VM_HOST=localhost
VM_PORT=8428
```
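
Because VictoriaMetrics exposes an InfluxDB-compatible `/write` endpoint, the scraper needs nothing beyond an HTTP client to push data. A minimal sketch using `requests` (the measurement and label names are illustrative, not necessarily the monitor's actual schema):

```python
import requests

VM_WRITE_URL = "http://localhost:8428/write"  # InfluxDB line-protocol endpoint

def write_measurement(station_code: str, water_level: float, discharge: float) -> None:
    """Push one data point to VictoriaMetrics using InfluxDB line protocol."""
    line = (
        f"water_data,station_code={station_code} "
        f"water_level={water_level},discharge={discharge}"
    )
    response = requests.post(VM_WRITE_URL, data=line, timeout=10)
    response.raise_for_status()  # VictoriaMetrics answers 204 No Content on success

write_measurement("P.1", 2.45, 138.2)
```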
## 3. PostgreSQL Deployment (Relational + Time-Series)

### Why PostgreSQL?
- **Mature and reliable**
- **Excellent for complex queries**
- **TimescaleDB extension** for time-series optimization
- **Strong consistency guarantees**
- **Rich ecosystem**

### Docker Deployment with TimescaleDB
```yaml
# docker-compose.yml
version: '3.8'

services:
  postgres:
    image: timescale/timescaledb:latest-pg14
    container_name: water_postgres
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    environment:
      - POSTGRES_DB=water_monitoring
      - POSTGRES_USER=water_user
      - POSTGRES_PASSWORD=secure_password
    restart: unless-stopped

volumes:
  postgres_data:
```

### Database Initialization
```sql
-- init.sql
CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;

-- Create hypertable for time-series optimization
CREATE TABLE water_measurements (
    id BIGSERIAL,
    timestamp TIMESTAMPTZ NOT NULL,
    station_id INT NOT NULL,
    water_level NUMERIC(10,3),
    discharge NUMERIC(10,2),
    discharge_percent NUMERIC(5,2),
    status VARCHAR(20) DEFAULT 'active',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (id, timestamp)  -- hypertables require the time column in unique constraints
);

-- Convert to hypertable (TimescaleDB)
SELECT create_hypertable('water_measurements', 'timestamp', chunk_time_interval => INTERVAL '1 day');

-- Create indexes
CREATE INDEX idx_water_measurements_station_time ON water_measurements (station_id, timestamp DESC);
CREATE INDEX idx_water_measurements_timestamp ON water_measurements (timestamp DESC);

-- Create retention policy (keep raw data for 2 years)
SELECT add_retention_policy('water_measurements', INTERVAL '2 years');

-- Create continuous aggregates for performance
CREATE MATERIALIZED VIEW water_measurements_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', timestamp) AS bucket,
    station_id,
    AVG(water_level) AS avg_water_level,
    MAX(water_level) AS max_water_level,
    MIN(water_level) AS min_water_level,
    AVG(discharge) AS avg_discharge,
    MAX(discharge) AS max_discharge,
    MIN(discharge) AS min_discharge,
    AVG(discharge_percent) AS avg_discharge_percent
FROM water_measurements
GROUP BY bucket, station_id;

-- Refresh policy for continuous aggregates
SELECT add_continuous_aggregate_policy('water_measurements_hourly',
    start_offset => INTERVAL '1 day',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');
```

### Environment Variables
```bash
# .env file
DB_TYPE=postgresql
POSTGRES_CONNECTION_STRING=postgresql://water_user:secure_password@localhost:5432/water_monitoring
```
## 4. MySQL Deployment (Traditional Relational)

### Docker Deployment
```yaml
# docker-compose.yml
version: '3.8'

services:
  mysql:
    image: mysql:8.0
    container_name: water_mysql
    ports:
      - "3306:3306"
    volumes:
      - mysql_data:/var/lib/mysql
      - ./mysql.cnf:/etc/mysql/conf.d/mysql.cnf
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    environment:
      - MYSQL_ROOT_PASSWORD=root_password
      - MYSQL_DATABASE=water_monitoring
      - MYSQL_USER=water_user
      - MYSQL_PASSWORD=water_password
    restart: unless-stopped

volumes:
  mysql_data:
```

### MySQL Configuration
```ini
# mysql.cnf
[mysqld]
# Optimize for time-series data
innodb_buffer_pool_size = 1G
innodb_log_file_size = 256M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT

# Note: the query cache was removed in MySQL 8.0 and partitioning is
# built in, so neither needs (or accepts) configuration here.

# Connection settings
max_connections = 200
connect_timeout = 10
wait_timeout = 600
```

### Environment Variables
```bash
# .env file
DB_TYPE=mysql
MYSQL_CONNECTION_STRING=mysql://water_user:water_password@localhost:3306/water_monitoring
```
## 5. Installation and Dependencies

### Required Python Packages
```bash
# Base requirements
pip install requests schedule

# Database-specific packages
pip install influxdb                    # For InfluxDB
pip install sqlalchemy pymysql          # For MySQL
pip install sqlalchemy psycopg2-binary  # For PostgreSQL
# VictoriaMetrics uses HTTP API (no extra packages needed)
```

### Updated requirements.txt
```txt
requests>=2.28.0
schedule>=1.2.0
pandas>=1.5.0

# Database adapters (install as needed)
influxdb>=5.3.1
sqlalchemy>=1.4.0
pymysql>=1.0.2
psycopg2-binary>=2.9.0
```
## 6. Production Deployment Examples

### Using InfluxDB (Recommended)
```bash
# Set environment variables
export DB_TYPE=influxdb
export INFLUX_HOST=your-influx-server.com
export INFLUX_PORT=8086
export INFLUX_DATABASE=water_monitoring
export INFLUX_USERNAME=water_user
export INFLUX_PASSWORD=your_secure_password

# Run the scraper
python water_scraper_v3.py
```

### Using PostgreSQL with TimescaleDB
```bash
# Set environment variables
export DB_TYPE=postgresql
export POSTGRES_CONNECTION_STRING=postgresql://water_user:password@your-postgres-server.com:5432/water_monitoring

# Run the scraper
python water_scraper_v3.py
```

### Using VictoriaMetrics
```bash
# Set environment variables
export DB_TYPE=victoriametrics
export VM_HOST=your-vm-server.com
export VM_PORT=8428

# Run the scraper
python water_scraper_v3.py
```
## 7. Monitoring and Alerting

### Grafana Dashboard Setup
1. **Add Data Source**: Configure your database as a Grafana data source
2. **Import Dashboard**: Use pre-built water monitoring dashboards
3. **Set Alerts**: Configure alerts for abnormal water levels or discharge rates

### Example Grafana Queries

#### InfluxDB Queries
```sql
-- Current water levels
SELECT last("water_level") FROM "water_data" GROUP BY "station_code"

-- Discharge trends (last 24h)
SELECT mean("discharge") FROM "water_data" WHERE time >= now() - 24h GROUP BY time(1h), "station_code"
```

#### PostgreSQL/TimescaleDB Queries
```sql
-- Current water levels
SELECT DISTINCT ON (station_id)
    station_id, water_level, discharge, timestamp
FROM water_measurements
ORDER BY station_id, timestamp DESC;

-- Hourly averages (last 24h)
SELECT
    time_bucket('1 hour', timestamp) AS hour,
    station_id,
    AVG(water_level) AS avg_level,
    AVG(discharge) AS avg_discharge
FROM water_measurements
WHERE timestamp >= NOW() - INTERVAL '24 hours'
GROUP BY hour, station_id
ORDER BY hour DESC;
```
## 8. Performance Optimization Tips

### For All Databases
- **Batch inserts**: Insert multiple measurements at once (see the sketch after this list)
- **Connection pooling**: Reuse database connections
- **Indexing**: Ensure proper indexes on timestamp and station_id
- **Retention policies**: Automatically delete old data
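
As a rough illustration of the first two tips, here is a minimal SQLAlchemy sketch, assuming the PostgreSQL schema from section 3 (the function name is illustrative):

```python
from sqlalchemy import create_engine, text

# A module-level engine provides connection pooling for free
engine = create_engine(
    "postgresql://water_user:secure_password@localhost:5432/water_monitoring",
    pool_size=5,
    max_overflow=10,
)

def save_measurements(rows: list[dict]) -> None:
    """Insert a whole batch of measurements in one transaction."""
    with engine.begin() as conn:  # commits automatically on success
        conn.execute(
            text(
                "INSERT INTO water_measurements "
                "(timestamp, station_id, water_level, discharge, discharge_percent) "
                "VALUES (:timestamp, :station_id, :water_level, :discharge, :discharge_percent)"
            ),
            rows,  # a list of dicts is executed as one batched statement
        )
```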
### InfluxDB Specific
- Use **tags** for metadata (station codes, names)
- Use **fields** for numeric values (water levels, discharge)
- Configure **retention policies** and **continuous queries**
- Enable **compression** for long-term storage
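
The tag/field split looks like this with the `influxdb` 1.x Python client (the station values shown are illustrative):

```python
from influxdb import InfluxDBClient  # pip install influxdb

client = InfluxDBClient(
    host="localhost", port=8086,
    username="water_user", password="water_password",
    database="water_monitoring",
)

point = {
    "measurement": "water_data",
    # Tags: indexed metadata, cheap to GROUP BY and filter on
    "tags": {"station_code": "P.1", "station_name_en": "Nawarat Bridge"},
    # Fields: the actual numeric values
    "fields": {"water_level": 2.45, "discharge": 138.2, "discharge_percent": 46.1},
    "time": "2025-07-26T01:00:00Z",
}
client.write_points([point])
```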
### PostgreSQL/TimescaleDB Specific
- Use **hypertables** for automatic partitioning
- Create **continuous aggregates** for common queries
- Configure **compression** for older chunks
- Use **parallel queries** for large datasets

### VictoriaMetrics Specific
- Use **labels** efficiently (similar to Prometheus)
- Configure **retention periods** appropriately
- Use **downsampling** for long-term storage
- Enable **deduplication** if needed

This deployment guide provides production-ready configurations for all supported database backends. Choose the one that best fits your infrastructure and requirements.
docs/DEBIAN_TROUBLESHOOTING.md (new file, 329 lines)
@@ -0,0 +1,329 @@
# Debian/Linux Troubleshooting Guide

This guide addresses common issues when running the Thailand Water Monitor on Debian and other Linux distributions.

## Fixed Issues

### SQLAlchemy Connection Error (RESOLVED)

**Error Message:**
```
2025-07-24 19:48:31,920 - ERROR - Failed to connect to SQLITE: 'Connection' object has no attribute 'commit'
2025-07-24 19:48:32,740 - ERROR - Error saving to SQLITE: 'Connection' object has no attribute 'commit'
```

**Root Cause:**
This error was caused by an incompatibility between the database adapter code and the installed SQLAlchemy version: the code called `conn.commit()` on a `Connection` object that does not expose a `commit()` method in that version.

**Solution Applied:**
Changed from the `engine.connect()` to the `engine.begin()` context manager, which handles transactions automatically:

```python
# OLD (problematic) code:
with self.engine.connect() as conn:
    conn.execute(text(sql))
    conn.commit()  # Fails when Connection has no commit()

# NEW (fixed) code:
with self.engine.begin() as conn:
    conn.execute(text(sql))
    # Transaction automatically committed when the context exits
```

**Status:** ✅ **FIXED** - The issue has been resolved in the current version.
## Installation on Debian/Ubuntu

### System Requirements

```bash
# Update package list
sudo apt update

# Install Python and pip
sudo apt install python3 python3-pip python3-venv

# Install system dependencies for database drivers
sudo apt install build-essential python3-dev

# For MySQL support (optional)
sudo apt install default-libmysqlclient-dev

# For PostgreSQL support (optional)
sudo apt install libpq-dev
```

### Python Environment Setup

```bash
# Create virtual environment
python3 -m venv water_monitor_env

# Activate virtual environment
source water_monitor_env/bin/activate

# Install requirements
pip install -r requirements.txt
```

### Running the Monitor

```bash
# Test run
python water_scraper_v3.py --test

# Run with specific database
export DB_TYPE=sqlite
python water_scraper_v3.py

# Run demo
python demo_databases.py
```
## Common Linux Issues

### 1. Permission Errors

**Error:**
```
PermissionError: [Errno 13] Permission denied: 'water_levels.db'
```

**Solution:**
```bash
# Check current directory permissions
ls -la

# Create data directory with proper permissions
mkdir -p data
chmod 755 data

# Set database path to data directory
export WATER_DB_PATH=data/water_levels.db
```

### 2. Missing System Dependencies

**Error:**
```
ImportError: No module named '_sqlite3'
```

**Solution:**
```bash
# Install SQLite development headers
sudo apt install libsqlite3-dev

# Reinstall the distribution Python if needed (its sqlite3 module
# links against libsqlite3)
sudo apt install --reinstall python3
```

### 3. Network/Firewall Issues

**Error:**
```
requests.exceptions.ConnectionError: HTTPSConnectionPool
```

**Solution:**
```bash
# Test network connectivity
curl -I https://hyd-app-db.rid.go.th/hydro1h.html

# Check firewall rules
sudo ufw status

# Allow outbound HTTPS if needed
sudo ufw allow out 443
```
### 4. Systemd Service Setup

Create service file `/etc/systemd/system/water-monitor.service`:

```ini
[Unit]
Description=Thailand Water Level Monitor
After=network.target

[Service]
Type=simple
User=water-monitor
Group=water-monitor
WorkingDirectory=/opt/water_level_monitor
Environment=PATH=/opt/water_level_monitor/venv/bin
Environment=DB_TYPE=sqlite
Environment=WATER_DB_PATH=/opt/water_level_monitor/data/water_levels.db
ExecStart=/opt/water_level_monitor/venv/bin/python water_scraper_v3.py
Restart=always
RestartSec=60

# Security settings
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/opt/water_level_monitor/data
ReadWritePaths=/opt/water_level_monitor/logs

[Install]
WantedBy=multi-user.target
```

Enable and start:
```bash
sudo systemctl daemon-reload
sudo systemctl enable water-monitor.service
sudo systemctl start water-monitor.service
sudo systemctl status water-monitor.service
```

### 5. Log Rotation

Create `/etc/logrotate.d/water-monitor`:

```
/opt/water_level_monitor/water_monitor.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    create 644 water-monitor water-monitor
    postrotate
        systemctl reload water-monitor.service
    endscript
}
```
## Database-Specific Issues

### SQLite

**Issue:** Database locked
```bash
# Check for processes using the database
sudo lsof /path/to/water_levels.db

# Kill processes if needed
sudo pkill -f water_scraper_v3.py
```

### VictoriaMetrics with HTTPS

**Configuration:**
```bash
export DB_TYPE=victoriametrics
export VM_HOST=https://your-vm-server.com
export VM_PORT=443
```

**Test connection:**
```bash
curl -k https://your-vm-server.com/health
```
## Performance Optimization

### 1. System Tuning

```bash
# Increase file descriptor limits
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf

# Optimize network settings
echo "net.core.rmem_max = 16777216" >> /etc/sysctl.conf
echo "net.core.wmem_max = 16777216" >> /etc/sysctl.conf
sysctl -p
```

### 2. Database Optimization

```bash
# For SQLite
export SQLITE_CACHE_SIZE=10000
export SQLITE_SYNCHRONOUS=NORMAL

# Monitor database size
du -h data/water_levels.db
```
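
If you need to apply equivalent settings directly (for example in a one-off script), SQLite exposes them as PRAGMAs. A minimal sketch, assuming the standard `sqlite3` module; the WAL line is an extra suggestion, not one of the environment variables above:

```python
import sqlite3

conn = sqlite3.connect("data/water_levels.db", timeout=30)
conn.execute("PRAGMA cache_size = 10000")    # page cache size, mirrors SQLITE_CACHE_SIZE
conn.execute("PRAGMA synchronous = NORMAL")  # fewer fsyncs, mirrors SQLITE_SYNCHRONOUS
conn.execute("PRAGMA journal_mode = WAL")    # lets readers proceed while writing
```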
## Monitoring and Maintenance

### Health Check Script

Create `health_check.sh`:

```bash
#!/bin/bash
LOG_FILE="/opt/water_level_monitor/water_monitor.log"
SERVICE_NAME="water-monitor"

# Check if service is running
if ! systemctl is-active --quiet $SERVICE_NAME; then
    echo "ERROR: $SERVICE_NAME is not running"
    systemctl restart $SERVICE_NAME
    exit 1
fi

# Check recent log entries
RECENT_ERRORS=$(tail -n 100 $LOG_FILE | grep -c "ERROR")
if [ $RECENT_ERRORS -gt 5 ]; then
    echo "WARNING: $RECENT_ERRORS errors found in recent logs"
    exit 1
fi

echo "OK: Service is healthy"
exit 0
```

### Cron Job for Health Checks

```bash
# Add to crontab
*/5 * * * * /opt/water_level_monitor/health_check.sh >> /var/log/water-monitor-health.log 2>&1
```
## Getting Help

### Debug Information

```bash
# System information
uname -a
python3 --version
pip list | grep -E "(sqlalchemy|requests|influxdb)"

# Service logs
journalctl -u water-monitor.service -f

# Application logs
tail -f water_monitor.log

# Database information
sqlite3 water_levels.db ".schema"
sqlite3 water_levels.db "SELECT COUNT(*) FROM water_measurements;"
```

### Common Commands

```bash
# Restart service
sudo systemctl restart water-monitor.service

# View logs
sudo journalctl -u water-monitor.service --since "1 hour ago"

# Test configuration
python config.py

# Test database connection
python demo_databases.py

# Manual data fetch
python water_scraper_v3.py --test
```

This troubleshooting guide should help resolve most common issues encountered when running the Thailand Water Monitor on Debian and other Linux distributions.
docs/ENHANCED_SCHEDULER_GUIDE.md (new file, 293 lines)
@@ -0,0 +1,293 @@
# Enhanced Scheduler Guide

This guide explains the new 15-minute scheduling system that runs continuously throughout each hour to ensure comprehensive data coverage.

## ✅ **New Scheduling Behavior**

### **15-Minute Schedule Pattern**
- **Timing**: Runs every 15 minutes: 1:00, 1:15, 1:30, 1:45, 2:00, 2:15, 2:30, 2:45, etc.
- **Hourly Full Checks**: At :00 minutes (includes gap filling and data updates)
- **Quarter-Hour Quick Checks**: At :15, :30, :45 minutes (data fetch only)
- **Continuous Coverage**: Ensures no data is missed throughout each hour

### **Operation Types**
- **Full Operations** (at :00): Data fetching + gap filling + data updates
- **Quick Operations** (at :15, :30, :45): Data fetching only, for performance

## 🔧 **Technical Implementation**

### **Scheduler States**
```python
# State tracking variables
self.last_successful_update = None  # Timestamp of last successful data update
self.retry_mode = False             # Whether in quick-check mode (skip gap filling)
self.next_hourly_check = None       # Next scheduled hourly check
```
### **Quarter-Hour Check Process**
```python
def quarter_hour_check(self):
    """15-minute check for new data"""
    current_time = datetime.datetime.now()
    minute = current_time.minute

    # Determine if this is a full hourly check (at :00) or a quarter-hour check
    if minute == 0:
        logging.info("=== HOURLY CHECK (00:00) ===")
        self.retry_mode = False  # Full check with gap filling and updates
    else:
        logging.info(f"=== 15-MINUTE CHECK ({minute:02d}:00) ===")
        self.retry_mode = True   # Skip gap filling and updates on 15-min checks

    new_data_found = self.run_scraping_cycle()

    if new_data_found:
        self.last_successful_update = datetime.datetime.now()
        if minute == 0:
            logging.info("New data found during hourly check")
        else:
            logging.info(f"New data found during 15-minute check at :{minute:02d}")
    else:
        if minute == 0:
            logging.info("No new data found during hourly check")
        else:
            logging.info(f"No new data found during 15-minute check at :{minute:02d}")
```

### **Scheduler Setup**
```python
def start_scheduler(self):
    """Start enhanced scheduler with 15-minute checks"""
    # Schedule checks every 15 minutes (at :00, :15, :30, :45)
    schedule.every().hour.at(":00").do(self.quarter_hour_check)
    schedule.every().hour.at(":15").do(self.quarter_hour_check)
    schedule.every().hour.at(":30").do(self.quarter_hour_check)
    schedule.every().hour.at(":45").do(self.quarter_hour_check)

    while True:
        schedule.run_pending()
        time.sleep(30)  # Check every 30 seconds
```
## 📊 **New Data Detection Logic**

### **Smart Detection Algorithm**
```python
def has_new_data(self) -> bool:
    """Check if there is new data available since last successful update"""
    # Get most recent timestamp from database
    latest_data = self.get_latest_data(limit=1)
    latest_timestamp = latest_data[0].timestamp if latest_data else None
    if latest_timestamp is None:
        return True  # No data yet

    # Check if we should have newer data by now
    now = datetime.datetime.now()
    expected_latest = now.replace(minute=0, second=0, microsecond=0)

    # If current time is past 5 minutes after the hour, we should have data
    if now.minute >= 5:
        if latest_timestamp < expected_latest:
            return True  # New data expected

    # Check if we have data for the previous hour
    previous_hour = expected_latest - datetime.timedelta(hours=1)
    if latest_timestamp < previous_hour:
        return True  # Missing recent data

    return False  # Data is up to date
```

### **Actual Data Verification**
```python
# Compare timestamps before and after scraping
initial_timestamp = get_latest_timestamp_before_scraping()
# ... perform scraping ...
latest_timestamp = get_latest_timestamp_after_scraping()

if initial_timestamp is None or latest_timestamp > initial_timestamp:
    new_data_found = True
    self.last_successful_update = datetime.datetime.now()
```
## 🚀 **Operational Modes**

### **Mode 1: Full Hourly Operation (at :00)**
- **Schedule**: Every hour at :00 minutes (1:00, 2:00, 3:00, etc.)
- **Operations**:
  - ✅ Fetch current data
  - ✅ Fill data gaps (last 7 days)
  - ✅ Update existing data (last 2 days)
- **Purpose**: Comprehensive data collection and maintenance

### **Mode 2: Quick 15-Minute Checks (at :15, :30, :45)**
- **Schedule**: Every 15 minutes at quarter-hour marks
- **Operations**:
  - ✅ Fetch current data only
  - ❌ Skip gap filling (performance optimization)
  - ❌ Skip data updates (performance optimization)
- **Purpose**: Ensure no new data is missed between hourly checks

## 📋 **Logging Output Examples**

### **Successful Hourly Check (at :00)**
```
2025-07-26 01:00:00,123 - INFO - === HOURLY CHECK (00:00) ===
2025-07-26 01:00:00,124 - INFO - Starting scraping cycle...
2025-07-26 01:00:01,456 - INFO - Successfully fetched 384 data points from API
2025-07-26 01:00:02,789 - INFO - New data found: 2025-07-26 01:00:00
2025-07-26 01:00:03,012 - INFO - Filled 5 data gaps
2025-07-26 01:00:04,234 - INFO - Updated 2 existing measurements
2025-07-26 01:00:04,235 - INFO - New data found during hourly check
```

### **15-Minute Quick Check (at :15, :30, :45)**
```
2025-07-26 01:15:00,123 - INFO - === 15-MINUTE CHECK (15:00) ===
2025-07-26 01:15:00,124 - INFO - Starting scraping cycle...
2025-07-26 01:15:01,456 - INFO - Successfully fetched 299 data points from API
2025-07-26 01:15:02,789 - INFO - New data found: 2025-07-26 01:00:00
2025-07-26 01:15:02,790 - INFO - New data found during 15-minute check at :15
```

### **Continuous 15-Minute Pattern**
```
2025-07-26 01:00:00,123 - INFO - === HOURLY CHECK (00:00) ===
2025-07-26 01:00:04,235 - INFO - New data found during hourly check

2025-07-26 01:15:00,123 - INFO - === 15-MINUTE CHECK (15:00) ===
2025-07-26 01:15:02,790 - INFO - No new data found during 15-minute check at :15

2025-07-26 01:30:00,123 - INFO - === 15-MINUTE CHECK (30:00) ===
2025-07-26 01:30:02,790 - INFO - No new data found during 15-minute check at :30

2025-07-26 01:45:00,123 - INFO - === 15-MINUTE CHECK (45:00) ===
2025-07-26 01:45:02,790 - INFO - No new data found during 15-minute check at :45

2025-07-26 02:00:00,123 - INFO - === HOURLY CHECK (00:00) ===
2025-07-26 02:00:04,235 - INFO - New data found during hourly check
```
## ⚙️ **Configuration Options**

### **Environment Variables**
```bash
# Retry interval (default: 5 minutes)
export RETRY_INTERVAL_MINUTES=5

# Data availability buffer (default: 5 minutes after hour)
export DATA_BUFFER_MINUTES=5

# Gap filling days (default: 7 days)
export GAP_FILL_DAYS=7

# Update check days (default: 2 days)
export UPDATE_DAYS=2
```

### **Scheduler Timing**
The retry-oriented timing, used when the system falls back to retry mode, is configured alongside the hourly check:
```python
# Hourly checks at top of hour
schedule.every().hour.at(":00").do(self.hourly_check)

# 5-minute retries (dynamically scheduled)
schedule.every(5).minutes.do(self.retry_check).tag('retry')

# Check every 30 seconds for responsive retry scheduling
time.sleep(30)
```

## 🔍 **Performance Optimizations**

### **Retry Mode Optimizations**
- **Skip Gap Filling**: Avoids expensive historical data fetching during retries
- **Skip Data Updates**: Avoids comparison operations during retries
- **Focused API Calls**: Only fetches current-day data during retries
- **Reduced Database Queries**: Minimal database operations during retries

### **Resource Management**
- **API Rate Limiting**: 1-second delays between API calls
- **Database Connection Pooling**: Efficient connection reuse
- **Memory Efficiency**: Selective data processing
- **Error Recovery**: Automatic retry with exponential backoff
## 🛠️ **Troubleshooting**

### **Common Scenarios**

#### **Stuck in Retry Mode**
```bash
# Check if API is returning data
curl -X POST https://hyd-app-db.rid.go.th/webservice/getGroupHourlyWaterLevelReportAllHL.ashx

# Check database connectivity
python water_scraper_v3.py --check-gaps 1

# Manual data fetch test
python water_scraper_v3.py --test
```

#### **Missing Hourly Triggers**
```bash
# Check system time synchronization
timedatectl status

# Verify scheduler is running
ps aux | grep water_scraper

# Check logs for scheduler activity
tail -f water_monitor.log | grep "HOURLY CHECK"
```

#### **False New Data Detection**
```bash
# Check latest data in database
sqlite3 water_monitoring.db "SELECT MAX(timestamp) FROM water_measurements;"

# Verify timestamp parsing
python -c "
import datetime
print('Current hour:', datetime.datetime.now().replace(minute=0, second=0, microsecond=0))
"
```
## 📈 **Monitoring and Alerts**

### **Key Metrics to Monitor**
- **Hourly Success Rate**: Percentage of hourly checks that find new data
- **Retry Duration**: How long the system stays in retry mode
- **Data Freshness**: Time since the last successful data update (see the sketch after this list)
- **API Response Time**: Performance of data-fetching operations
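
Data freshness is straightforward to compute from the database. A minimal sketch, assuming timestamps are stored as ISO-8601 strings in the SQLite database used elsewhere in this guide:

```python
import datetime
import sqlite3

def data_freshness_minutes(db_path: str = "water_monitoring.db") -> float:
    """Minutes since the newest measurement (the 'Data Freshness' metric)."""
    with sqlite3.connect(db_path) as conn:
        row = conn.execute("SELECT MAX(timestamp) FROM water_measurements").fetchone()
    if row is None or row[0] is None:
        return float("inf")  # no data yet
    latest = datetime.datetime.fromisoformat(row[0])
    return (datetime.datetime.now() - latest).total_seconds() / 60.0
```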
### **Alert Conditions**
- **Extended Retry Mode**: System in retry mode for > 30 minutes
- **No Data for 2+ Hours**: No new data found for an extended period
- **High Error Rate**: Multiple consecutive API failures
- **Database Issues**: Connection or save failures

### **Health Check Script**
```bash
#!/bin/bash
# Check if system is stuck in retry mode
RETRY_COUNT=$(tail -n 100 water_monitor.log | grep -c "RETRY CHECK")
if [ $RETRY_COUNT -gt 6 ]; then
    echo "WARNING: System may be stuck in retry mode ($RETRY_COUNT retries in last 100 log entries)"
fi

# Check data freshness
LATEST_DATA=$(sqlite3 water_monitoring.db "SELECT MAX(timestamp) FROM water_measurements;")
echo "Latest data timestamp: $LATEST_DATA"
```

## 🎯 **Best Practices**

### **Production Deployment**
1. **Monitor Logs**: Watch for retry-mode patterns
2. **Set Alerts**: Configure notifications for extended retry periods
3. **Regular Maintenance**: Weekly gap filling and data validation
4. **Backup Strategy**: Regular database backups before major operations

### **Performance Tuning**
1. **Adjust Buffer Time**: Modify the data availability buffer based on API patterns
2. **Optimize Retry Interval**: Balance between responsiveness and API load
3. **Database Indexing**: Ensure proper indexes for timestamp queries
4. **Connection Pooling**: Configure appropriate database connection limits

This enhanced scheduler ensures reliable, efficient, and intelligent water level monitoring with automatic adaptation to data availability patterns.
docs/ENHANCEMENT_SUMMARY.md (new file, 227 lines)
@@ -0,0 +1,227 @@
# 🚀 Northern Thailand Ping River Monitor - Enhancement Summary

## 🎯 **What We've Accomplished**

We've successfully transformed your water monitoring system from a simple scraper into a **production-ready, enterprise-grade monitoring platform** focused on the Ping River Basin in Northern Thailand, with modern web interfaces, station management capabilities, and comprehensive observability.

## 🌟 **Major New Features Added**

### 1. **FastAPI Web Interface** 🌐
- **Interactive Dashboard** at `http://localhost:8000`
- **REST API** with comprehensive endpoints
- **Station Management** - Add, update, delete monitoring stations
- **Real-time Health Monitoring**
- **Manual Data Collection Triggers**
- **Interactive API Documentation** at `/docs`
- **CORS Support** for web applications

### 2. **Enhanced Architecture** 🏗️
- **Type Safety** with Pydantic models and comprehensive type hints
- **Data Validation Layer** with range checking and error handling
- **Custom Exception Classes** for better error management
- **Modular Design** with separated concerns

### 3. **Observability & Monitoring** 📊
- **Metrics Collection System** (counters, gauges, histograms)
- **Health Checks** for database, API, and system resources
- **Performance Tracking** with response times and success rates
- **Enhanced Logging** with colors, rotation, and performance logs

### 4. **Production Features** 🚀
- **Rate Limiting** to prevent API abuse
- **Request Tracking** with detailed statistics
- **Configuration Validation** on startup
- **Graceful Error Handling** and recovery
- **Background Task Management**
## 📁 **New Files Created**

```
src/
├── models.py              # Data models and type definitions
├── exceptions.py          # Custom exception classes
├── validators.py          # Data validation layer
├── metrics.py             # Metrics collection system
├── health_check.py        # Health monitoring system
├── rate_limiter.py        # Rate limiting and request tracking
├── logging_config.py      # Enhanced logging configuration
├── web_api.py             # FastAPI web interface
├── main.py                # Enhanced CLI with multiple modes
└── __init__.py            # Package initialization

# Root files
├── run.py                 # Simple startup script
├── test_integration.py    # Integration test suite
├── test_api.py            # API endpoint tests
└── ENHANCEMENT_SUMMARY.md # This file
```

## 🔧 **Enhanced Existing Files**

- **`src/water_scraper_v3.py`** - Integrated new features, metrics, validation
- **`src/config.py`** - Added configuration validation
- **`requirements.txt`** - Added FastAPI, Pydantic, and monitoring dependencies
- **`docker-compose.victoriametrics.yml`** - Added web API service
- **`Dockerfile`** - Updated for new startup script
- **`README.md`** - Updated with new features and usage instructions
## 🌐 **Web API Endpoints**

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | GET | Interactive dashboard |
| `/docs` | GET | API documentation |
| `/health` | GET | System health status |
| `/metrics` | GET | Application metrics |
| `/stations` | GET | List all monitoring stations |
| `/measurements/latest` | GET | Latest measurements |
| `/measurements/station/{code}` | GET | Station-specific data |
| `/scrape/trigger` | POST | Trigger manual data collection |
| `/scraping/status` | GET | Scraping status and statistics |
| `/config` | GET | Current configuration (masked) |
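
For a quick smoke test of these endpoints, a small `requests` script works. A sketch assuming the default address used in the examples below (response shapes depend on the Pydantic models in `src/models.py`):

```python
import requests

BASE_URL = "http://localhost:8000"

# System health and application metrics
print(requests.get(f"{BASE_URL}/health", timeout=10).json())
print(requests.get(f"{BASE_URL}/metrics", timeout=10).json())

# Latest measurements across all stations
latest = requests.get(f"{BASE_URL}/measurements/latest", timeout=10).json()

# Kick off a manual data collection run
requests.post(f"{BASE_URL}/scrape/trigger", timeout=30).raise_for_status()
```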
## 🚀 **Usage Examples**

### **Traditional Mode (Enhanced)**
```bash
# Test single cycle
python run.py --test

# Continuous monitoring
python run.py

# Fill data gaps
python run.py --fill-gaps 7

# Show system status
python run.py --status
```

### **Web API Mode (NEW!)**
```bash
# Start web API server
python run.py --web-api

# Access dashboard
open http://localhost:8000

# View API documentation
open http://localhost:8000/docs
```

### **Docker Deployment**
```bash
# Start complete stack
docker-compose -f docker-compose.victoriametrics.yml up -d

# Services available:
# - Water API: http://localhost:8000
# - Grafana: http://localhost:3000
# - VictoriaMetrics: http://localhost:8428
```
## 📊 **Monitoring & Observability**

### **Built-in Metrics**
- API request counts and response times
- Database connection status and save operations
- Scraping cycle success/failure rates
- System resource usage (memory, etc.)

### **Health Checks**
- Database connectivity and data freshness
- External API availability
- Memory usage monitoring
- Overall system health status

### **Enhanced Logging**
- Colored console output for better readability
- File rotation to prevent disk space issues
- Performance logging for optimization
- Structured logging with proper levels

## 🔒 **Production Ready Features**

### **Security & Reliability**
- Rate limiting to prevent API abuse
- Input validation and sanitization
- Graceful error handling and recovery
- Configuration validation on startup

### **Performance**
- Efficient metrics collection with minimal overhead
- Background task management
- Connection pooling and resource management
- Optimized database operations

### **Scalability**
- Modular architecture for easy extension
- Async support for high concurrency
- Configurable resource limits
- Health checks for load balancer integration

## 🧪 **Testing**

### **Integration Tests**
```bash
# Run all integration tests
python test_integration.py
```

### **API Tests**
```bash
# Test API endpoints (server must be running)
python test_api.py
```
## 📈 **Performance Improvements**

1. **Request Tracking** - Monitor API performance and success rates
2. **Rate Limiting** - Prevent API abuse and ensure stability
3. **Data Validation** - Catch errors early and improve data quality
4. **Metrics Collection** - Identify bottlenecks and optimization opportunities
5. **Health Monitoring** - Proactive issue detection and alerting

## 🎉 **Benefits Achieved**

### **For Developers**
- **Better Developer Experience** with type hints and validation
- **Easier Debugging** with enhanced logging and error messages
- **Comprehensive Testing** with integration and API tests
- **Modern Architecture** following best practices

### **For Operations**
- **Web Dashboard** for easy monitoring and management
- **Health Checks** for automated monitoring integration
- **Metrics Collection** for performance analysis
- **Production-Ready** deployment with Docker support

### **For Users**
- **REST API** for integration with other systems
- **Real-time Data Access** via web interface
- **Manual Controls** for triggering data collection
- **Status Monitoring** for system visibility

## 🔮 **Future Enhancement Opportunities**

1. **Authentication & Authorization** - Add user management and API keys
2. **Real-time WebSocket Updates** - Live data streaming to web clients
3. **Advanced Analytics** - Trend analysis and forecasting
4. **Alert System** - Email/SMS notifications for critical conditions
5. **Multi-tenant Support** - Support for multiple organizations
6. **Data Export** - CSV, Excel, and other format exports
7. **Mobile App** - React Native or Flutter mobile interface

## 🏆 **Summary**

Your Thailand Water Monitor has been transformed from a simple data scraper into a **comprehensive, enterprise-grade monitoring platform** that includes:

- ✅ **Modern Web Interface** with FastAPI
- ✅ **Production-Ready Architecture** with proper error handling
- ✅ **Comprehensive Monitoring** with metrics and health checks
- ✅ **Type Safety** and data validation
- ✅ **Enhanced Logging** and observability
- ✅ **Docker Support** for easy deployment
- ✅ **Extensive Testing** for reliability

The system is now ready for production deployment and can serve as a foundation for further enhancements and integrations!
docs/GAP_FILLING_GUIDE.md (new file, 275 lines)
@@ -0,0 +1,275 @@
# Gap Filling and Data Integrity Guide

This guide explains the enhanced gap-filling functionality that addresses data gaps and missing timestamps in the Thailand Water Monitor.

## ✅ **Issues Resolved**

### **1. Data Gaps Problem**
- **Before**: The tool only fetched current-day data, leaving gaps in historical records
- **After**: Automatically detects and fills missing timestamps for the last 7 days

### **2. Missing Midnight Timestamps**
- **Before**: Jump from 23:00 to 01:00 (missing 00:00 midnight data)
- **After**: Specifically checks for and fills midnight-hour gaps

### **3. Changed Values**
- **Before**: No mechanism to update existing data if values changed on the server
- **After**: Compares existing data with fresh API data and updates changed values

## 🔧 **New Features**

### **Command Line Interface**
```bash
# Check for missing data gaps
python water_scraper_v3.py --check-gaps [days]

# Fill missing data gaps
python water_scraper_v3.py --fill-gaps [days]

# Update existing data with latest values
python water_scraper_v3.py --update-data [days]

# Run single test cycle
python water_scraper_v3.py --test

# Show help
python water_scraper_v3.py --help
```
### **Automatic Gap Detection**
The system now automatically (a condensed sketch of this logic follows the list):
- Generates expected hourly timestamps for the specified time range
- Compares them with existing database records
- Identifies missing timestamps
- Groups missing data by date for efficient API calls
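
The sketch below condenses that detection logic; the helper name and the shape of `existing` are illustrative, and the real implementation lives in `water_scraper_v3.py`:

```python
import datetime

def find_missing_hours(existing: set, days_back: int = 7) -> list:
    """Return expected hourly timestamps that have no database record.

    `existing` holds the timestamps already in the database, truncated
    to the hour.
    """
    now = datetime.datetime.now().replace(minute=0, second=0, microsecond=0)
    expected = [now - datetime.timedelta(hours=h) for h in range(days_back * 24)]
    return sorted(ts for ts in expected if ts not in existing)
```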
### **Intelligent Gap Filling**
- **Historical Data Fetching**: Retrieves data for specific dates to fill gaps
- **Selective Insertion**: Only inserts data for actually missing timestamps
- **API Rate Limiting**: Includes delays between API calls to be respectful
- **Error Handling**: Continues processing even if some dates fail

### **Data Update Mechanism**
- **Change Detection**: Compares water levels, discharge rates, and percentages
- **Precision Checking**: Uses appropriate thresholds (0.001 m for water level, 0.1 cms for discharge)
- **Selective Updates**: Only updates records where values have actually changed (see the comparison sketch below)
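
A sketch of that comparison, using the documented thresholds (the record attribute names and the percentage threshold are assumptions):

```python
def needs_update(old, new) -> bool:
    """True when freshly fetched values differ beyond measurement precision."""
    return (
        abs(old.water_level - new.water_level) >= 0.001        # 0.001 m
        or abs(old.discharge - new.discharge) >= 0.1           # 0.1 cms
        or abs(old.discharge_percent - new.discharge_percent) >= 0.01
    )
```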
## 📊 **Test Results**

### **Before Enhancement**
```
Found 22 missing timestamps in the last 2 days:
  2025-07-23: Missing hours [9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
  2025-07-24: Missing hours [0, 20, 21, 22, 23]
  2025-07-25: Missing hours [0, 9]
```

### **After Gap Filling**
```
Gap filling completed. Filled 96 missing data points

Remaining gaps:
  2025-07-24: Missing hours [10]
  2025-07-25: Missing hours [0, 10]
```

**Improvement**: Reduced from 22 missing timestamps to 3 (86% improvement)

## 🚀 **Enhanced Scraping Cycle**

The regular scraping cycle now includes three phases:

### **Phase 1: Current Data Collection**
```python
# Fetch and save current data
water_data = self.fetch_water_data()
success = self.save_to_database(water_data)
```

### **Phase 2: Gap Filling (Last 7 Days)**
```python
# Check for and fill missing data
filled_count = self.fill_data_gaps(days_back=7)
```

### **Phase 3: Data Updates (Last 2 Days)**
```python
# Update existing data with latest values
updated_count = self.update_existing_data(days_back=2)
```
## 🔧 **Technical Improvements**

### **Database Connection Handling**
- **SQLite Optimization**: Added timeout and thread-safety parameters
- **Retry Logic**: Exponential backoff for database lock errors
- **Transaction Management**: Proper use of `engine.begin()` for automatic commits

### **Error Recovery**
```python
# Retry logic with exponential backoff
for attempt in range(max_retries):
    try:
        success = self.db_adapter.save_measurements(water_data)
        if success:
            return True
    except Exception as e:
        if "database is locked" in str(e).lower():
            time.sleep(2 ** attempt)  # 1s, 2s, 4s delays
            continue
        raise  # other errors should not be silently retried
```

### **Memory Efficiency**
- **Selective Data Processing**: Only processes data for missing timestamps
- **Batch Processing**: Groups operations by date to minimize API calls
- **Resource Management**: Proper cleanup and connection handling
## 📋 **Usage Examples**

### **Daily Maintenance**
```bash
# Check for gaps in the last week
python water_scraper_v3.py --check-gaps 7

# Fill any found gaps
python water_scraper_v3.py --fill-gaps 7

# Update recent data for accuracy
python water_scraper_v3.py --update-data 2
```

### **Historical Data Recovery**
```bash
# Check for gaps in the last month
python water_scraper_v3.py --check-gaps 30

# Fill gaps for the last month (be patient, this takes time)
python water_scraper_v3.py --fill-gaps 30
```

### **Production Monitoring**
```bash
# Quick test to ensure system is working
python water_scraper_v3.py --test

# Check for recent gaps
python water_scraper_v3.py --check-gaps 1
```

## 🔍 **Monitoring and Alerts**

### **Gap Detection Output**
```
Found 22 missing timestamps:
  2025-07-23: Missing hours [9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
  2025-07-24: Missing hours [0, 20, 21, 22, 23]
  2025-07-25: Missing hours [0, 9]
```

### **Gap Filling Progress**
```
Fetching data for 2025-07-24 to fill 5 missing timestamps
Successfully fetched 368 data points from API for 2025-07-24
Filled 80 data points for 2025-07-24
Gap filling completed. Filled 96 missing data points
```

### **Update Detection**
```
Checking for updates on 2025-07-24
Update needed for P.1 at 2025-07-24 15:00:00
Updated 5 measurements for 2025-07-24
Data update completed. Updated 5 measurements
```
## ⚙️ **Configuration Options**

### **Environment Variables**
```bash
# Database configuration
export DB_TYPE=sqlite
export WATER_DB_PATH=water_monitoring.db

# Gap filling settings (can be added to config.py)
export GAP_FILL_DAYS=7   # Days to check for gaps
export UPDATE_DAYS=2     # Days to check for updates
export API_DELAY=1       # Seconds between API calls
export MAX_RETRIES=3     # Database retry attempts
```

### **Customizable Parameters**
- **Gap Check Period**: Default 7 days, configurable via command line
- **Update Period**: Default 2 days, configurable via command line
- **API Rate Limiting**: 1-second delay between calls (configurable)
- **Retry Logic**: 3 attempts with exponential backoff (configurable)

## 🛠️ **Troubleshooting**

### **Common Issues**

#### **Database Locked Errors**
```
ERROR - Error saving to SQLITE: database is locked
```
**Solution**: The retry logic now handles this automatically with exponential backoff.

#### **API Rate Limiting**
```
WARNING - Too many requests to API
```
**Solution**: Increase the delay between API calls or reduce the number of days processed at once.

#### **Missing Data Still Present**
```
Found X missing timestamps after gap filling
```
**Possible Causes**:
- Data not available on the Thai government server for those timestamps
- Network issues during API calls
- API returned empty data for those specific times

### **Debug Commands**
```bash
# Enable debug logging
export LOG_LEVEL=DEBUG
python water_scraper_v3.py --check-gaps 1

# Test specific date range
python water_scraper_v3.py --fill-gaps 1

# Check database directly
sqlite3 water_monitoring.db "SELECT COUNT(*) FROM water_measurements;"
sqlite3 water_monitoring.db "SELECT timestamp, COUNT(*) FROM water_measurements GROUP BY timestamp ORDER BY timestamp DESC LIMIT 10;"
```

## 📈 **Performance Metrics**

### **Gap Filling Efficiency**
- **API Calls**: Grouped by date to minimize requests
- **Processing Speed**: ~100-400 data points per API call
- **Success Rate**: 86% gap reduction in test case
- **Resource Usage**: Minimal memory footprint with selective processing

### **Database Performance**
- **SQLite Optimization**: Connection pooling and timeout handling
- **Transaction Efficiency**: Batch inserts with proper transaction management
- **Retry Success**: Automatic recovery from temporary lock conditions

## 🎯 **Best Practices**

### **Regular Maintenance**
1. **Daily**: Run `--check-gaps 1` to monitor recent data quality
2. **Weekly**: Run `--fill-gaps 7` to catch any missed data
3. **Monthly**: Run `--update-data 7` to ensure data accuracy

### **Production Deployment**
1. **Automated Scheduling**: Use cron or systemd timers for regular gap checks (see the example after this list)
2. **Monitoring**: Set up alerts for excessive missing data
3. **Backup**: Regular database backups before major gap-filling operations
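
For example, the maintenance cadence above can be wired into cron along these lines (the install path is illustrative):

```bash
# m h dom mon dow  command
0 6 * * *  cd /opt/water-monitor && python water_scraper_v3.py --check-gaps 1
0 7 * * 0  cd /opt/water-monitor && python water_scraper_v3.py --fill-gaps 7
0 8 1 * *  cd /opt/water-monitor && python water_scraper_v3.py --update-data 7
```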

### **Data Quality Assurance**
1. **Validation**: Check for reasonable value ranges after gap filling
2. **Comparison**: Compare filled data with nearby timestamps for consistency
3. **Documentation**: Log all gap-filling activities for audit trails

This enhanced gap-filling system ensures comprehensive and accurate water level monitoring with minimal data loss and automatic recovery capabilities.
475
docs/GEOLOCATION_GUIDE.md
Normal file
@@ -0,0 +1,475 @@
# Geolocation Support for Grafana Geomap

This guide explains the geolocation functionality added to the Thailand Water Monitor for use with Grafana's geomap visualization.

## ✅ **Implemented Features**

### **Database Schema Updates**
All database adapters now support geolocation fields:
- **latitude**: Decimal latitude coordinates (DECIMAL(10,8) for SQL, REAL for SQLite)
- **longitude**: Decimal longitude coordinates (DECIMAL(11,8) for SQL, REAL for SQLite)
- **geohash**: Geohash string for efficient spatial indexing (VARCHAR(20)/TEXT)

### **Station Data Enhancement**
Station mapping now includes geolocation fields:
```python
'8': {
    'code': 'P.1',
    'thai_name': 'สะพานนวรัฐ',
    'english_name': 'Nawarat Bridge',
    'latitude': 15.6944,         # Decimal degrees
    'longitude': 100.2028,       # Decimal degrees
    'geohash': 'w5q6uuhvfcfp25'  # Geohash for P.1
}
```

## 🗄️ **Database Schema**

### **Updated Stations Table**
```sql
CREATE TABLE stations (
    id INTEGER PRIMARY KEY,
    station_code TEXT UNIQUE NOT NULL,
    thai_name TEXT NOT NULL,
    english_name TEXT NOT NULL,
    latitude REAL,    -- NEW: Latitude coordinate
    longitude REAL,   -- NEW: Longitude coordinate
    geohash TEXT,     -- NEW: Geohash for spatial indexing
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```

### **Database Support**
- ✅ **SQLite**: REAL columns for coordinates, TEXT for geohash
- ✅ **PostgreSQL**: DECIMAL(10,8) and DECIMAL(11,8) for coordinates, VARCHAR(20) for geohash
- ✅ **MySQL**: DECIMAL(10,8) and DECIMAL(11,8) for coordinates, VARCHAR(20) for geohash
- ✅ **VictoriaMetrics**: Geolocation data included in metric labels
## 📊 **Current Station Data**

### **P.1 - Nawarat Bridge (Sample)**
- **Station Code**: P.1
- **Thai Name**: สะพานนวรัฐ
- **English Name**: Nawarat Bridge
- **Latitude**: 15.6944
- **Longitude**: 100.2028
- **Geohash**: w5q6uuhvfcfp25

### **Remaining Stations**
The following stations are ready for geolocation data when coordinates become available:
- P.20 - บ้านเชียงดาว (Ban Chiang Dao)
- P.75 - บ้านช่อแล (Ban Chai Lat)
- P.92 - บ้านเมืองกึ๊ด (Ban Muang Aut)
- P.4A - บ้านแม่แตง (Ban Mae Taeng)
- P.67 - บ้านแม่แต (Ban Tae)
- P.21 - บ้านริมใต้ (Ban Rim Tai)
- P.103 - สะพานวงแหวนรอบ 3 (Ring Bridge 3)
- P.82 - บ้านสบวิน (Ban Sob win)
- P.84 - บ้านพันตน (Ban Panton)
- P.81 - บ้านโป่ง (Ban Pong)
- P.5 - สะพานท่านาง (Tha Nang Bridge)
- P.77 - บ้านสบแม่สะป๊วด (Baan Sop Mae Sapuord)
- P.87 - บ้านป่าซาง (Ban Pa Sang)
- P.76 - บ้านแม่อีไฮ (Ban Mae I Hai)
- P.85 - บ้านหล่ายแก้ว (Baan Lai Kaew)
## 🗺️ **Grafana Geomap Integration**

### **Data Source Configuration**
The geolocation data is automatically included in all database queries and can be used directly in Grafana:

#### **SQLite/PostgreSQL/MySQL Query Example**
```sql
SELECT
    m.timestamp,
    s.station_code,
    s.english_name,
    s.thai_name,
    s.latitude,
    s.longitude,
    s.geohash,
    m.water_level,
    m.discharge,
    m.discharge_percent
FROM water_measurements m
JOIN stations s ON m.station_id = s.id
WHERE s.latitude IS NOT NULL
  AND s.longitude IS NOT NULL
ORDER BY m.timestamp DESC
```

#### **VictoriaMetrics Query Example**
```promql
water_level{latitude!="",longitude!=""}
```

### **Geomap Panel Configuration**

#### **1. Create Geomap Panel**
1. Add new panel in Grafana
2. Select "Geomap" visualization
3. Configure data source (SQLite/PostgreSQL/MySQL/VictoriaMetrics)

#### **2. Configure Location Fields**
- **Latitude Field**: `latitude`
- **Longitude Field**: `longitude`
- **Alternative**: Use `geohash` field for geohash-based positioning

#### **3. Configure Display Options**
- **Station Labels**: Use `station_code` or `english_name`
- **Tooltip Information**: Include `thai_name`, `water_level`, `discharge`
- **Color Mapping**: Map to `water_level` or `discharge_percent`

#### **4. Sample Geomap Configuration**
```json
{
  "type": "geomap",
  "title": "Thailand Water Stations",
  "targets": [
    {
      "rawSql": "SELECT latitude, longitude, station_code, english_name, water_level, discharge_percent FROM stations s JOIN water_measurements m ON s.id = m.station_id WHERE s.latitude IS NOT NULL AND m.timestamp = (SELECT MAX(timestamp) FROM water_measurements WHERE station_id = s.id)",
      "format": "table"
    }
  ],
  "fieldConfig": {
    "defaults": {
      "custom": {
        "hideFrom": {
          "legend": false,
          "tooltip": false,
          "vis": false
        }
      },
      "mappings": [],
      "color": {
        "mode": "continuous-GrYlRd",
        "field": "water_level"
      }
    }
  },
  "options": {
    "view": {
      "id": "coords",
      "lat": 15.6944,
      "lon": 100.2028,
      "zoom": 8
    },
    "controls": {
      "mouseWheelZoom": true,
      "showZoom": true,
      "showAttribution": true
    },
    "layers": [
      {
        "type": "markers",
        "config": {
          "size": {
            "field": "discharge_percent",
            "min": 5,
            "max": 20
          },
          "color": {
            "field": "water_level"
          },
          "showLegend": true
        }
      }
    ]
  }
}
```

## 🔧 **Adding New Station Coordinates**

### **Method 1: Update Station Mapping**
Edit `water_scraper_v3.py` and add coordinates to the station mapping:
```python
'1': {
    'code': 'P.20',
    'thai_name': 'บ้านเชียงดาว',
    'english_name': 'Ban Chiang Dao',
    'latitude': 19.3056,   # Add actual coordinates
    'longitude': 98.9264,  # Add actual coordinates
    'geohash': 'w4r6...'   # Add actual geohash
}
```

### **Method 2: Direct Database Update**
```sql
UPDATE stations
SET latitude = 19.3056, longitude = 98.9264, geohash = 'w4r6uuhvfcfp25'
WHERE station_code = 'P.20';
```

### **Method 3: Bulk Update Script**
```python
import sqlite3

coordinates = {
    'P.20': {'lat': 19.3056, 'lon': 98.9264, 'geohash': 'w4r6uuhvfcfp25'},
    'P.75': {'lat': 18.7756, 'lon': 99.1234, 'geohash': 'w4r5uuhvfcfp25'},
    # Add more stations...
}

conn = sqlite3.connect('water_monitoring.db')
cursor = conn.cursor()

for station_code, coords in coordinates.items():
    cursor.execute("""
        UPDATE stations
        SET latitude = ?, longitude = ?, geohash = ?
        WHERE station_code = ?
    """, (coords['lat'], coords['lon'], coords['geohash'], station_code))

conn.commit()
conn.close()
```
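
After any of these methods, it is worth confirming that the rows were actually updated (SQLite syntax shown; adapt for other backends):

```sql
SELECT station_code, latitude, longitude, geohash
FROM stations
WHERE latitude IS NOT NULL;
```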

## 🌐 **Geohash Information**

### **What is Geohash?**
Geohash is a geocoding system that represents geographic coordinates as a short alphanumeric string. It provides:
- **Spatial Indexing**: Efficient spatial queries
- **Proximity**: Similar geohashes indicate nearby locations
- **Hierarchical**: Longer geohashes provide more precision

### **Geohash Precision Levels**
- **5 characters**: ~2.4km precision
- **6 characters**: ~610m precision
- **7 characters**: ~76m precision
- **8 characters**: ~19m precision
- **9+ characters**: <5m precision

### **Example: P.1 Geohash**
- **Geohash**: `w5q6uuhvfcfp25`
- **Length**: 14 characters
- **Precision**: Sub-meter accuracy
- **Location**: Nawarat Bridge, Thailand
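
If you need to generate geohashes for new stations yourself, a small helper like this works (this sketch assumes the third-party `pygeohash` package, which is not currently a project dependency):

```python
# pip install pygeohash
import pygeohash

# Encode decimal-degree coordinates (latitude, longitude) to a geohash string
geohash = pygeohash.encode(19.3056, 98.9264, precision=12)
print(geohash)  # 12 characters gives roughly centimeter-scale precision
```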

## 📈 **Grafana Visualization Examples**

### **1. Station Location Map**
- **Type**: Geomap with markers
- **Data**: Current station locations
- **Color**: Water level or discharge percentage
- **Size**: Discharge volume

### **2. Regional Water Levels**
- **Type**: Geomap with heatmap
- **Data**: Water level data across regions
- **Visualization**: Color-coded intensity map
- **Filters**: Time range, station groups

### **3. Alert Zones**
- **Type**: Geomap with threshold markers
- **Data**: Stations exceeding alert thresholds
- **Visualization**: Red markers for high water levels
- **Alerts**: Automated notifications for critical levels

## 🔄 **Updating a Running System**

### **Automated Migration Script**
Use the provided migration script to safely add geolocation columns to your existing database:

```bash
# Stop the water monitoring service first
sudo systemctl stop water-monitor

# Run the migration script
python migrate_geolocation.py

# Restart the service
sudo systemctl start water-monitor
```

### **Migration Script Features**
- ✅ **Auto-detects database type** from environment variables
- ✅ **Checks existing columns** to avoid conflicts
- ✅ **Supports all database types** (SQLite, PostgreSQL, MySQL)
- ✅ **Adds sample data** for P.1 station
- ✅ **Safe operation** - won't break existing data
### **Step-by-Step Migration Process**

#### **1. Stop the Application**
```bash
# If running as systemd service
sudo systemctl stop water-monitor

# If running in screen/tmux
# Use Ctrl+C to stop the process

# If running as Docker container
docker stop water-monitor
```

#### **2. Backup Your Database**
```bash
# SQLite backup
cp water_monitoring.db water_monitoring.db.backup

# PostgreSQL backup
pg_dump water_monitoring > water_monitoring_backup.sql

# MySQL backup
mysqldump water_monitoring > water_monitoring_backup.sql
```

#### **3. Run Migration Script**
```bash
# Default (uses environment variables)
python migrate_geolocation.py

# Or specify database path for SQLite
SQLITE_DB_PATH=/path/to/water_monitoring.db python migrate_geolocation.py
```

#### **4. Verify Migration**
```bash
# Check SQLite schema
sqlite3 water_monitoring.db ".schema stations"

# Check PostgreSQL schema
psql -d water_monitoring -c "\d stations"

# Check MySQL schema
mysql -e "DESCRIBE water_monitoring.stations"
```

#### **5. Update Application Code**
Ensure you have the latest version of the application with geolocation support:
```bash
# Pull latest code
git pull origin main

# Install any new dependencies
pip install -r requirements.txt
```

#### **6. Restart Application**
```bash
# Systemd service
sudo systemctl start water-monitor

# Docker container
docker start water-monitor

# Manual execution
python water_scraper_v3.py
```

### **Migration Output Example**
```
2025-07-28 17:30:00,123 - INFO - Starting geolocation column migration...
2025-07-28 17:30:00,124 - INFO - Detected database type: SQLITE
2025-07-28 17:30:00,125 - INFO - Migrating SQLite database: water_monitoring.db
2025-07-28 17:30:00,126 - INFO - Current columns in stations table: ['id', 'station_code', 'thai_name', 'english_name', 'created_at', 'updated_at']
2025-07-28 17:30:00,127 - INFO - Added latitude column
2025-07-28 17:30:00,128 - INFO - Added longitude column
2025-07-28 17:30:00,129 - INFO - Added geohash column
2025-07-28 17:30:00,130 - INFO - Successfully added columns: latitude, longitude, geohash
2025-07-28 17:30:00,131 - INFO - Updated P.1 station with sample geolocation data
2025-07-28 17:30:00,132 - INFO - P.1 station geolocation: ('P.1', 15.6944, 100.2028, 'w5q6uuhvfcfp25')
2025-07-28 17:30:00,133 - INFO - ✅ Migration completed successfully!
2025-07-28 17:30:00,134 - INFO - You can now restart your water monitoring application
2025-07-28 17:30:00,135 - INFO - The system will automatically use the new geolocation columns
```

## 🔍 **Troubleshooting**

### **Migration Issues**

#### **Database Locked Error**
```bash
# Stop all processes using the database
sudo systemctl stop water-monitor
pkill -f water_scraper

# Wait a few seconds, then run migration
sleep 5
python migrate_geolocation.py
```

#### **Permission Denied**
```bash
# Check database file permissions
ls -la water_monitoring.db

# Fix permissions if needed
sudo chown $USER:$USER water_monitoring.db
chmod 664 water_monitoring.db
```

#### **Missing Dependencies**
```bash
# For PostgreSQL
pip install psycopg2-binary

# For MySQL
pip install pymysql

# For all databases
pip install -r requirements.txt
```

### **Verification Issues**

#### **Missing Coordinates**
If stations don't appear on the geomap:
1. Check if latitude/longitude are NULL in the database
2. Verify geolocation data in the station mapping
3. Ensure the database schema includes geolocation columns
4. Run the migration script if columns are missing

#### **Incorrect Positioning**
If stations appear in wrong locations:
1. Verify coordinate format (decimal degrees)
2. Check latitude/longitude order (lat first, lon second)
3. Validate geohash accuracy
### **Rollback Procedure**
If migration causes issues:

#### **SQLite Rollback**
```bash
# Stop application
sudo systemctl stop water-monitor

# Restore backup
cp water_monitoring.db.backup water_monitoring.db

# Restart with old version
sudo systemctl start water-monitor
```

#### **PostgreSQL Rollback**
```sql
-- Remove added columns
ALTER TABLE stations DROP COLUMN IF EXISTS latitude;
ALTER TABLE stations DROP COLUMN IF EXISTS longitude;
ALTER TABLE stations DROP COLUMN IF EXISTS geohash;
```

#### **MySQL Rollback**
```sql
-- Remove added columns
ALTER TABLE stations DROP COLUMN latitude;
ALTER TABLE stations DROP COLUMN longitude;
ALTER TABLE stations DROP COLUMN geohash;
```

## 🎯 **Next Steps**

### **Immediate Actions**
1. **Gather Coordinates**: Collect GPS coordinates for all 16 stations
2. **Update Database**: Add coordinates to remaining stations
3. **Create Dashboards**: Build Grafana geomap visualizations

### **Future Enhancements**
1. **Automatic Geocoding**: API integration for address-to-coordinate conversion
2. **Mobile GPS**: Mobile app for field coordinate collection
3. **Satellite Integration**: Satellite imagery overlay in Grafana
4. **Geofencing**: Alert zones based on geographic boundaries

The geolocation functionality is now fully implemented and ready for use with Grafana's geomap visualization. Station P.1 (Nawarat Bridge) serves as a working example with complete coordinate data.
295
docs/GITEA_WORKFLOWS.md
Normal file
@@ -0,0 +1,295 @@
# 🔄 Gitea Actions Workflows - Northern Thailand Ping River Monitor

## 📋 Overview

This document describes the Gitea Actions workflows configured for the Northern Thailand Ping River Monitor project. These workflows provide comprehensive CI/CD, security scanning, and documentation generation.

## 🚀 Available Workflows

### 1. **CI/CD Pipeline** (`.gitea/workflows/ci.yml`)

**Triggers:**
- Push to `main` or `develop` branches
- Pull requests to `main`
- Daily scheduled runs at 2 AM UTC

**Jobs:**
- **Test Suite**: Multi-version Python testing (3.9-3.12)
- **Code Quality**: Linting, formatting, and type checking
- **Build**: Docker image creation and testing
- **Integration Test**: Testing with VictoriaMetrics service
- **Deploy Staging**: Automatic deployment to staging (develop branch)
- **Deploy Production**: Manual deployment to production (main branch)
- **Performance Test**: Load testing after production deployment

**Key Features:**
- ✅ Multi-Python version testing
- ✅ Docker multi-architecture builds (amd64, arm64)
- ✅ Service integration testing
- ✅ Automatic staging deployment
- ✅ Manual production approval
- ✅ Performance validation
### 2. **Security & Dependency Updates** (`.gitea/workflows/security.yml`)

**Triggers:**
- Daily scheduled runs at 3 AM UTC
- Manual dispatch
- Changes to requirements files or Dockerfile

**Jobs:**
- **Dependency Scan**: Safety, Bandit, Semgrep security scans
- **Docker Security**: Trivy vulnerability scanning
- **License Check**: License compliance verification
- **Dependency Updates**: Automated update detection
- **Code Quality**: Complexity and maintainability analysis

**Key Features:**
- 🔒 Daily security scans
- 📦 Dependency vulnerability detection
- 📄 License compliance checking
- 🔄 Automated update notifications
- 📊 Code quality metrics

### 3. **Release Workflow** (`.gitea/workflows/release.yml`)

**Triggers:**
- Git tags matching `v*.*.*` pattern
- Manual dispatch with version input

**Jobs:**
- **Create Release**: Automated release creation with changelog
- **Test Release**: Comprehensive testing across Python versions
- **Build Release**: Multi-architecture Docker images with proper tags
- **Security Scan**: Trivy security scanning of release images
- **Deploy Release**: Production deployment with health checks
- **Validate Release**: Post-deployment validation and testing

**Key Features:**
- 🏷️ Automated release creation
- 📝 Changelog generation
- 🐳 Multi-architecture Docker builds
- 🔒 Security scanning
- ✅ Comprehensive validation
### 4. **Documentation** (`.gitea/workflows/docs.yml`)

**Triggers:**
- Changes to documentation files
- Changes to Python source files
- Manual dispatch

**Jobs:**
- **Validate Docs**: Link checking and structure validation
- **Generate API Docs**: OpenAPI specification generation
- **Build Sphinx Docs**: Comprehensive API documentation
- **Documentation Summary**: Build status and artifact summary

**Key Features:**
- 📚 Automated API documentation
- 🔗 Link validation
- 📖 Sphinx documentation generation
- ✅ Documentation completeness checking
## 🔧 Workflow Configuration

### **Required Secrets**

Configure these secrets in your Gitea repository settings:

```bash
GITEA_TOKEN              # Gitea access token for container registry
SLACK_WEBHOOK_URL        # Optional: Slack notifications
STAGING_WEBHOOK_URL      # Optional: Staging deployment webhook
PRODUCTION_WEBHOOK_URL   # Optional: Production deployment webhook
```

### **Environment Variables**

Key environment variables used across workflows:

```yaml
PYTHON_VERSION: '3.11'   # Default Python version
REGISTRY: git.b4l.co.th  # Container registry
IMAGE_NAME: grabowski/northern-thailand-ping-river-monitor
```

## 📊 Workflow Status

### **CI/CD Pipeline Status**
- **Test Coverage**: Multi-version Python testing
- **Code Quality**: Automated linting and formatting
- **Security**: Integrated security scanning
- **Deployment**: Automated staging, manual production

### **Security Monitoring**
- **Daily Scans**: Automated vulnerability detection
- **Dependency Updates**: Proactive update notifications
- **License Compliance**: Automated license checking
- **Code Quality**: Continuous quality monitoring

### **Release Management**
- **Automated Releases**: Tag-based release creation
- **Multi-Architecture**: Support for amd64 and arm64
- **Security Validation**: Pre-deployment security checks
- **Health Monitoring**: Post-deployment validation
## 🚀 Usage Examples

### **Triggering Workflows**

**Manual CI/CD Run:**
```bash
# Push to trigger CI/CD
git push origin main

# Create pull request to trigger testing
git checkout -b feature/new-feature
git push origin feature/new-feature
# Create PR in Gitea UI
```

**Manual Security Scan:**
```bash
# Trigger via Gitea Actions UI
# Go to Actions → Security & Dependency Updates → Run workflow
```

**Creating a Release:**
```bash
# Create and push a tag
git tag v3.1.1
git push origin v3.1.1

# Or use manual dispatch in Gitea Actions UI
```

### **Monitoring Workflow Results**

**Check Workflow Status:**
1. Navigate to your repository in Gitea
2. Click on "Actions" tab
3. View workflow runs and their status

**Download Artifacts:**
1. Click on a completed workflow run
2. Scroll to "Artifacts" section
3. Download reports and logs

**View Security Reports:**
1. Go to Security workflow runs
2. Download security-reports artifacts
3. Review JSON reports for vulnerabilities
## 🔍 Troubleshooting

### **Common Issues**

**Workflow Fails on Dependencies:**
```bash
# Check requirements.txt for version conflicts
pip-compile requirements.in
```

**Docker Build Fails:**
```bash
# Test Docker build locally
make docker-build
docker run --rm ping-river-monitor python run.py --test
```

**Security Scan Failures:**
```bash
# Run security scans locally
safety check -r requirements.txt
bandit -r src/
```

**Test Failures:**
```bash
# Run tests locally
make test
python tests/test_integration.py
```

### **Debugging Workflows**

**Enable Debug Logging:**
Add to workflow file:
```yaml
env:
  ACTIONS_STEP_DEBUG: true
  ACTIONS_RUNNER_DEBUG: true
```

**Check Workflow Logs:**
1. Go to failed workflow run
2. Click on failed job
3. Expand failed step to see detailed logs

**Validate Workflow Syntax:**
```bash
# Validate YAML syntax
make validate-workflows
```
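
If the Makefile target is not available, a quick syntax check can be done directly (this one-liner assumes PyYAML is installed):

```bash
python -c "import sys, yaml; [yaml.safe_load(open(f)) for f in sys.argv[1:]]" .gitea/workflows/*.yml
```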

## 📈 Performance Optimization

### **Caching Strategy**
- **Pip Cache**: Cached across workflow runs
- **Docker Layer Cache**: GitHub Actions cache for faster builds
- **Dependency Cache**: Cached based on requirements.txt hash

### **Parallel Execution**
- **Matrix Builds**: Multiple Python versions tested in parallel
- **Independent Jobs**: Security scans run independently of tests
- **Conditional Execution**: Jobs skip when not needed

### **Resource Management**
- **Timeout Settings**: Prevent hanging workflows
- **Resource Limits**: Appropriate runner sizing
- **Artifact Cleanup**: Automatic cleanup of old artifacts
## 🔒 Security Best Practices

### **Secret Management**
- Use Gitea repository secrets for sensitive data
- Never commit secrets to repository
- Rotate secrets regularly
- Use least-privilege access tokens

### **Container Security**
- Multi-stage Docker builds for smaller images
- Non-root user in containers
- Regular base image updates
- Vulnerability scanning before deployment

### **Code Security**
- Automated security scanning in CI/CD
- Dependency vulnerability monitoring
- License compliance checking
- Code quality enforcement
## 📚 Additional Resources

### **Gitea Actions Documentation**
- [Gitea Actions Overview](https://docs.gitea.io/en-us/usage/actions/)
- [Workflow Syntax](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions)
- [Available Actions](https://github.com/marketplace?type=actions)

### **Project-Specific Resources**
- [Contributing Guide](../CONTRIBUTING.md)
- [Deployment Checklist](../DEPLOYMENT_CHECKLIST.md)
- [Project Structure](PROJECT_STRUCTURE.md)

### **Monitoring and Alerts**
- Workflow status badges in README
- Email notifications for failures
- Slack/Discord integration for team updates
- Grafana dashboards for deployment metrics

---

**Workflow Version**: v3.1.0
**Last Updated**: 2025-08-12
**Maintained By**: Ping River Monitor Team
389
docs/HTTPS_CONFIGURATION.md
Normal file
@@ -0,0 +1,389 @@
# HTTPS VictoriaMetrics Configuration Guide

This guide explains how to configure the Thailand Water Monitor to connect to VictoriaMetrics through HTTPS and reverse proxies.

## Configuration Options

### 1. Environment Variables for HTTPS

```bash
# Option 1: Full HTTPS URL (Recommended)
export DB_TYPE=victoriametrics
export VM_HOST=https://vm.example.com
export VM_PORT=443

# Option 2: Host and port separately
export DB_TYPE=victoriametrics
export VM_HOST=vm.example.com
export VM_PORT=443

# Option 3: Custom port with HTTPS
export DB_TYPE=victoriametrics
export VM_HOST=https://vm.example.com
export VM_PORT=8443
```

### 2. Windows PowerShell Configuration

```powershell
# Set environment variables for HTTPS
$env:DB_TYPE="victoriametrics"
$env:VM_HOST="https://vm.example.com"
$env:VM_PORT="443"

# Run the water monitor
python water_scraper_v3.py
```

### 3. Linux/Mac Configuration

```bash
# Set environment variables for HTTPS
export DB_TYPE=victoriametrics
export VM_HOST=https://vm.example.com
export VM_PORT=443

# Run the water monitor
python water_scraper_v3.py
```
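
Internally, the adapter has to turn these two variables into a base URL; the following sketch shows the kind of logic involved (illustrative only, not the exact code in `database_adapters.py`):

```python
import os

def vm_base_url() -> str:
    """Build the VictoriaMetrics base URL from VM_HOST/VM_PORT."""
    host = os.getenv("VM_HOST", "localhost")
    port = os.getenv("VM_PORT", "8428")
    if host.startswith(("http://", "https://")):
        return f"{host}:{port}"  # Scheme supplied explicitly (Options 1 and 3)
    scheme = "https" if port == "443" else "http"  # Plain host: infer from port (Option 2)
    return f"{scheme}://{host}:{port}"
```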

## Reverse Proxy Examples

### 1. Nginx Reverse Proxy

```nginx
server {
    listen 443 ssl http2;
    server_name vm.example.com;

    # SSL Configuration
    ssl_certificate /path/to/certificate.crt;
    ssl_certificate_key /path/to/private.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384;

    # Security headers
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Frame-Options DENY always;
    add_header X-Content-Type-Options nosniff always;

    # Optional: Basic authentication
    # auth_basic "VictoriaMetrics";
    # auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        proxy_pass http://localhost:8428;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # WebSocket support (if needed)
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
}

# Redirect HTTP to HTTPS
server {
    listen 80;
    server_name vm.example.com;
    return 301 https://$server_name$request_uri;
}
```

### 2. Apache Reverse Proxy

```apache
<VirtualHost *:443>
    ServerName vm.example.com

    # SSL Configuration
    SSLEngine on
    SSLCertificateFile /path/to/certificate.crt
    SSLCertificateKeyFile /path/to/private.key
    SSLProtocol all -SSLv3 -TLSv1 -TLSv1.1
    SSLCipherSuite ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384

    # Security headers
    Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains"
    Header always set X-Frame-Options DENY
    Header always set X-Content-Type-Options nosniff

    # Reverse proxy configuration
    ProxyPreserveHost On
    ProxyPass / http://localhost:8428/
    ProxyPassReverse / http://localhost:8428/

    # Optional: Basic authentication
    # AuthType Basic
    # AuthName "VictoriaMetrics"
    # AuthUserFile /etc/apache2/.htpasswd
    # Require valid-user
</VirtualHost>

<VirtualHost *:80>
    ServerName vm.example.com
    Redirect permanent / https://vm.example.com/
</VirtualHost>
```

### 3. Traefik Reverse Proxy

```yaml
# docker-compose.yml with Traefik
version: '3.8'

services:
  traefik:
    image: traefik:v2.10
    command:
      - --api.dashboard=true
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --providers.docker=true
      - --certificatesresolvers.letsencrypt.acme.tlschallenge=true
      - --certificatesresolvers.letsencrypt.acme.email=admin@example.com
      - --certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - letsencrypt:/letsencrypt
    labels:
      - traefik.http.routers.api.rule=Host(`traefik.example.com`)
      - traefik.http.routers.api.tls.certresolver=letsencrypt

  victoriametrics:
    image: victoriametrics/victoria-metrics:latest
    command:
      - '--storageDataPath=/victoria-metrics-data'
      - '--retentionPeriod=2y'
      - '--httpListenAddr=:8428'
    volumes:
      - vm_data:/victoria-metrics-data
    labels:
      - traefik.enable=true
      - traefik.http.routers.vm.rule=Host(`vm.example.com`)
      - traefik.http.routers.vm.tls.certresolver=letsencrypt
      - traefik.http.services.vm.loadbalancer.server.port=8428

volumes:
  vm_data:
  letsencrypt:
```

## Testing HTTPS Configuration

### 1. Test Connection

```bash
# Test HTTPS connection
curl -k https://vm.example.com/health

# Test with specific port
curl -k https://vm.example.com:8443/health

# Test API endpoint
curl -k "https://vm.example.com/api/v1/query?query=up"
```

### 2. Test with Water Monitor

```bash
# Set environment variables
export DB_TYPE=victoriametrics
export VM_HOST=https://vm.example.com
export VM_PORT=443

# Test with demo script
python demo_databases.py victoriametrics

# Run full water monitor
python water_scraper_v3.py
```

### 3. Verify SSL Certificate

```bash
# Check SSL certificate
openssl s_client -connect vm.example.com:443 -servername vm.example.com

# Check certificate expiration
echo | openssl s_client -connect vm.example.com:443 2>/dev/null | openssl x509 -noout -dates
```

## Configuration Examples

### 1. Production HTTPS Setup

```bash
# Environment variables for production
export DB_TYPE=victoriametrics
export VM_HOST=https://metrics.company.com
export VM_PORT=443
export LOG_LEVEL=INFO
export SCRAPING_INTERVAL_HOURS=1

# Run water monitor
python water_scraper_v3.py
```

### 2. Development with Self-Signed Certificate

```bash
# For development with self-signed certificates
export DB_TYPE=victoriametrics
export VM_HOST=https://dev-vm.local
export VM_PORT=443
export PYTHONHTTPSVERIFY=0  # Disable SSL verification (dev only)

python water_scraper_v3.py
```

### 3. Custom Port Configuration

```bash
# Custom HTTPS port
export DB_TYPE=victoriametrics
export VM_HOST=https://vm.example.com
export VM_PORT=8443

python water_scraper_v3.py
```

## Troubleshooting HTTPS Issues

### 1. SSL Certificate Errors

```bash
# Error: SSL certificate verify failed
# Solution: Check certificate validity
openssl x509 -in certificate.crt -text -noout

# Temporary workaround (not recommended for production)
export PYTHONHTTPSVERIFY=0
```

### 2. Connection Timeout

```bash
# Error: Connection timeout
# Check firewall and network connectivity
telnet vm.example.com 443
nc -zv vm.example.com 443
```

### 3. DNS Resolution Issues

```bash
# Error: Name resolution failed
# Check DNS resolution
nslookup vm.example.com
dig vm.example.com
```

### 4. Proxy Configuration Issues

```bash
# Check proxy logs
# Nginx
tail -f /var/log/nginx/error.log

# Apache
tail -f /var/log/apache2/error.log

# Test direct connection to backend
curl http://localhost:8428/health
```

## Security Best Practices

### 1. SSL/TLS Configuration

- Use TLS 1.2 or higher
- Disable weak ciphers
- Enable HSTS headers
- Use strong SSL certificates

### 2. Authentication

```nginx
# Basic authentication in Nginx
auth_basic "VictoriaMetrics Access";
auth_basic_user_file /etc/nginx/.htpasswd;
```

```bash
# Create password file
htpasswd -c /etc/nginx/.htpasswd username
```

### 3. Network Security

- Use firewall rules to restrict access
- Consider VPN for internal access
- Implement rate limiting
- Monitor access logs

### 4. Certificate Management

```bash
# Auto-renewal with Let's Encrypt
certbot renew --dry-run

# Certificate monitoring
echo | openssl s_client -connect vm.example.com:443 2>/dev/null | \
  openssl x509 -noout -dates | grep notAfter
```

## Docker Configuration for HTTPS

### 1. Docker Compose with HTTPS

```yaml
version: '3.8'

services:
  water-monitor:
    build: .
    environment:
      - DB_TYPE=victoriametrics
      - VM_HOST=https://vm.example.com
      - VM_PORT=443
    restart: unless-stopped
    depends_on:
      - victoriametrics

  victoriametrics:
    image: victoriametrics/victoria-metrics:latest
    ports:
      - "8428:8428"
    volumes:
      - vm_data:/victoria-metrics-data
    command:
      - '--storageDataPath=/victoria-metrics-data'
      - '--retentionPeriod=2y'
      - '--httpListenAddr=:8428'

volumes:
  vm_data:
```

### 2. Environment File (.env)

```bash
# .env file
DB_TYPE=victoriametrics
VM_HOST=https://vm.example.com
VM_PORT=443
LOG_LEVEL=INFO
SCRAPING_INTERVAL_HOURS=1
```

This configuration guide provides comprehensive instructions for setting up HTTPS connectivity to VictoriaMetrics through reverse proxies, ensuring secure and reliable data transmission for the Thailand Water Monitor.
136
docs/MIGRATION_QUICKSTART.md
Normal file
@@ -0,0 +1,136 @@
# Geolocation Migration Quick Start

This is a quick reference guide for updating a running Thailand Water Monitor system to add geolocation support for Grafana geomap.

## 🚀 **Quick Migration (5 minutes)**

### **Step 1: Stop Application**
```bash
# Stop the service (choose your method)
sudo systemctl stop water-monitor
# OR
docker stop water-monitor
# OR use Ctrl+C if running manually
```

### **Step 2: Backup Database**
```bash
# SQLite backup
cp water_monitoring.db water_monitoring.db.backup

# PostgreSQL backup
pg_dump water_monitoring > backup.sql

# MySQL backup
mysqldump water_monitoring > backup.sql
```

### **Step 3: Run Migration**
```bash
# Run the automated migration script
python migrate_geolocation.py
```

### **Step 4: Restart Application**
```bash
# Restart the service
sudo systemctl start water-monitor
# OR
docker start water-monitor
# OR
python water_scraper_v3.py
```

## ✅ **Expected Output**
```
2025-07-28 17:30:00,123 - INFO - Starting geolocation column migration...
2025-07-28 17:30:00,124 - INFO - Detected database type: SQLITE
2025-07-28 17:30:00,127 - INFO - Added latitude column
2025-07-28 17:30:00,128 - INFO - Added longitude column
2025-07-28 17:30:00,129 - INFO - Added geohash column
2025-07-28 17:30:00,133 - INFO - ✅ Migration completed successfully!
```

## 🗺️ **Verify Geolocation Works**

### **Check Database**
```bash
# SQLite
sqlite3 water_monitoring.db "SELECT station_code, latitude, longitude, geohash FROM stations WHERE station_code = 'P.1';"

# Expected output: P.1|15.6944|100.2028|w5q6uuhvfcfp25
```

### **Test Application**
```bash
# Run a test cycle
python water_scraper_v3.py --test

# Should complete without errors
```

## 🔧 **Grafana Setup**

### **Query for Geomap**
```sql
SELECT
    s.latitude, s.longitude, s.station_code, s.english_name,
    m.water_level, m.discharge_percent
FROM stations s
JOIN water_measurements m ON s.id = m.station_id
WHERE s.latitude IS NOT NULL
  AND m.timestamp = (SELECT MAX(timestamp) FROM water_measurements WHERE station_id = s.id)
```

### **Geomap Configuration**
1. Create new panel → Select "Geomap"
2. Set **Latitude field**: `latitude`
3. Set **Longitude field**: `longitude`
4. Set **Color field**: `water_level`
5. Set **Size field**: `discharge_percent`
## 🚨 **Troubleshooting**

### **Database Locked**
```bash
sudo systemctl stop water-monitor
pkill -f water_scraper
sleep 5
python migrate_geolocation.py
```

### **Permission Error**
```bash
sudo chown $USER:$USER water_monitoring.db
chmod 664 water_monitoring.db
```

### **Missing Dependencies**
```bash
pip install psycopg2-binary pymysql
```
## 🔄 **Rollback (if needed)**
```bash
# Stop application
sudo systemctl stop water-monitor

# Restore backup
cp water_monitoring.db.backup water_monitoring.db

# Restart
sudo systemctl start water-monitor
```

## 📚 **More Information**
- **Full Guide**: See `GEOLOCATION_GUIDE.md`
- **Migration Script**: `migrate_geolocation.py`
- **Database Schema**: Updated with latitude, longitude, geohash columns

## 🎯 **What You Get**
- ✅ **P.1 Station** ready for geomap (Nawarat Bridge)
- ✅ **Database Schema** updated for all 16 stations
- ✅ **Grafana Compatible** data structure
- ✅ **Backward Compatible** - existing data preserved

**Total Time**: ~5 minutes for complete migration
206
docs/PROJECT_STATUS.md
Normal file
@@ -0,0 +1,206 @@
# Thailand Water Monitor - Current Project Status

## 📁 **Clean Project Structure**

The project has been cleaned up and organized with the following structure:

```
water_level_monitor/
├── 📄 .gitignore                          # Git ignore rules
├── 📄 README.md                           # Main project documentation
├── 📄 requirements.txt                    # Python dependencies
├── 📄 config.py                           # Configuration management
├── 📄 water_scraper_v3.py                 # Main application (15-min scheduler)
├── 📄 database_adapters.py                # Multi-database support
├── 📄 demo_databases.py                   # Database demonstration
├── 📄 Dockerfile                          # Container configuration
├── 📄 docker-compose.victoriametrics.yml  # VictoriaMetrics stack
├── 📚 Documentation/
│   ├── 📄 DATABASE_DEPLOYMENT_GUIDE.md    # Multi-database setup guide
│   ├── 📄 DEBIAN_TROUBLESHOOTING.md       # Linux deployment guide
│   ├── 📄 ENHANCED_SCHEDULER_GUIDE.md     # 15-minute scheduler guide
│   ├── 📄 GAP_FILLING_GUIDE.md            # Data gap filling guide
│   ├── 📄 HTTPS_CONFIGURATION.md          # HTTPS setup guide
│   └── 📄 VICTORIAMETRICS_SETUP.md        # VictoriaMetrics guide
└── 📁 grafana/                            # Grafana configuration
    ├── 📁 provisioning/
    │   ├── 📁 datasources/
    │   │   └── 📄 victoriametrics.yml     # VictoriaMetrics data source
    │   └── 📁 dashboards/
    │       └── 📄 dashboard.yml           # Dashboard provider config
    └── 📁 dashboards/
        └── 📄 water-monitoring-dashboard.json  # Pre-built dashboard
```

## 🧹 **Files Removed During Cleanup**

### **Old Data Files**
- ❌ `thailand_water_data_v2.csv` - Old CSV export
- ❌ `water_monitor.log` - Log file (regenerated automatically)
- ❌ `water_monitoring.db` - SQLite database (recreated automatically)

### **Outdated Documentation**
- ❌ `FINAL_SUMMARY.md` - Contained references to non-existent v2 files
- ❌ `PROJECT_SUMMARY.md` - Outdated project information

### **System Files**
- ❌ `__pycache__/` - Python compiled files directory

## ✅ **Current Features**

### **Enhanced 15-Minute Scheduler**
- **Timing**: Runs every 15 minutes (1:00, 1:15, 1:30, 1:45, 2:00, etc.)
- **Full Checks**: At :00 minutes (gap filling + data updates)
- **Quick Checks**: At :15, :30, :45 minutes (data fetch only)
- **Gap Filling**: Automatically fills missing historical data
- **Data Updates**: Updates existing records when values change

### **Multi-Database Support**
- **VictoriaMetrics** (Recommended) - High-performance time-series
- **InfluxDB** - Purpose-built time-series database
- **PostgreSQL + TimescaleDB** - Relational with time-series optimization
- **MySQL** - Traditional relational database
- **SQLite** - Local development and testing

### **Production Features**
- **Docker Support**: Complete containerization
- **Grafana Integration**: Pre-built dashboards
- **HTTPS Configuration**: Secure deployment options
- **Health Monitoring**: Comprehensive logging and error handling
- **Gap Detection**: Automatic identification of missing data
- **Retry Logic**: Database lock handling and network error recovery
## 🚀 **Quick Start**

### **1. Basic Setup (SQLite)**
```bash
cd water_level_monitor
pip install -r requirements.txt
python water_scraper_v3.py
```

### **2. VictoriaMetrics Setup**
```bash
# Start VictoriaMetrics + Grafana
docker-compose -f docker-compose.victoriametrics.yml up -d

# Configure environment
export DB_TYPE=victoriametrics
export VM_HOST=localhost
export VM_PORT=8428

# Run monitor
python water_scraper_v3.py
```

### **3. Test Different Databases**
```bash
# Test all supported databases
python demo_databases.py all

# Test specific database
python demo_databases.py victoriametrics
```

## 📊 **Data Collection**

### **Station Coverage**
- **16 Water Monitoring Stations** across Thailand
- **Accurate Station Codes**: P.1, P.20, P.21, P.4A, P.5, P.67, P.75, P.76, P.77, P.81, P.82, P.84, P.85, P.87, P.92, P.103
- **Bilingual Names**: Thai and English station identification

### **Metrics Collected**
- 🌊 **Water Level**: Measured in meters (m)
- 💧 **Discharge**: Measured in cubic meters per second (cms)
- 📊 **Discharge Percentage**: Relative to station capacity
- ⏰ **Timestamp**: Hour 24 handling (midnight = 00:00 next day)

### **Data Frequency**
- **Every 15 Minutes**: Continuous monitoring
- **~300+ Data Points**: Per collection cycle
- **Automatic Gap Filling**: Historical data recovery
- **Data Updates**: Changed values detection and correction
## 🔧 **Command Line Tools**

### **Main Application**
```bash
python water_scraper_v3.py          # Run continuous monitoring
python water_scraper_v3.py --test   # Single test cycle
python water_scraper_v3.py --help   # Show help
```

### **Gap Management**
```bash
python water_scraper_v3.py --check-gaps [days]   # Check for missing data
python water_scraper_v3.py --fill-gaps [days]    # Fill missing data gaps
python water_scraper_v3.py --update-data [days]  # Update existing data
```

### **Database Testing**
```bash
python demo_databases.py                  # SQLite demo
python demo_databases.py victoriametrics  # VictoriaMetrics demo
python demo_databases.py all              # Test all databases
```

## 📈 **Monitoring & Visualization**

### **Grafana Dashboard**
- **URL**: http://localhost:3000 (when using docker-compose)
- **Username**: admin
- **Password**: admin_password
- **Features**: Time series charts, status tables, gauges, alerts

### **VictoriaMetrics API**
- **URL**: http://localhost:8428
- **Health**: http://localhost:8428/health
- **Metrics**: http://localhost:8428/metrics
- **Query API**: http://localhost:8428/api/v1/query
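
For a quick check that data is flowing, the query API above can be hit directly (the `water_level` metric name matches what the VictoriaMetrics adapter writes; adjust if your deployment differs):

```bash
# Instant query: latest water_level samples for all stations
curl "http://localhost:8428/api/v1/query?query=water_level"
```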
## 🛡️ **Security & Production**

### **HTTPS Configuration**
- Complete guide in `HTTPS_CONFIGURATION.md`
- SSL certificate setup
- Reverse proxy configuration
- Security best practices

### **Deployment Options**
- **Docker**: Containerized deployment
- **Systemd**: Linux service configuration
- **Cloud**: AWS, GCP, Azure deployment guides
- **Monitoring**: Health checks and alerting

## 📚 **Documentation**

### **Available Guides**
1. **README.md** - Main project documentation
2. **DATABASE_DEPLOYMENT_GUIDE.md** - Multi-database setup
3. **ENHANCED_SCHEDULER_GUIDE.md** - 15-minute scheduler details
4. **GAP_FILLING_GUIDE.md** - Data integrity and gap filling
5. **DEBIAN_TROUBLESHOOTING.md** - Linux deployment troubleshooting
6. **VICTORIAMETRICS_SETUP.md** - VictoriaMetrics configuration
7. **HTTPS_CONFIGURATION.md** - Secure deployment setup

### **Key Features Documented**
- ✅ Installation and configuration
- ✅ Multi-database support
- ✅ 15-minute scheduling system
- ✅ Gap filling and data integrity
- ✅ Production deployment
- ✅ Monitoring and troubleshooting
- ✅ Security configuration
## 🎯 **Project Status: PRODUCTION READY**

The Thailand Water Monitor is now:
- ✅ **Clean**: All old and redundant files removed
- ✅ **Organized**: Clear project structure with proper documentation
- ✅ **Enhanced**: 15-minute scheduling with gap filling
- ✅ **Scalable**: Multi-database support with VictoriaMetrics
- ✅ **Secure**: HTTPS configuration and security best practices
- ✅ **Monitored**: Comprehensive logging and Grafana dashboards
- ✅ **Documented**: Complete guides for all features and deployment options

The project is ready for production deployment with professional-grade monitoring capabilities.
272
docs/PROJECT_STRUCTURE.md
Normal file
@@ -0,0 +1,272 @@
# 🏗️ Project Structure - Northern Thailand Ping River Monitor

## 📁 Directory Layout

```
Northern-Thailand-Ping-River-Monitor/
├── 📁 src/                                # Main application source code
│   ├── __init__.py                        # Package initialization
│   ├── main.py                            # CLI entry point and main application
│   ├── water_scraper_v3.py                # Core data collection engine
│   ├── web_api.py                         # FastAPI web interface
│   ├── config.py                          # Configuration management
│   ├── database_adapters.py               # Database abstraction layer
│   ├── models.py                          # Data models and type definitions
│   ├── exceptions.py                      # Custom exception classes
│   ├── validators.py                      # Data validation layer
│   ├── metrics.py                         # Metrics collection system
│   ├── health_check.py                    # Health monitoring system
│   ├── rate_limiter.py                    # Rate limiting and request tracking
│   └── logging_config.py                  # Enhanced logging configuration
├── 📁 docs/                               # Documentation files
│   ├── STATION_MANAGEMENT_GUIDE.md        # Station management documentation
│   ├── ENHANCEMENT_SUMMARY.md             # Feature enhancement summary
│   └── PROJECT_STRUCTURE.md               # This file
├── 📁 scripts/                            # Utility scripts
│   └── migrate_geolocation.py             # Database migration script
├── 📁 grafana/                            # Grafana configuration
│   ├── dashboards/                        # Dashboard definitions
│   └── provisioning/                      # Grafana provisioning config
├── 📁 tests/                              # Test files
│   ├── test_integration.py                # Integration test suite
│   ├── test_station_management.py         # Station management tests
│   └── test_api.py                        # API endpoint tests
├── 📄 run.py                              # Simple startup script
├── 📄 requirements.txt                    # Production dependencies
├── 📄 requirements-dev.txt                # Development dependencies
├── 📄 setup.py                            # Package installation script
├── 📄 Dockerfile                          # Docker container definition
├── 📄 docker-compose.victoriametrics.yml  # Complete stack deployment
├── 📄 Makefile                            # Common development tasks
├── 📄 .env.example                        # Environment configuration template
├── 📄 .gitignore                          # Git ignore patterns
├── 📄 .gitlab-ci.yml                      # CI/CD pipeline configuration
├── 📄 LICENSE                             # MIT license
├── 📄 README.md                           # Main project documentation
└── 📄 CONTRIBUTING.md                     # Contribution guidelines
```

## 🔧 Core Components

### **Application Layer**
- **`src/main.py`** - Command-line interface and application orchestration
- **`src/web_api.py`** - FastAPI web interface with REST endpoints
- **`src/water_scraper_v3.py`** - Core data collection and processing engine

### **Data Layer**
- **`src/database_adapters.py`** - Multi-database support (SQLite, MySQL, PostgreSQL, InfluxDB, VictoriaMetrics)
- **`src/models.py`** - Pydantic data models and type definitions
- **`src/validators.py`** - Data validation and sanitization

### **Infrastructure Layer**
- **`src/config.py`** - Configuration management with environment variable support
- **`src/logging_config.py`** - Structured logging with rotation and colors
- **`src/metrics.py`** - Application metrics collection (counters, gauges, histograms)
- **`src/health_check.py`** - System health monitoring and status checks

### **Utility Layer**
- **`src/exceptions.py`** - Custom exception hierarchy
- **`src/rate_limiter.py`** - API rate limiting and request tracking
## 🌐 Web API Structure

### **Endpoints Organization**
```
/                          # Dashboard homepage
├── /health                # System health status
├── /metrics               # Application metrics
├── /config                # Configuration (masked)
├── /stations              # Station management
│   ├── GET /              # List all stations
│   ├── POST /             # Create new station
│   ├── GET /{id}          # Get specific station
│   ├── PUT /{id}          # Update station
│   └── DELETE /{id}       # Delete station
├── /measurements          # Data access
│   ├── /latest            # Latest measurements
│   └── /station/{code}    # Station-specific data
└── /scraping              # Data collection control
    ├── /trigger           # Manual data collection
    └── /status            # Scraping status
```
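
As a quick smoke test of the endpoints above (host and port are assumptions for a local FastAPI deployment; actual response fields depend on the deployed version):

```bash
curl http://localhost:8000/health               # System health status
curl http://localhost:8000/stations             # List all stations
curl http://localhost:8000/measurements/latest  # Latest measurements
```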

### **API Models**
- **Request Models**: Station creation/update, query parameters
- **Response Models**: Station info, measurements, health status
- **Error Models**: Standardized error responses

## 🗄️ Database Architecture

### **Supported Databases**
1. **SQLite** - Local development and testing
2. **MySQL** - Traditional relational database
3. **PostgreSQL** - Advanced relational with TimescaleDB support
4. **InfluxDB** - Purpose-built time-series database
5. **VictoriaMetrics** - High-performance metrics storage

### **Schema Design**
```sql
-- Stations table
stations (
    id INTEGER PRIMARY KEY,
    station_code VARCHAR(10) UNIQUE,
    thai_name VARCHAR(255),
    english_name VARCHAR(255),
    latitude DECIMAL(10,8),
    longitude DECIMAL(11,8),
    geohash VARCHAR(20),
    status VARCHAR(20),
    created_at TIMESTAMP,
    updated_at TIMESTAMP
)

-- Measurements table
water_measurements (
    id BIGINT PRIMARY KEY,
    timestamp DATETIME,
    station_id INTEGER,
    water_level DECIMAL(10,3),
    discharge DECIMAL(10,2),
    discharge_percent DECIMAL(5,2),
    status VARCHAR(20),
    created_at TIMESTAMP,
    FOREIGN KEY (station_id) REFERENCES stations(id),
    UNIQUE(timestamp, station_id)
)
```
|
||||
|
||||
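
The adapter pattern behind the multi-database support can be illustrated with a minimal sketch. This is not the actual `src/database_adapters.py`; the interface and the SQLite example below are assumptions that mirror the schema above.

```python
import sqlite3
from abc import ABC, abstractmethod
from datetime import datetime
from typing import List

class DatabaseAdapter(ABC):
    """Common interface each backend (SQLite, MySQL, PostgreSQL, ...) implements."""

    @abstractmethod
    def save_measurement(self, station_code: str, timestamp: datetime,
                         water_level: float, discharge: float) -> None: ...

    @abstractmethod
    def latest_measurements(self) -> List[dict]: ...

class SQLiteAdapter(DatabaseAdapter):
    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            """CREATE TABLE IF NOT EXISTS water_measurements (
                   timestamp TEXT, station_code TEXT,
                   water_level REAL, discharge REAL,
                   UNIQUE(timestamp, station_code))"""
        )

    def save_measurement(self, station_code, timestamp, water_level, discharge):
        # INSERT OR REPLACE honours the UNIQUE constraint, so a re-scraped
        # hour updates the existing row instead of duplicating it.
        self.conn.execute(
            "INSERT OR REPLACE INTO water_measurements VALUES (?, ?, ?, ?)",
            (timestamp.isoformat(), station_code, water_level, discharge),
        )
        self.conn.commit()

    def latest_measurements(self):
        cur = self.conn.execute(
            "SELECT station_code, MAX(timestamp), water_level, discharge "
            "FROM water_measurements GROUP BY station_code"
        )
        cols = ("station_code", "timestamp", "water_level", "discharge")
        return [dict(zip(cols, row)) for row in cur.fetchall()]

def get_adapter(db_type: str) -> DatabaseAdapter:
    """Factory keyed on DB_TYPE; other backends would be added here."""
    if db_type == "sqlite":
        return SQLiteAdapter()
    raise ValueError(f"Unsupported DB_TYPE: {db_type}")
```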
## 🐳 Docker Architecture

### **Multi-Stage Build**
1. **Builder Stage** - Compile dependencies and build artifacts
2. **Production Stage** - Minimal runtime environment

### **Service Composition**
- **ping-river-monitor** - Data collection service
- **ping-river-api** - Web API service
- **victoriametrics** - Time-series database
- **grafana** - Visualization dashboard

## 📊 Monitoring Architecture

### **Metrics Collection**
- **Counters** - API requests, database operations, scraping cycles
- **Gauges** - Current values, connection status, resource usage
- **Histograms** - Response times, processing durations
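
As a rough illustration of these three metric kinds (not the actual `src/metrics.py` API), a minimal in-process collector might look like:

```python
import time
from collections import defaultdict

class MetricsCollector:
    """Minimal in-process collector for the three metric kinds above."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.gauges = {}
        self.histograms = defaultdict(list)

    def inc(self, name: str, value: int = 1) -> None:
        self.counters[name] += value            # e.g. api_requests_total

    def set_gauge(self, name: str, value: float) -> None:
        self.gauges[name] = value               # e.g. db_connection_up

    def observe(self, name: str, value: float) -> None:
        self.histograms[name].append(value)     # e.g. scrape_duration_seconds

metrics = MetricsCollector()
start = time.perf_counter()
# ... one scraping cycle would run here ...
metrics.inc("scraping_cycles_total")
metrics.observe("scrape_duration_seconds", time.perf_counter() - start)
```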

### **Health Checks**
- **Database Health** - Connection status, data freshness
- **API Health** - External API availability, response times
- **System Health** - Memory usage, disk space, CPU load

### **Logging Levels**
- **DEBUG** - Detailed execution information
- **INFO** - General operational messages
- **WARNING** - Potential issues and recoverable errors
- **ERROR** - Serious problems requiring attention
- **CRITICAL** - System-threatening issues

## 🔧 Configuration Management

### **Environment Variables**
```bash
# Database
DB_TYPE=victoriametrics
VM_HOST=localhost
VM_PORT=8428

# Application
SCRAPING_INTERVAL_HOURS=1
LOG_LEVEL=INFO
DATA_RETENTION_DAYS=365

# Security
SECRET_KEY=your-secret-key
API_KEY=your-api-key
```

### **Configuration Hierarchy**
1. Environment variables (highest priority)
2. `.env` file
3. Default values in `config.py` (lowest priority)
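
A minimal sketch of this lookup order, assuming a simple hand-rolled `.env` parser rather than the project's actual `config.py`:

```python
import os
from pathlib import Path
from typing import Optional

DEFAULTS = {
    "DB_TYPE": "sqlite",
    "VM_HOST": "localhost",
    "VM_PORT": "8428",
    "SCRAPING_INTERVAL_HOURS": "1",
    "LOG_LEVEL": "INFO",
}

def load_env_file(path: str = ".env") -> dict:
    """Parse KEY=VALUE lines, ignoring blanks and # comments."""
    values = {}
    env_file = Path(path)
    if env_file.exists():
        for line in env_file.read_text().splitlines():
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                values[key.strip()] = value.strip()
    return values

def get_setting(key: str) -> Optional[str]:
    # Priority: real environment variable > .env entry > built-in default.
    return os.environ.get(key) or load_env_file().get(key) or DEFAULTS.get(key)

print(get_setting("DB_TYPE"))
```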

## 🧪 Testing Architecture

### **Test Categories**
- **Unit Tests** - Individual component testing
- **Integration Tests** - System component interaction
- **API Tests** - Endpoint functionality and responses
- **Performance Tests** - Load and stress testing

### **Test Data**
- **Mock Data** - Simulated API responses
- **Test Database** - Isolated test environment
- **Fixtures** - Reusable test data sets

## 📦 Deployment Architecture

### **Development**
```bash
python run.py --web-api    # Local development server
```

### **Production**
```bash
docker-compose up -d       # Full stack deployment
```

### **CI/CD Pipeline**
1. **Test Stage** - Run all tests and quality checks
2. **Build Stage** - Create Docker images
3. **Deploy Stage** - Deploy to staging/production
4. **Health Check** - Verify deployment success

## 🔒 Security Architecture

### **Input Validation**
- Pydantic models for API requests
- Data range validation for measurements
- SQL injection prevention through ORM

### **Authentication** (Future)
- API key authentication
- JWT token support
- Role-based access control

### **Data Protection**
- Environment variable configuration
- Sensitive data masking in logs
- HTTPS support for production

## 📈 Performance Architecture

### **Optimization Strategies**
- Database connection pooling
- Query optimization and indexing
- Response caching for static data
- Async processing for I/O operations

### **Scalability Considerations**
- Horizontal scaling with load balancers
- Database read replicas
- Microservice architecture readiness
- Container orchestration support

## 🔄 Data Flow Architecture

### **Collection Flow**
```
External API → Rate Limiter → Data Validator → Database Adapter → Database
```
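
A compact sketch of that pipeline (the class names and the 0-30 m plausibility range are illustrative assumptions, not the actual module API):

```python
from datetime import datetime

class RateLimiter:
    def acquire(self) -> None:
        """Placeholder: block here until a request slot is free."""

class Validator:
    def check(self, m: dict):
        # Drop physically implausible readings (range is an assumed example).
        return m if 0 <= m["water_level"] <= 30 else None

def collection_cycle(fetch, limiter, validator, save):
    limiter.acquire()                              # Rate Limiter
    rows = fetch()                                 # External API
    for m in (validator.check(r) for r in rows):   # Data Validator
        if m is not None:
            save(m)                                # Database Adapter -> Database

stored = []
collection_cycle(
    fetch=lambda: [{"station_code": "P.1",
                    "timestamp": datetime(2025, 7, 24, 1),
                    "water_level": 2.45}],
    limiter=RateLimiter(),
    validator=Validator(),
    save=stored.append,
)
print(stored)  # one validated measurement ready for storage
```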

### **API Flow**
```
HTTP Request → FastAPI → Business Logic → Database Adapter → HTTP Response
```

### **Monitoring Flow**
```
Application Events → Metrics Collector → Health Checks → Monitoring Dashboard
```

This architecture provides a solid foundation for a production-ready water monitoring system with excellent maintainability, scalability, and observability.
241
docs/STATION_MANAGEMENT_GUIDE.md
Normal file
@@ -0,0 +1,241 @@
# 🏔️ Station Management Guide - Northern Thailand Ping River Monitor

## 🎯 **Overview**

The Northern Thailand Ping River Monitor now includes comprehensive station management capabilities, allowing you to dynamically add, update, and remove monitoring stations through the web API.

## 🌊 **Current Coverage**

The system currently monitors **16 water stations** along the Ping River Basin:

### **Upper Ping River (Chiang Mai Province)**
- **P.20** - Ban Chiang Dao (บ้านเชียงดาว)
- **P.75** - Ban Chai Lat (บ้านช่อแล)
- **P.92** - Ban Muang Aut (บ้านเมืองกึ๊ด)
- **P.4A** - Ban Mae Taeng (บ้านแม่แตง)
- **P.67** - Ban Tae (บ้านแม่แต)
- **P.21** - Ban Rim Tai (บ้านริมใต้)
- **P.103** - Ring Bridge 3 (สะพานวงแหวนรอบ 3)

### **Middle Ping River**
- **P.1** - Nawarat Bridge (สะพานนวรัฐ) - *Main reference station*
- **P.82** - Ban Sob win (บ้านสบวิน)
- **P.84** - Ban Panton (บ้านพันตน)
- **P.81** - Ban Pong (บ้านโป่ง)
- **P.5** - Tha Nang Bridge (สะพานท่านาง)

### **Lower Ping River**
- **P.77** - Baan Sop Mae Sapuord (บ้านสบแม่สะป๊วด)
- **P.87** - Ban Pa Sang (บ้านป่าซาง)
- **P.76** - Banb Mae I Hai (บ้านแม่อีไฮ)
- **P.85** - Baan Lai Kaew (บ้านหล่ายแก้ว)

## 🔧 **Station Management API**

### **List All Stations**
```bash
GET /stations
```

**Response:**
```json
[
  {
    "station_id": 1,
    "station_code": "P.20",
    "thai_name": "บ้านเชียงดาว",
    "english_name": "Ban Chiang Dao",
    "latitude": 19.36731448032191,
    "longitude": 98.9688487015384,
    "geohash": null,
    "status": "active"
  }
]
```

### **Get Specific Station**
```bash
GET /stations/{station_id}
```

### **Add New Station**
```bash
POST /stations
Content-Type: application/json

{
  "station_code": "P.NEW",
  "thai_name": "สถานีใหม่",
  "english_name": "New Station",
  "latitude": 18.7875,
  "longitude": 99.0045,
  "geohash": "w5q6uuhvfcfp25",
  "status": "active"
}
```

### **Update Station Information**
```bash
PUT /stations/{station_id}
Content-Type: application/json

{
  "thai_name": "ชื่อใหม่",
  "english_name": "Updated Name",
  "latitude": 18.8000,
  "longitude": 99.0100
}
```

### **Delete Station**
```bash
DELETE /stations/{station_id}
```

## 🧪 **Testing Station Management**

Use the provided test script to verify station management functionality:

```bash
# Test all station management endpoints
python test_station_management.py
```

This will:
1. List existing stations
2. Create a test station
3. Retrieve station details
4. Update station information
5. Verify changes
6. Delete the test station
7. Confirm deletion
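
If you prefer to script the same round trip yourself, a minimal sketch with `requests` might look like the following. The payload fields come from the API examples above; the exact response shapes and the 404-on-missing behaviour are assumptions, and this is not the actual test script.

```python
import requests

BASE = "http://localhost:8000"

# 1-2. List existing stations, then create a test station
print(len(requests.get(f"{BASE}/stations").json()), "stations before")
created = requests.post(f"{BASE}/stations", json={
    "station_code": "P.TEST",
    "thai_name": "สถานีทดสอบ",
    "english_name": "Test Station",
    "latitude": 18.80,
    "longitude": 99.00,
    "status": "active",
}).json()
station_id = created["station_id"]

# 3-5. Retrieve, update, and verify the change
requests.put(f"{BASE}/stations/{station_id}", json={"english_name": "Renamed Station"})
assert requests.get(f"{BASE}/stations/{station_id}").json()["english_name"] == "Renamed Station"

# 6-7. Delete the test station and confirm it is gone
requests.delete(f"{BASE}/stations/{station_id}")
assert requests.get(f"{BASE}/stations/{station_id}").status_code == 404
```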

## 📊 **Station Data Model**

### **Required Fields**
- `station_code`: Unique identifier (e.g., "P.1", "P.20")
- `thai_name`: Thai language name
- `english_name`: English language name

### **Optional Fields**
- `latitude`: GPS latitude coordinate (-90 to 90)
- `longitude`: GPS longitude coordinate (-180 to 180)
- `geohash`: Geohash string for location
- `status`: Station status ("active", "inactive", "maintenance", "error")

### **Validation Rules**
- Station codes must be unique
- Latitude must be between -90 and 90
- Longitude must be between -180 and 180
- Names cannot be empty
- Status must be a valid enum value
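
These rules map naturally onto a Pydantic model. The sketch below is illustrative rather than the project's actual `src/models.py`; note that uniqueness of `station_code` has to be enforced at the database layer, not in the model.

```python
from enum import Enum
from typing import Optional

from pydantic import BaseModel, Field, ValidationError

class StationStatus(str, Enum):
    active = "active"
    inactive = "inactive"
    maintenance = "maintenance"
    error = "error"

class Station(BaseModel):
    station_code: str = Field(..., min_length=1)   # uniqueness is a database-level check
    thai_name: str = Field(..., min_length=1)      # names cannot be empty
    english_name: str = Field(..., min_length=1)
    latitude: Optional[float] = Field(default=None, ge=-90, le=90)
    longitude: Optional[float] = Field(default=None, ge=-180, le=180)
    geohash: Optional[str] = None
    status: StationStatus = StationStatus.active

try:
    Station(station_code="P.X", thai_name="ทดสอบ", english_name="Test", latitude=200)
except ValidationError as exc:
    print(exc)  # latitude outside -90..90 is rejected
```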

## 🌐 **Web Interface**

Access the station management interface through the web dashboard:

1. **Start the API server:**
   ```bash
   python run.py --web-api
   ```

2. **Open your browser:**
   - Dashboard: http://localhost:8000
   - API Documentation: http://localhost:8000/docs

3. **Use the interactive API docs** to test station management endpoints

## 🔄 **Integration with Data Collection**

- **Dynamic Station Discovery**: New stations are automatically included in data collection
- **Real-time Updates**: Station information changes are reflected immediately
- **Data Continuity**: Historical data is preserved when updating station details
- **Error Handling**: Invalid stations are skipped during data collection

## 📍 **Geographic Coverage**

The Ping River Basin monitoring network covers:

- **Total Distance**: ~400 km from Chiang Dao to Nakhon Sawan
- **Elevation Range**: 300m to 1,200m above sea level
- **Catchment Area**: ~25,000 km²
- **Major Cities**: Chiang Mai, Lamphun, Tak, Nakhon Sawan

## 🚀 **Usage Examples**

### **Add a New Upstream Station**
```bash
curl -X POST "http://localhost:8000/stations" \
  -H "Content-Type: application/json" \
  -d '{
    "station_code": "P.UPSTREAM",
    "thai_name": "สถานีต้นน้ำ",
    "english_name": "Upstream Station",
    "latitude": 19.5000,
    "longitude": 98.9000,
    "status": "active"
  }'
```

### **Update Station Coordinates**
```bash
curl -X PUT "http://localhost:8000/stations/1" \
  -H "Content-Type: application/json" \
  -d '{
    "latitude": 19.3700,
    "longitude": 98.9700
  }'
```

### **Mark Station for Maintenance**
```bash
curl -X PUT "http://localhost:8000/stations/5" \
  -H "Content-Type: application/json" \
  -d '{
    "status": "maintenance"
  }'
```

## 🔒 **Best Practices**

### **Station Naming**
- Use a consistent code format (P.XX)
- Include both Thai and English names
- Use descriptive location names

### **Coordinate Accuracy**
- Use high-precision GPS coordinates (6+ decimal places)
- Verify coordinates match the actual station location
- Include a geohash for efficient spatial queries

### **Status Management**
- Set status to "maintenance" during repairs
- Use "inactive" for temporarily offline stations
- Use "error" for stations with data quality issues

### **Data Integrity**
- Test new stations before adding them to production
- Back up station configuration before major changes
- Monitor data quality after station updates

## 🎯 **Future Enhancements**

Planned improvements for station management:

1. **Bulk Operations** - Import/export multiple stations
2. **Station Groups** - Organize stations by river section
3. **Automated Validation** - GPS coordinate verification
4. **Historical Tracking** - Track station configuration changes
5. **Alert Integration** - Notifications for station status changes
6. **Map Interface** - Visual station management on an interactive map

## 📞 **Support**

For station management issues:

1. Check the API documentation at `/docs`
2. Run the test script: `python test_station_management.py`
3. Review logs for error details
4. Verify station data format and validation rules

The station management system provides flexible control over your monitoring network while maintaining data integrity and system reliability.
443
docs/VICTORIAMETRICS_SETUP.md
Normal file
@@ -0,0 +1,443 @@
# VictoriaMetrics Setup Guide for Thailand Water Monitor

This guide provides comprehensive instructions for setting up VictoriaMetrics as the time-series database backend for the Thailand Water Monitor.

## Why VictoriaMetrics?

VictoriaMetrics is an excellent choice for water monitoring data because:

- **High Performance**: Up to 10x faster than InfluxDB
- **Low Resource Usage**: Uses 10x less RAM than Prometheus
- **Better Compression**: Up to 70x better compression than Prometheus
- **Prometheus Compatible**: Drop-in replacement for Prometheus
- **Easy to Deploy**: Single binary, no dependencies
- **Cost Effective**: Open source with commercial support available

## Quick Start

### 1. Environment Variables

Set these environment variables to configure VictoriaMetrics:

```bash
# Windows (PowerShell)
$env:DB_TYPE="victoriametrics"
$env:VM_HOST="localhost"
$env:VM_PORT="8428"

# Linux/Mac
export DB_TYPE=victoriametrics
export VM_HOST=localhost
export VM_PORT=8428
```

### 2. Start VictoriaMetrics with Docker

```bash
# Simple setup
docker run -d \
  --name victoriametrics \
  -p 8428:8428 \
  -v victoria-metrics-data:/victoria-metrics-data \
  victoriametrics/victoria-metrics:latest \
  --storageDataPath=/victoria-metrics-data \
  --retentionPeriod=2y \
  --httpListenAddr=:8428

# Verify it's running
curl http://localhost:8428/health
```

### 3. Run the Water Monitor

```bash
python water_scraper_v3.py
```

### 4. Access Grafana Dashboard

```bash
# Start with Docker Compose (includes Grafana)
docker-compose -f docker-compose.victoriametrics.yml up -d

# Access Grafana at http://localhost:3000
# Username: admin
# Password: admin_password
```

## Production Setup

### Docker Compose Configuration

Use the provided `docker-compose.victoriametrics.yml` file:

```bash
# Start the complete stack
docker-compose -f docker-compose.victoriametrics.yml up -d

# Check status
docker-compose -f docker-compose.victoriametrics.yml ps

# View logs
docker-compose -f docker-compose.victoriametrics.yml logs -f
```

### Manual VictoriaMetrics Configuration

#### High-Performance Configuration

```bash
docker run -d \
  --name victoriametrics \
  -p 8428:8428 \
  -v victoria-metrics-data:/victoria-metrics-data \
  victoriametrics/victoria-metrics:latest \
  --storageDataPath=/victoria-metrics-data \
  --retentionPeriod=2y \
  --httpListenAddr=:8428 \
  --maxConcurrentInserts=32 \
  --search.maxQueryDuration=60s \
  --search.maxConcurrentRequests=16 \
  --dedup.minScrapeInterval=30s \
  --memory.allowedPercent=80 \
  --loggerLevel=INFO \
  --loggerFormat=json \
  --search.maxSeries=1000000 \
  --search.maxPointsPerTimeseries=100000
```

#### Configuration Parameters Explained

| Parameter | Description | Recommended Value |
|-----------|-------------|-------------------|
| `--storageDataPath` | Data storage directory | `/victoria-metrics-data` |
| `--retentionPeriod` | How long to keep data | `2y` (2 years) |
| `--httpListenAddr` | HTTP listen address | `:8428` |
| `--maxConcurrentInserts` | Max concurrent inserts | `32` |
| `--search.maxQueryDuration` | Max query duration | `60s` |
| `--search.maxConcurrentRequests` | Max concurrent queries | `16` |
| `--dedup.minScrapeInterval` | Deduplication interval | `30s` |
| `--memory.allowedPercent` | Max memory usage | `80` |
| `--loggerLevel` | Log level | `INFO` |
| `--search.maxSeries` | Max time series | `1000000` |

### Kubernetes Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: victoriametrics
spec:
  replicas: 1
  selector:
    matchLabels:
      app: victoriametrics
  template:
    metadata:
      labels:
        app: victoriametrics
    spec:
      containers:
        - name: victoriametrics
          image: victoriametrics/victoria-metrics:latest
          ports:
            - containerPort: 8428
          args:
            - --storageDataPath=/victoria-metrics-data
            - --retentionPeriod=2y
            - --httpListenAddr=:8428
            - --maxConcurrentInserts=32
          volumeMounts:
            - name: storage
              mountPath: /victoria-metrics-data
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2000m"
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: victoriametrics-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: victoriametrics
spec:
  selector:
    app: victoriametrics
  ports:
    - port: 8428
      targetPort: 8428
  type: ClusterIP
```

## Data Queries

### HTTP API Queries

VictoriaMetrics provides a Prometheus-compatible HTTP API:

```bash
# Current water levels for all stations
curl "http://localhost:8428/api/v1/query?query=water_level"

# Water levels for a specific station
curl "http://localhost:8428/api/v1/query?query=water_level{station_code=\"P.1\"}"

# Average discharge over the last hour
curl "http://localhost:8428/api/v1/query?query=avg_over_time(water_discharge[1h])"

# High discharge alerts (>80%)
curl "http://localhost:8428/api/v1/query?query=water_discharge_percent>80"

# Time range query (last 6 hours)
START=$(date -d '6 hours ago' +%s)
END=$(date +%s)
curl "http://localhost:8428/api/v1/query_range?query=water_level&start=${START}&end=${END}&step=300"
```

### PromQL Examples

```promql
# Current water levels
water_level

# Water level trends (last 24h)
water_level[24h]

# Discharge rates by station
water_discharge{station_code="P.1"}

# Average discharge across all stations
avg(water_discharge)

# Stations with high discharge (>80%)
water_discharge_percent > 80

# Rate of change in water level
rate(water_level[5m])

# Maximum water level in the last hour
max_over_time(water_level[1h])

# Stations with increasing water levels
increase(water_level[1h]) > 0
```
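
The same queries can be issued from Python against the Prometheus-compatible endpoint. A minimal sketch, assuming the `water_level` metric and `station_code` label used above:

```python
import requests

VM_URL = "http://localhost:8428"

resp = requests.get(f"{VM_URL}/api/v1/query", params={"query": "water_level"})
resp.raise_for_status()

# Standard Prometheus instant-query response: data.result is a list of series.
for series in resp.json()["data"]["result"]:
    station = series["metric"].get("station_code", "unknown")
    _ts, value = series["value"]
    print(f"{station}: {value} m")
```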

## Grafana Integration

### Data Source Configuration

1. **Add VictoriaMetrics as a Prometheus Data Source**:
   - URL: `http://localhost:8428` (or `http://victoriametrics:8428` in Docker)
   - Access: Server (default)
   - HTTP Method: POST

2. **Import Dashboard**:
   - Use the provided `water-monitoring-dashboard.json`
   - Or create custom dashboards with the queries above

### Dashboard Panels

The included dashboard provides:

- **Time Series**: Water levels and discharge over time
- **Table**: Current status of all stations
- **Pie Chart**: Discharge percentage distribution
- **Gauge**: Average discharge percentage
- **Variables**: Filter by station

## Monitoring and Maintenance

### Health Checks

```bash
# Check VictoriaMetrics health
curl http://localhost:8428/health

# Check metrics endpoint
curl http://localhost:8428/metrics

# Check configuration
curl http://localhost:8428/api/v1/status/config
```

### Performance Monitoring

```bash
# Query performance stats
curl http://localhost:8428/api/v1/status/tsdb

# Memory usage
curl http://localhost:8428/api/v1/status/runtime

# Active queries
curl http://localhost:8428/api/v1/status/active_queries
```

### Backup and Restore

```bash
# Create backup
docker exec victoriametrics /usr/bin/vmbackup \
  -storageDataPath=/victoria-metrics-data \
  -dst=fs:///backup/$(date +%Y%m%d)

# Restore from backup
docker exec victoriametrics /usr/bin/vmrestore \
  -src=fs:///backup/20250724 \
  -storageDataPath=/victoria-metrics-data
```

### Log Analysis

```bash
# View logs
docker logs victoriametrics

# Follow logs
docker logs -f victoriametrics

# Search for errors
docker logs victoriametrics 2>&1 | grep ERROR
```

## Troubleshooting

### Common Issues

1. **Connection Refused**:
   ```bash
   # Check if VictoriaMetrics is running
   docker ps | grep victoriametrics

   # Check port binding
   netstat -tlnp | grep 8428
   ```

2. **High Memory Usage**:
   ```bash
   # Reduce the memory limit
   docker run ... --memory.allowedPercent=60 ...
   ```

3. **Slow Queries**:
   ```bash
   # Increase the query timeout
   docker run ... --search.maxQueryDuration=120s ...
   ```

4. **Data Not Appearing**:
   ```bash
   # Check if data is being written
   curl "http://localhost:8428/api/v1/query?query=up"

   # Check water monitor logs
   tail -f water_monitor.log
   ```

### Performance Tuning

1. **For High Write Load**:
   ```bash
   --maxConcurrentInserts=64
   --insert.maxQueueDuration=60s
   ```

2. **For High Query Load**:
   ```bash
   --search.maxConcurrentRequests=32
   --search.maxQueryDuration=120s
   ```

3. **For Large Datasets**:
   ```bash
   --search.maxSeries=10000000
   --search.maxPointsPerTimeseries=1000000
   ```

## Security

### Authentication

VictoriaMetrics doesn't have built-in authentication. Use a reverse proxy:

```nginx
server {
    listen 80;
    server_name victoriametrics.example.com;

    auth_basic "VictoriaMetrics";
    auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        proxy_pass http://localhost:8428;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

### TLS/SSL

```bash
# Use nginx or traefik for TLS termination
# Or use VictoriaMetrics with TLS:
docker run ... \
  -v /path/to/cert.pem:/cert.pem \
  -v /path/to/key.pem:/key.pem \
  victoriametrics/victoria-metrics:latest \
  --tls \
  --tlsCertFile=/cert.pem \
  --tlsKeyFile=/key.pem
```

## Scaling

### Cluster Setup

For high availability and horizontal scaling:

```bash
# Start multiple VictoriaMetrics instances
docker run -d --name vm1 -p 8428:8428 victoriametrics/victoria-metrics:latest
docker run -d --name vm2 -p 8429:8428 victoriametrics/victoria-metrics:latest

# Use a load balancer to distribute queries
# Use vminsert/vmselect/vmstorage for true clustering
```

### Resource Requirements

| Data Points/Hour | RAM | CPU | Storage/Day |
|------------------|-----|-----|-------------|
| 1,000 | 100MB | 0.1 CPU | 10MB |
| 10,000 | 500MB | 0.5 CPU | 100MB |
| 100,000 | 2GB | 1 CPU | 1GB |
| 1,000,000 | 8GB | 2 CPU | 10GB |

## Migration

### From InfluxDB

```bash
# Export from InfluxDB
influx -database water_monitoring -execute "SELECT * FROM water_data" -format csv > data.csv

# Import to VictoriaMetrics (convert to Prometheus format first)
# Use the vmctl tool for migration
```

### From Prometheus

```bash
# Use vmctl for direct migration
vmctl prometheus --prom-snapshot=/path/to/prometheus/data --vm-addr=http://localhost:8428
```

This comprehensive setup guide should help you configure VictoriaMetrics for optimal performance with the Thailand Water Monitor system.
179
docs/references/NOTABLE_DOCUMENTS.md
Normal file
@@ -0,0 +1,179 @@
# Notable Documents and References

This document contains important references and external resources related to the Thailand Water Level Monitoring System.

## 🌊 **Official Thai Government Water Resources**

### **Royal Irrigation Department (RID) Resources**

#### **1. Water Level Monitoring Diagram**
- **URL**: https://water.rid.go.th/hyd/Diagram/graphic_ping.pdf
- **Description**: Official diagram showing the water level monitoring network structure
- **Content**: Technical diagrams and network topology for Thailand's water monitoring system
- **Language**: Thai
- **Format**: PDF
- **Usage**: Understanding the official monitoring infrastructure and station relationships

#### **2. Hourly Water Level Data Portal**
- **URL**: https://hyd-app-db.rid.go.th/hydro1h.html
- **Description**: Real-time hourly water level data web interface
- **Content**: Live data from all 16 monitoring stations across Thailand
- **Language**: Thai
- **Format**: Web Application
- **Usage**: Primary data source for the monitoring system
- **API Endpoint**: Used by our scraper to fetch real-time data
- **Update Frequency**: Hourly updates
- **Data Points**: ~240-384 measurements per hour across all stations

#### **3. Individual Station Data - P.76 Example**
- **URL**: https://www.hydro-1.net/Data/STATION/P.76.html
- **Description**: Detailed individual station data page for station P.76
- **Content**: Historical data, station details, and specific measurements
- **Language**: Thai/English
- **Format**: Web Page
- **Usage**: Reference for individual station characteristics and historical data patterns
- **Station**: P.76 - บ้านแม่อีไฮ (Banb Mae I Hai)

## 📊 **Data Sources and APIs**

### **Primary Data Source**
- **API Endpoint**: `https://hyd-app-db.rid.go.th/webservice/getGroupHourlyWaterLevelReportAllHL.ashx`
- **Method**: POST
- **Data Format**: JSON
- **Update Schedule**: Hourly (top of each hour)
- **Coverage**: All 16 monitoring stations
- **Metrics**: Water level (m), Discharge (cms), Discharge percentage (%)

### **Station Coverage**
The system monitors 16 stations across Thailand:
- P.1 - สะพานนวรัฐ (Nawarat Bridge)
- P.5 - สะพานท่านาง (Tha Nang Bridge)
- P.20 - บ้านเชียงดาว (Ban Chiang Dao)
- P.21 - บ้านริมใต้ (Ban Rim Tai)
- P.4A - บ้านแม่แตง (Ban Mae Taeng)
- P.67 - บ้านแม่แต (Ban Tae)
- P.75 - บ้านช่อแล (Ban Chai Lat)
- P.76 - บ้านแม่อีไฮ (Banb Mae I Hai)
- P.77 - บ้านสบแม่สะป๊วด (Baan Sop Mae Sapuord)
- P.81 - บ้านโป่ง (Ban Pong)
- P.82 - บ้านสบวิน (Ban Sob win)
- P.84 - บ้านพันตน (Ban Panton)
- P.85 - บ้านหล่ายแก้ว (Baan Lai Kaew)
- P.87 - บ้านป่าซาง (Ban Pa Sang)
- P.92 - บ้านเมืองกึ๊ด (Ban Muang Aut)
- P.103 - สะพานวงแหวนรอบ 3 (Ring Bridge 3)

## 🔗 **Related Resources**

### **Technical Documentation**
- **Thai Water Resources**: https://water.rid.go.th/
- **Hydro Information Network**: https://www.hydro-1.net/
- **Royal Irrigation Department**: https://www.rid.go.th/

### **Data Standards**
- **Time Format**: Thai Buddhist calendar (BE) + 24-hour format
- **Coordinate System**: WGS84 decimal degrees
- **Water Level Units**: Meters (m)
- **Discharge Units**: Cubic meters per second (cms)
- **Update Frequency**: Hourly at :00 minutes

### **API Parameters**
```javascript
{
  'DW[UtokID]': '1',
  'DW[BasinID]': '6',
  'DW[TimeCurrent]': 'DD/MM/YYYY',  // Thai Buddhist calendar
  '_search': 'false',
  'nd': timestamp_milliseconds,
  'rows': '100',
  'page': '1',
  'sidx': 'indexhourly',
  'sord': 'asc'
}
```
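
For reference, here is a small Python sketch of how these parameters could be assembled, using the UTC+7 offset and the +543 Buddhist-year conversion documented in this file. The function name and structure are illustrative, not the project's actual scraper code.

```python
import time
from datetime import datetime, timedelta, timezone

def build_params(now_utc: datetime) -> dict:
    # Thai time is UTC+7; Buddhist year = Gregorian year + 543.
    thai_now = now_utc.astimezone(timezone(timedelta(hours=7)))
    buddhist_date = f"{thai_now.day:02d}/{thai_now.month:02d}/{thai_now.year + 543}"
    return {
        "DW[UtokID]": "1",
        "DW[BasinID]": "6",
        "DW[TimeCurrent]": buddhist_date,    # DD/MM/YYYY in the Buddhist calendar
        "_search": "false",
        "nd": str(int(time.time() * 1000)),  # cache-busting timestamp in ms
        "rows": "100",
        "page": "1",
        "sidx": "indexhourly",
        "sord": "asc",
    }

print(build_params(datetime.now(timezone.utc))["DW[TimeCurrent]"])  # e.g. 24/07/2568
```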

## 📋 **Data Structure Reference**

### **JSON Response Format**
```json
{
  "rows": [
    {
      "hourlytime": "1.00",   // Hour (1-24, where 24 = midnight next day)
      "wlvalues1": "2.45",    // Water level for station 1 (meters)
      "qvalues1": "125.3",    // Discharge for station 1 (cms)
      "QPercent1": "45.2",    // Discharge percentage for station 1
      "wlvalues2": "1.89",    // Station 2 data...
      // ... continues for all 16 stations
    }
  ]
}
```

### **Station ID Mapping**
- Station 1 → P.20 (Ban Chiang Dao)
- Station 2 → P.75 (Ban Chai Lat)
- Station 3 → P.92 (Ban Muang Aut)
- Station 4 → P.4A (Ban Mae Taeng)
- Station 5 → P.67 (Ban Tae)
- Station 6 → P.21 (Ban Rim Tai)
- Station 7 → P.103 (Ring Bridge 3)
- Station 8 → P.1 (Nawarat Bridge)
- Station 9 → P.82 (Ban Sob win)
- Station 10 → P.84 (Ban Panton)
- Station 11 → P.81 (Ban Pong)
- Station 12 → P.5 (Tha Nang Bridge)
- Station 13 → P.77 (Baan Sop Mae Sapuord)
- Station 14 → P.87 (Ban Pa Sang)
- Station 15 → P.76 (Banb Mae I Hai)
- Station 16 → P.85 (Baan Lai Kaew)
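
A minimal parsing sketch tying the response format and station mapping together. The mapping is abbreviated, the function is illustrative rather than the actual scraper code, and the hour-24 handling follows the note above that 24 means midnight of the next day.

```python
from datetime import datetime, timedelta
from typing import List

STATION_MAPPING = {1: "P.20", 2: "P.75", 8: "P.1"}  # abbreviated; full mapping above

def parse_row(row: dict, base_date: datetime) -> List[dict]:
    """Convert one hourly API row into per-station measurements.

    base_date is midnight (00:00) of the report date in Thai time.
    """
    hour = int(float(row["hourlytime"]))
    if hour == 24:
        # Hour 24 means midnight of the *next* day.
        timestamp = base_date + timedelta(days=1)
    else:
        timestamp = base_date.replace(hour=hour)
    measurements = []
    for idx, code in STATION_MAPPING.items():
        level = row.get(f"wlvalues{idx}")
        if level is None:
            continue
        discharge = row.get(f"qvalues{idx}")
        measurements.append({
            "station_code": code,
            "timestamp": timestamp,
            "water_level": float(level),
            "discharge": float(discharge) if discharge is not None else None,
        })
    return measurements

row = {"hourlytime": "24.00", "wlvalues1": "2.45", "qvalues1": "125.3"}
print(parse_row(row, datetime(2025, 7, 24)))
```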

## 🌐 **Geolocation Reference**

### **Sample Coordinates (P.1 - Nawarat Bridge)**
- **Latitude**: 15.6944°N
- **Longitude**: 100.2028°E
- **Geohash**: w5q6uuhvfcfp25
- **Location**: Nakhon Sawan Province, Thailand

### **Coordinate System**
- **Datum**: WGS84
- **Format**: Decimal degrees
- **Precision**: 4 decimal places (~11m accuracy)
- **Usage**: Grafana geomap visualization

## 📝 **Usage Notes**

### **Data Collection**
- **Frequency**: Every 15 minutes (full check at :00, quick checks at :15, :30, :45)
- **Retention**: 2+ years of historical data
- **Gap Filling**: Automatic detection and filling of missing data
- **Data Updates**: Checks for changed values in recent data

### **Time Handling**
- **Thai Time**: UTC+7 (Asia/Bangkok)
- **Buddhist Calendar**: Thai year = Gregorian year + 543
- **Hour 24**: Represents midnight (00:00) of the next day
- **API Format**: DD/MM/YYYY (Buddhist calendar)

### **Data Quality**
- **Validation**: Automatic data validation and error detection
- **Retry Logic**: 15-minute retry intervals when data is unavailable
- **Error Handling**: Comprehensive error logging and recovery
- **Monitoring**: Health checks and alert conditions

## 🔍 **Research and Development**

### **Future Enhancements**
- **Additional Stations**: Potential expansion to more monitoring points
- **Real-time Alerts**: Threshold-based notification system
- **Predictive Analytics**: Water level forecasting capabilities
- **Mobile Integration**: Field data collection and verification

### **Technical Improvements**
- **API Optimization**: Enhanced data fetching efficiency
- **Database Performance**: Query optimization and indexing
- **Visualization**: Advanced Grafana dashboard features
- **Integration**: Connection with other water management systems

This document serves as a comprehensive reference for understanding the data sources, technical specifications, and official resources that support the Thailand Water Level Monitoring System.