grabowski af62cfef0b
Initial commit: Northern Thailand Ping River Monitor v3.1.0
Features:
- Real-time water level monitoring for Ping River Basin (16 stations)
- Coverage from Chiang Dao to Nakhon Sawan in Northern Thailand
- FastAPI web interface with interactive dashboard and station management
- Multi-database support (SQLite, MySQL, PostgreSQL, InfluxDB, VictoriaMetrics)
- Comprehensive monitoring with health checks and metrics collection
- Docker deployment with Grafana integration
- Production-ready architecture with enterprise-grade observability

 CI/CD & Automation:
- Complete Gitea Actions workflows for CI/CD, security, and releases
- Multi-Python version testing (3.9-3.12)
- Multi-architecture Docker builds (amd64, arm64)
- Daily security scanning and dependency monitoring
- Automated documentation generation
- Performance testing and validation

 Production Ready:
- Type safety with Pydantic models and comprehensive type hints
- Data validation layer with range checking and error handling
- Rate limiting and request tracking for API protection
- Enhanced logging with rotation, colors, and performance metrics
- Station management API for dynamic CRUD operations
- Comprehensive documentation and deployment guides

 Technical Stack:
- Python 3.9+ with FastAPI and Pydantic
- Multi-database architecture with adapter pattern
- Docker containerization with multi-stage builds
- Grafana dashboards for visualization
- Gitea Actions for CI/CD automation
- Enterprise monitoring and alerting

 Ready for deployment to B4L infrastructure!
2025-08-12 15:40:24 +07:00

🏗️ Project Structure - Northern Thailand Ping River Monitor

📁 Directory Layout

Northern-Thailand-Ping-River-Monitor/
├── 📁 src/                          # Main application source code
│   ├── __init__.py                  # Package initialization
│   ├── main.py                      # CLI entry point and main application
│   ├── water_scraper_v3.py          # Core data collection engine
│   ├── web_api.py                   # FastAPI web interface
│   ├── config.py                    # Configuration management
│   ├── database_adapters.py         # Database abstraction layer
│   ├── models.py                    # Data models and type definitions
│   ├── exceptions.py                # Custom exception classes
│   ├── validators.py                # Data validation layer
│   ├── metrics.py                   # Metrics collection system
│   ├── health_check.py              # Health monitoring system
│   ├── rate_limiter.py              # Rate limiting and request tracking
│   └── logging_config.py            # Enhanced logging configuration
├── 📁 docs/                         # Documentation files
│   ├── STATION_MANAGEMENT_GUIDE.md  # Station management documentation
│   ├── ENHANCEMENT_SUMMARY.md       # Feature enhancement summary
│   └── PROJECT_STRUCTURE.md         # This file
├── 📁 scripts/                      # Utility scripts
│   └── migrate_geolocation.py       # Database migration script
├── 📁 grafana/                      # Grafana configuration
│   ├── dashboards/                  # Dashboard definitions
│   └── provisioning/                # Grafana provisioning config
├── 📁 tests/                        # Test files
│   ├── test_integration.py          # Integration test suite
│   ├── test_station_management.py   # Station management tests
│   └── test_api.py                  # API endpoint tests
├── 📄 run.py                        # Simple startup script
├── 📄 requirements.txt              # Production dependencies
├── 📄 requirements-dev.txt          # Development dependencies
├── 📄 setup.py                      # Package installation script
├── 📄 Dockerfile                    # Docker container definition
├── 📄 docker-compose.victoriametrics.yml  # Complete stack deployment
├── 📄 Makefile                      # Common development tasks
├── 📄 .env.example                  # Environment configuration template
├── 📄 .gitignore                    # Git ignore patterns
├── 📄 .gitlab-ci.yml                # CI/CD pipeline configuration
├── 📄 LICENSE                       # MIT license
├── 📄 README.md                     # Main project documentation
└── 📄 CONTRIBUTING.md               # Contribution guidelines

🔧 Core Components

Application Layer

  • src/main.py - Command-line interface and application orchestration
  • src/web_api.py - FastAPI web interface with REST endpoints
  • src/water_scraper_v3.py - Core data collection and processing engine

Data Layer

  • src/database_adapters.py - Multi-database support (SQLite, MySQL, PostgreSQL, InfluxDB, VictoriaMetrics)
  • src/models.py - Pydantic data models and type definitions
  • src/validators.py - Data validation and sanitization

Infrastructure Layer

  • src/config.py - Configuration management with environment variable support
  • src/logging_config.py - Structured logging with rotation and colors
  • src/metrics.py - Application metrics collection (counters, gauges, histograms)
  • src/health_check.py - System health monitoring and status checks

Utility Layer

  • src/exceptions.py - Custom exception hierarchy
  • src/rate_limiter.py - API rate limiting and request tracking
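
To make the rate-limiting idea concrete, here is a minimal sliding-window limiter. The class name `SlidingWindowLimiter`, its parameters, and the choice of a sliding window are assumptions made for this sketch; `src/rate_limiter.py` may use a different algorithm.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds` (hypothetical)."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = {}

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits.setdefault(client_id, deque())
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Each client gets its own queue of request times, so one noisy client does not consume another client's quota.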

🌐 Web API Structure

Endpoints Organization

/                           # Dashboard homepage
├── /health                 # System health status
├── /metrics               # Application metrics
├── /config                # Configuration (masked)
├── /stations              # Station management
│   ├── GET /              # List all stations
│   ├── POST /             # Create new station
│   ├── GET /{id}          # Get specific station
│   ├── PUT /{id}          # Update station
│   └── DELETE /{id}       # Delete station
├── /measurements          # Data access
│   ├── /latest            # Latest measurements
│   └── /station/{code}    # Station-specific data
└── /scraping              # Data collection control
    ├── /trigger           # Manual data collection
    └── /status            # Scraping status

API Models

  • Request Models: Station creation/update, query parameters
  • Response Models: Station info, measurements, health status
  • Error Models: Standardized error responses

🗄️ Database Architecture

Supported Databases

  1. SQLite - Local development and testing
  2. MySQL - Traditional relational database
  3. PostgreSQL - Advanced relational database with TimescaleDB support
  4. InfluxDB - Purpose-built time-series database
  5. VictoriaMetrics - High-performance metrics storage
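
Supporting several backends behind one interface is usually done with an adapter pattern. The sketch below shows the general shape; `DatabaseAdapter`, `InMemoryAdapter`, `create_adapter`, and the `"memory"` type are hypothetical names for illustration and are not taken from `src/database_adapters.py`.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Type


class DatabaseAdapter(ABC):
    """Common interface that every backend would implement (hypothetical)."""

    @abstractmethod
    def save_measurements(self, rows: List[dict]) -> int:
        """Persist rows and return how many were stored."""

    @abstractmethod
    def latest(self, station_code: str) -> Optional[dict]:
        """Return the most recent row for a station, or None."""


class InMemoryAdapter(DatabaseAdapter):
    """Stand-in backend so the sketch runs without a database server."""

    def __init__(self) -> None:
        self._rows: List[dict] = []

    def save_measurements(self, rows: List[dict]) -> int:
        self._rows.extend(rows)
        return len(rows)

    def latest(self, station_code: str) -> Optional[dict]:
        matches = [r for r in self._rows if r["station_code"] == station_code]
        return max(matches, key=lambda r: r["timestamp"]) if matches else None


# Registry keyed by a DB_TYPE-style value; real backends would be registered here.
ADAPTERS: Dict[str, Type[DatabaseAdapter]] = {"memory": InMemoryAdapter}


def create_adapter(db_type: str) -> DatabaseAdapter:
    try:
        return ADAPTERS[db_type]()
    except KeyError:
        raise ValueError(f"Unsupported DB_TYPE: {db_type!r}") from None
```

Callers depend only on `DatabaseAdapter`, so switching backends becomes a configuration change rather than a code change.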

Schema Design

-- Stations table
CREATE TABLE stations (
    id INTEGER PRIMARY KEY,
    station_code VARCHAR(10) UNIQUE,
    thai_name VARCHAR(255),
    english_name VARCHAR(255),
    latitude DECIMAL(10,8),
    longitude DECIMAL(11,8),
    geohash VARCHAR(20),
    status VARCHAR(20),
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);

-- Measurements table
CREATE TABLE water_measurements (
    id BIGINT PRIMARY KEY,
    timestamp DATETIME,
    station_id INTEGER,
    water_level DECIMAL(10,3),
    discharge DECIMAL(10,2),
    discharge_percent DECIMAL(5,2),
    status VARCHAR(20),
    created_at TIMESTAMP,
    FOREIGN KEY (station_id) REFERENCES stations(id),
    UNIQUE(timestamp, station_id)
);
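
The `UNIQUE(timestamp, station_id)` constraint is what lets repeated scraping runs skip readings that are already stored. The sketch below demonstrates this against an in-memory SQLite database, using a reduced set of columns. `INSERT OR IGNORE` is SQLite-specific syntax, and the sample station code and values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE stations (
    id INTEGER PRIMARY KEY,
    station_code VARCHAR(10) UNIQUE
);
CREATE TABLE water_measurements (
    id INTEGER PRIMARY KEY,
    timestamp DATETIME,
    station_id INTEGER,
    water_level DECIMAL(10,3),
    FOREIGN KEY (station_id) REFERENCES stations(id),
    UNIQUE(timestamp, station_id)
);
""")

conn.execute("INSERT INTO stations (id, station_code) VALUES (1, 'S1')")

reading = ("2025-01-01 00:00:00", 1, 1.25)
for _ in range(2):  # the second insert duplicates the first and is skipped
    conn.execute(
        "INSERT OR IGNORE INTO water_measurements "
        "(timestamp, station_id, water_level) VALUES (?, ?, ?)",
        reading,
    )

count = conn.execute("SELECT COUNT(*) FROM water_measurements").fetchone()[0]
print(count)
```

Other backends express the same idea differently, for example `ON CONFLICT DO NOTHING` in PostgreSQL or `INSERT IGNORE` in MySQL.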

🐳 Docker Architecture

Multi-Stage Build

  1. Builder Stage - Compile dependencies and build artifacts
  2. Production Stage - Minimal runtime environment

Service Composition

  • ping-river-monitor - Data collection service
  • ping-river-api - Web API service
  • victoriametrics - Time-series database
  • grafana - Visualization dashboard

📊 Monitoring Architecture

Metrics Collection

  • Counters - API requests, database operations, scraping cycles
  • Gauges - Current values, connection status, resource usage
  • Histograms - Response times, processing durations
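
The three metric types differ only in how they treat new values. A minimal in-process sketch follows, assuming a hypothetical `Metrics` class; the actual `src/metrics.py` API may differ, and a production histogram would normally use buckets instead of keeping every sample.

```python
from collections import defaultdict
from typing import Dict, List


class Metrics:
    """Tiny in-process registry for counters, gauges, and histograms (hypothetical)."""

    def __init__(self) -> None:
        self.counters: Dict[str, int] = defaultdict(int)
        self.gauges: Dict[str, float] = {}
        self.histograms: Dict[str, List[float]] = defaultdict(list)

    def inc(self, name: str, amount: int = 1) -> None:
        self.counters[name] += amount  # counters accumulate

    def set_gauge(self, name: str, value: float) -> None:
        self.gauges[name] = value  # gauges keep only the latest value

    def observe(self, name: str, value: float) -> None:
        self.histograms[name].append(value)  # histograms record each sample

    def summary(self, name: str) -> Dict[str, float]:
        samples = self.histograms.get(name, [])
        if not samples:
            return {"count": 0}
        return {
            "count": len(samples),
            "min": min(samples),
            "max": max(samples),
            "avg": sum(samples) / len(samples),
        }
```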

Health Checks

  • Database Health - Connection status, data freshness
  • API Health - External API availability, response times
  • System Health - Memory usage, disk space, CPU load
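
One way to combine these checks into a single status is to run each one independently and aggregate the results. The function name `run_health_checks` and the `healthy`/`degraded` labels are assumptions for this sketch, not the output format of `src/health_check.py`.

```python
from typing import Callable, Dict

HealthCheck = Callable[[], bool]


def run_health_checks(checks: Dict[str, HealthCheck]) -> dict:
    """Run every check and report per-check results plus an overall status."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            results[name] = "healthy" if check() else "unhealthy"
        except Exception as exc:  # a crashing check is reported, not propagated
            results[name] = f"error: {exc}"
    all_ok = all(state == "healthy" for state in results.values())
    return {"status": "healthy" if all_ok else "degraded", "checks": results}
```

Catching exceptions per check means a single failing probe cannot take down the health endpoint itself.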

Logging Levels

  • DEBUG - Detailed execution information
  • INFO - General operational messages
  • WARNING - Potential issues and recoverable errors
  • ERROR - Serious problems requiring attention
  • CRITICAL - System-threatening issues

🔧 Configuration Management

Environment Variables

# Database
DB_TYPE=victoriametrics
VM_HOST=localhost
VM_PORT=8428

# Application
SCRAPING_INTERVAL_HOURS=1
LOG_LEVEL=INFO
DATA_RETENTION_DAYS=365

# Security
SECRET_KEY=your-secret-key
API_KEY=your-api-key

Configuration Hierarchy

  1. Environment variables (highest priority)
  2. .env file
  3. Default values in config.py (lowest priority)
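
The lookup order above can be sketched as follows. The helper names `parse_env_file` and `resolve`, and the values in `DEFAULTS`, are illustrative assumptions; they are not copied from `config.py`.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical defaults; the real ones live in config.py.
DEFAULTS: Dict[str, str] = {
    "DB_TYPE": "sqlite",
    "LOG_LEVEL": "INFO",
    "SCRAPING_INTERVAL_HOURS": "1",
}


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve(key: str, env_file_values: Mapping[str, str],
            environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Environment variable first, then the .env file, then the default."""
    environ = os.environ if environ is None else environ
    if key in environ:
        return environ[key]
    if key in env_file_values:
        return env_file_values[key]
    return DEFAULTS.get(key)
```

For example, with `DB_TYPE=victoriametrics` in the `.env` file and `DB_TYPE=postgresql` exported in the shell, `resolve("DB_TYPE", ...)` returns `postgresql`.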

🧪 Testing Architecture

Test Categories

  • Unit Tests - Individual component testing
  • Integration Tests - System component interaction
  • API Tests - Endpoint functionality and responses
  • Performance Tests - Load and stress testing

Test Data

  • Mock Data - Simulated API responses
  • Test Database - Isolated test environment
  • Fixtures - Reusable test data sets

📦 Deployment Architecture

Development

python run.py --web-api    # Local development server

Production

docker-compose -f docker-compose.victoriametrics.yml up -d       # Full stack deployment

CI/CD Pipeline

  1. Test Stage - Run all tests and quality checks
  2. Build Stage - Create Docker images
  3. Deploy Stage - Deploy to staging/production
  4. Health Check - Verify deployment success

🔒 Security Architecture

Input Validation

  • Pydantic models for API requests
  • Data range validation for measurements
  • SQL injection prevention through ORM
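
As a sketch of range validation for measurements, the function below rejects readings that fall outside configured bounds. The limits are placeholders chosen for illustration, and `validate_measurement` is a hypothetical name; the real bounds and API belong to `src/validators.py`.

```python
from typing import List, Optional, Tuple

# Placeholder limits for illustration only; real bounds are station-specific.
WATER_LEVEL_RANGE: Tuple[float, float] = (-10.0, 50.0)
DISCHARGE_RANGE: Tuple[float, float] = (0.0, 100_000.0)


def _in_range(value: float, bounds: Tuple[float, float]) -> bool:
    low, high = bounds
    # NaN fails both comparisons, so it is treated as out of range.
    return low <= value <= high


def validate_measurement(water_level: float,
                         discharge: Optional[float] = None) -> List[str]:
    """Return a list of problems; an empty list means the reading is accepted."""
    errors: List[str] = []
    if not _in_range(water_level, WATER_LEVEL_RANGE):
        errors.append(f"water_level {water_level} outside {WATER_LEVEL_RANGE}")
    if discharge is not None and not _in_range(discharge, DISCHARGE_RANGE):
        errors.append(f"discharge {discharge} outside {DISCHARGE_RANGE}")
    return errors
```

Returning a list of errors instead of raising on the first one lets the caller log every problem with a reading at once.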

Authentication (Future)

  • API key authentication
  • JWT token support
  • Role-based access control

Data Protection

  • Environment variable configuration
  • Sensitive data masking in logs
  • HTTPS support for production
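
Masking sensitive values in logs can be done with a standard `logging.Filter`. In this sketch the filter name `MaskSecretsFilter` and the pattern (which covers only the `SECRET_KEY` and `API_KEY` variables named earlier) are assumptions; the project's own masking rules may be broader.

```python
import logging
import re

# Matches KEY=value pairs for the two secrets named in the environment section.
SECRET_PATTERN = re.compile(r"(SECRET_KEY|API_KEY)=\S+")


class MaskSecretsFilter(logging.Filter):
    """Replace secret values with *** before the record is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message first so secrets passed as %-args are masked too.
        message = record.getMessage()
        record.msg = SECRET_PATTERN.sub(r"\1=***", message)
        record.args = ()
        return True  # never drop the record, only rewrite it


handler = logging.StreamHandler()
handler.addFilter(MaskSecretsFilter())
```

Attaching the filter to the handler, as shown, applies it to every record that handler emits, regardless of which logger produced it.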

📈 Performance Architecture

Optimization Strategies

  • Database connection pooling
  • Query optimization and indexing
  • Response caching for static data
  • Async processing for I/O operations
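
Async processing pays off when many stations are polled over the network. The sketch below fetches several stations concurrently with a cap on parallelism. `fetch_station` is a stand-in for a real HTTP call, and the station codes and `max_concurrent` value are made up for illustration.

```python
import asyncio
from typing import Dict, List


async def fetch_station(code: str) -> Dict[str, object]:
    # Placeholder for a network request; sleep(0) just yields to the event loop.
    await asyncio.sleep(0)
    return {"station_code": code, "ok": True}


async def fetch_all(codes: List[str], max_concurrent: int = 4) -> List[Dict[str, object]]:
    """Fetch all stations concurrently, at most `max_concurrent` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(code: str) -> Dict[str, object]:
        async with semaphore:
            return await fetch_station(code)

    # gather returns results in the same order as the input codes.
    return await asyncio.gather(*(bounded(code) for code in codes))


if __name__ == "__main__":
    print(asyncio.run(fetch_all(["S1", "S2", "S3"])))
```

The semaphore keeps the scraper from opening too many connections to the upstream service at once.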

Scalability Considerations

  • Horizontal scaling with load balancers
  • Database read replicas
  • Microservice architecture readiness
  • Container orchestration support

🔄 Data Flow Architecture

Collection Flow

External API → Rate Limiter → Data Validator → Database Adapter → Database

API Flow

HTTP Request → FastAPI → Business Logic → Database Adapter → HTTP Response

Monitoring Flow

Application Events → Metrics Collector → Health Checks → Monitoring Dashboard

Together, the layered source tree, the database adapters, and the metrics and health-check modules are intended to keep the monitoring system maintainable, scalable, and observable in production.