# ๐Ÿ—๏ธ Project Structure - Northern Thailand Ping River Monitor ## ๐Ÿ“ Directory Layout ``` Northern-Thailand-Ping-River-Monitor/ โ”œโ”€โ”€ ๐Ÿ“ src/ # Main application source code โ”‚ โ”œโ”€โ”€ __init__.py # Package initialization โ”‚ โ”œโ”€โ”€ main.py # CLI entry point and main application โ”‚ โ”œโ”€โ”€ water_scraper_v3.py # Core data collection engine โ”‚ โ”œโ”€โ”€ web_api.py # FastAPI web interface โ”‚ โ”œโ”€โ”€ config.py # Configuration management โ”‚ โ”œโ”€โ”€ database_adapters.py # Database abstraction layer โ”‚ โ”œโ”€โ”€ models.py # Data models and type definitions โ”‚ โ”œโ”€โ”€ exceptions.py # Custom exception classes โ”‚ โ”œโ”€โ”€ validators.py # Data validation layer โ”‚ โ”œโ”€โ”€ metrics.py # Metrics collection system โ”‚ โ”œโ”€โ”€ health_check.py # Health monitoring system โ”‚ โ”œโ”€โ”€ rate_limiter.py # Rate limiting and request tracking โ”‚ โ””โ”€โ”€ logging_config.py # Enhanced logging configuration โ”œโ”€โ”€ ๐Ÿ“ docs/ # Documentation files โ”‚ โ”œโ”€โ”€ STATION_MANAGEMENT_GUIDE.md # Station management documentation โ”‚ โ”œโ”€โ”€ ENHANCEMENT_SUMMARY.md # Feature enhancement summary โ”‚ โ””โ”€โ”€ PROJECT_STRUCTURE.md # This file โ”œโ”€โ”€ ๐Ÿ“ scripts/ # Utility scripts โ”‚ โ””โ”€โ”€ migrate_geolocation.py # Database migration script โ”œโ”€โ”€ ๐Ÿ“ grafana/ # Grafana configuration โ”‚ โ”œโ”€โ”€ dashboards/ # Dashboard definitions โ”‚ โ””โ”€โ”€ provisioning/ # Grafana provisioning config โ”œโ”€โ”€ ๐Ÿ“ tests/ # Test files โ”‚ โ”œโ”€โ”€ test_integration.py # Integration test suite โ”‚ โ”œโ”€โ”€ test_station_management.py # Station management tests โ”‚ โ””โ”€โ”€ test_api.py # API endpoint tests โ”œโ”€โ”€ ๐Ÿ“„ run.py # Simple startup script โ”œโ”€โ”€ ๐Ÿ“„ requirements.txt # Production dependencies โ”œโ”€โ”€ ๐Ÿ“„ requirements-dev.txt # Development dependencies โ”œโ”€โ”€ ๐Ÿ“„ setup.py # Package installation script โ”œโ”€โ”€ ๐Ÿ“„ Dockerfile # Docker container definition โ”œโ”€โ”€ ๐Ÿ“„ docker-compose.victoriametrics.yml # Complete stack deployment 
├── 📄 Makefile                            # Common development tasks
├── 📄 .env.example                        # Environment configuration template
├── 📄 .gitignore                          # Git ignore patterns
├── 📄 .gitlab-ci.yml                      # CI/CD pipeline configuration
├── 📄 LICENSE                             # MIT license
├── 📄 README.md                           # Main project documentation
└── 📄 CONTRIBUTING.md                     # Contribution guidelines
```

## 🔧 Core Components

### **Application Layer**
- **`src/main.py`** - Command-line interface and application orchestration
- **`src/web_api.py`** - FastAPI web interface with REST endpoints
- **`src/water_scraper_v3.py`** - Core data collection and processing engine

### **Data Layer**
- **`src/database_adapters.py`** - Multi-database support (SQLite, MySQL, PostgreSQL, InfluxDB, VictoriaMetrics)
- **`src/models.py`** - Pydantic data models and type definitions
- **`src/validators.py`** - Data validation and sanitization

### **Infrastructure Layer**
- **`src/config.py`** - Configuration management with environment variable support
- **`src/logging_config.py`** - Structured logging with rotation and colors
- **`src/metrics.py`** - Application metrics collection (counters, gauges, histograms)
- **`src/health_check.py`** - System health monitoring and status checks

### **Utility Layer**
- **`src/exceptions.py`** - Custom exception hierarchy
- **`src/rate_limiter.py`** - API rate limiting and request tracking

## 🌐 Web API Structure

### **Endpoints Organization**
```
/                          # Dashboard homepage
├── /health                # System health status
├── /metrics               # Application metrics
├── /config                # Configuration (masked)
├── /stations              # Station management
│   ├── GET /              # List all stations
│   ├── POST /             # Create new station
│   ├── GET /{id}          # Get specific station
│   ├── PUT /{id}          # Update station
│   └── DELETE /{id}       # Delete station
├── /measurements          # Data access
│   ├── /latest            # Latest measurements
│   └── /station/{code}    # Station-specific data
└── /scraping              # Data collection control
    ├── /trigger           # Manual data collection
    └── /status            # Scraping status
```

### **API Models**
- **Request Models**: Station creation/update, query parameters
- **Response Models**: Station info, measurements, health status
- **Error Models**: Standardized error responses

## 🗄️ Database Architecture

### **Supported Databases**
1. **SQLite** - Local development and testing
2. **MySQL** - Traditional relational database
3. **PostgreSQL** - Advanced relational with TimescaleDB support
4. **InfluxDB** - Purpose-built time-series database
5. **VictoriaMetrics** - High-performance metrics storage

### **Schema Design**
```sql
-- Stations table
stations (
    id INTEGER PRIMARY KEY,
    station_code VARCHAR(10) UNIQUE,
    thai_name VARCHAR(255),
    english_name VARCHAR(255),
    latitude DECIMAL(10,8),
    longitude DECIMAL(11,8),
    geohash VARCHAR(20),
    status VARCHAR(20),
    created_at TIMESTAMP,
    updated_at TIMESTAMP
)

-- Measurements table
water_measurements (
    id BIGINT PRIMARY KEY,
    timestamp DATETIME,
    station_id INTEGER,
    water_level DECIMAL(10,3),
    discharge DECIMAL(10,2),
    discharge_percent DECIMAL(5,2),
    status VARCHAR(20),
    created_at TIMESTAMP,
    FOREIGN KEY (station_id) REFERENCES stations(id),
    UNIQUE(timestamp, station_id)
)
```

## 🐳 Docker Architecture

### **Multi-Stage Build**
1. **Builder Stage** - Compile dependencies and build artifacts
2. **Production Stage** - Minimal runtime environment

### **Service Composition**
- **ping-river-monitor** - Data collection service
- **ping-river-api** - Web API service
- **victoriametrics** - Time-series database
- **grafana** - Visualization dashboard

## 📊 Monitoring Architecture

### **Metrics Collection**
- **Counters** - API requests, database operations, scraping cycles
- **Gauges** - Current values, connection status, resource usage
- **Histograms** - Response times, processing durations

### **Health Checks**
- **Database Health** - Connection status, data freshness
- **API Health** - External API availability, response times
- **System Health** - Memory usage, disk space, CPU load

### **Logging Levels**
- **DEBUG** - Detailed execution information
- **INFO** - General operational messages
- **WARNING** - Potential issues and recoverable errors
- **ERROR** - Serious problems requiring attention
- **CRITICAL** - System-threatening issues

## 🔧 Configuration Management

### **Environment Variables**
```bash
# Database
DB_TYPE=victoriametrics
VM_HOST=localhost
VM_PORT=8428

# Application
SCRAPING_INTERVAL_HOURS=1
LOG_LEVEL=INFO
DATA_RETENTION_DAYS=365

# Security
SECRET_KEY=your-secret-key
API_KEY=your-api-key
```

### **Configuration Hierarchy**
1. Environment variables (highest priority)
2. .env file
3. Default values in config.py (lowest priority)

## 🧪 Testing Architecture

### **Test Categories**
- **Unit Tests** - Individual component testing
- **Integration Tests** - System component interaction
- **API Tests** - Endpoint functionality and responses
- **Performance Tests** - Load and stress testing

### **Test Data**
- **Mock Data** - Simulated API responses
- **Test Database** - Isolated test environment
- **Fixtures** - Reusable test data sets

## 📦 Deployment Architecture

### **Development**
```bash
python run.py --web-api  # Local development server
```

### **Production**
```bash
docker-compose up -d  # Full stack deployment
```

### **CI/CD Pipeline**
1. **Test Stage** - Run all tests and quality checks
2. **Build Stage** - Create Docker images
3. **Deploy Stage** - Deploy to staging/production
4. **Health Check** - Verify deployment success

## 🔒 Security Architecture

### **Input Validation**
- Pydantic models for API requests
- Data range validation for measurements
- SQL injection prevention through ORM

### **Authentication** (Future)
- API key authentication
- JWT token support
- Role-based access control

### **Data Protection**
- Environment variable configuration
- Sensitive data masking in logs
- HTTPS support for production

## 📈 Performance Architecture

### **Optimization Strategies**
- Database connection pooling
- Query optimization and indexing
- Response caching for static data
- Async processing for I/O operations

### **Scalability Considerations**
- Horizontal scaling with load balancers
- Database read replicas
- Microservice architecture readiness
- Container orchestration support

## 🔄 Data Flow Architecture

### **Collection Flow**
```
External API → Rate Limiter → Data Validator → Database Adapter → Database
```

### **API Flow**
```
HTTP Request → FastAPI → Business Logic → Database Adapter → HTTP Response
```

### **Monitoring Flow**
```
Application Events → Metrics Collector → Health Checks → Monitoring Dashboard
```

This architecture provides a solid foundation for a production-ready water monitoring system with excellent maintainability, scalability, and observability.
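
As an illustration of the Configuration Hierarchy listed earlier (environment variables over the `.env` file over defaults), the following is a minimal sketch using only the Python standard library. It is not the code in `src/config.py`, which may well use a different mechanism. The names `DEFAULTS`, `parse_env_file`, and `load_settings`, and the default values themselves, are hypothetical and chosen only for this example.

```python
import os
from pathlib import Path

# Hypothetical defaults for illustration; the real ones live in src/config.py.
DEFAULTS = {
    "DB_TYPE": "sqlite",
    "SCRAPING_INTERVAL_HOURS": "1",
    "LOG_LEVEL": "INFO",
}


def parse_env_file(path):
    """Read KEY=VALUE pairs from a .env-style file, skipping blanks and comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values  # A missing .env file simply contributes nothing.
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(env_file=".env", environ=None):
    """Merge settings so that later sources override earlier ones.

    Order applied: defaults (lowest), then the .env file, then process
    environment variables (highest). Only keys already known from the
    defaults or the .env file are picked up from the environment.
    """
    environ = os.environ if environ is None else environ
    settings = dict(DEFAULTS)
    settings.update(parse_env_file(env_file))
    for key in list(settings):
        if key in environ:
            settings[key] = environ[key]
    return settings


if __name__ == "__main__":
    for name, value in sorted(load_settings().items()):
        print(f"{name}={value}")
```

With this ordering, a value such as `LOG_LEVEL` exported in the shell wins over the same key in `.env`, which in turn wins over the built-in default. All values stay strings here; converting `SCRAPING_INTERVAL_HOURS` to an integer would be a separate validation step.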