# 🏗️ Project Structure - Northern Thailand Ping River Monitor

## 📁 Directory Layout

```
Northern-Thailand-Ping-River-Monitor/
├── 📁 src/                                # Main application source code
│   ├── __init__.py                        # Package initialization
│   ├── main.py                            # CLI entry point and main application
│   ├── water_scraper_v3.py                # Core data collection engine
│   ├── web_api.py                         # FastAPI web interface
│   ├── config.py                          # Configuration management
│   ├── database_adapters.py               # Database abstraction layer
│   ├── models.py                          # Data models and type definitions
│   ├── exceptions.py                      # Custom exception classes
│   ├── validators.py                      # Data validation layer
│   ├── metrics.py                         # Metrics collection system
│   ├── health_check.py                    # Health monitoring system
│   ├── rate_limiter.py                    # Rate limiting and request tracking
│   └── logging_config.py                  # Enhanced logging configuration
├── 📁 docs/                               # Documentation files
│   ├── STATION_MANAGEMENT_GUIDE.md        # Station management documentation
│   ├── ENHANCEMENT_SUMMARY.md             # Feature enhancement summary
│   └── PROJECT_STRUCTURE.md               # This file
├── 📁 scripts/                            # Utility scripts
│   └── migrate_geolocation.py             # Database migration script
├── 📁 grafana/                            # Grafana configuration
│   ├── dashboards/                        # Dashboard definitions
│   └── provisioning/                      # Grafana provisioning config
├── 📁 tests/                              # Test files
│   ├── test_integration.py                # Integration test suite
│   ├── test_station_management.py         # Station management tests
│   └── test_api.py                        # API endpoint tests
├── 📄 run.py                              # Simple startup script
├── 📄 requirements.txt                    # Production dependencies
├── 📄 requirements-dev.txt                # Development dependencies
├── 📄 setup.py                            # Package installation script
├── 📄 Dockerfile                          # Docker container definition
├── 📄 docker-compose.victoriametrics.yml  # Complete stack deployment
├── 📄 Makefile                            # Common development tasks
├── 📄 .env.example                        # Environment configuration template
├── 📄 .gitignore                          # Git ignore patterns
├── 📄 .gitlab-ci.yml                      # CI/CD pipeline configuration
├── 📄 LICENSE                             # MIT license
├── 📄 README.md                           # Main project documentation
└── 📄 CONTRIBUTING.md                     # Contribution guidelines
```

## 🔧 Core Components

### **Application Layer**
- **`src/main.py`** - Command-line interface and application orchestration
- **`src/web_api.py`** - FastAPI web interface with REST endpoints
- **`src/water_scraper_v3.py`** - Core data collection and processing engine

### **Data Layer**
- **`src/database_adapters.py`** - Multi-database support (SQLite, MySQL, PostgreSQL, InfluxDB, VictoriaMetrics)
- **`src/models.py`** - Pydantic data models and type definitions
- **`src/validators.py`** - Data validation and sanitization

### **Infrastructure Layer**
- **`src/config.py`** - Configuration management with environment variable support
- **`src/logging_config.py`** - Structured logging with rotation and colors
- **`src/metrics.py`** - Application metrics collection (counters, gauges, histograms)
- **`src/health_check.py`** - System health monitoring and status checks

### **Utility Layer**
- **`src/exceptions.py`** - Custom exception hierarchy
- **`src/rate_limiter.py`** - API rate limiting and request tracking
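
To make the rate limiting and request tracking role more concrete, here is a minimal sliding-window sketch. It is not the contents of `src/rate_limiter.py`; the class name `SlidingWindowRateLimiter`, its parameters, and the per-client keying are assumptions made for illustration, and the real module may use a different algorithm.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowRateLimiter:
    """Hypothetical limiter: at most `max_requests` per client per window."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record the request and return True if it fits in the window."""
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True

    def request_count(self, client_id: str) -> int:
        """Number of accepted requests currently tracked for a client."""
        return len(self._hits[client_id])
```

Passing the clock in as a parameter keeps the limiter deterministic under test; in production the default `time.monotonic` avoids problems when the wall clock is adjusted.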

## 🌐 Web API Structure

### **Endpoints Organization**
```
/                           # Dashboard homepage
├── /health                 # System health status
├── /metrics                # Application metrics
├── /config                 # Configuration (masked)
├── /stations               # Station management
│   ├── GET /               # List all stations
│   ├── POST /              # Create new station
│   ├── GET /{id}           # Get specific station
│   ├── PUT /{id}           # Update station
│   └── DELETE /{id}        # Delete station
├── /measurements           # Data access
│   ├── /latest             # Latest measurements
│   └── /station/{code}     # Station-specific data
└── /scraping               # Data collection control
    ├── /trigger            # Manual data collection
    └── /status             # Scraping status
```

### **API Models**
- **Request Models**: Station creation/update, query parameters
- **Response Models**: Station info, measurements, health status
- **Error Models**: Standardized error responses

## 🗄️ Database Architecture

### **Supported Databases**
1. **SQLite** - Local development and testing
2. **MySQL** - Traditional relational database
3. **PostgreSQL** - Advanced relational database with TimescaleDB support
4. **InfluxDB** - Purpose-built time-series database
5. **VictoriaMetrics** - High-performance metrics storage
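
Supporting several backends behind one interface is usually done with an adapter pattern. The sketch below shows the general shape with a single SQLite adapter. The names `DatabaseAdapter`, `SQLiteAdapter`, `create_adapter`, and the method signatures are hypothetical and are not taken from `src/database_adapters.py`; the table is simplified and the station code `ST01` is made-up sample data.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, List


class DatabaseAdapter(ABC):
    """Hypothetical interface that every backend would implement."""

    @abstractmethod
    def connect(self) -> None: ...

    @abstractmethod
    def save_measurements(self, rows: List[Dict]) -> int: ...

    @abstractmethod
    def get_latest(self, limit: int = 10) -> List[Dict]: ...


class SQLiteAdapter(DatabaseAdapter):
    def __init__(self, path: str = ":memory:") -> None:
        self.path = path
        self.conn = None

    def connect(self) -> None:
        self.conn = sqlite3.connect(self.path)
        self.conn.row_factory = sqlite3.Row
        # Simplified table for the sketch, not the project's full schema.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS water_measurements ("
            "timestamp TEXT, station_code TEXT, water_level REAL, "
            "UNIQUE(timestamp, station_code))"
        )

    def save_measurements(self, rows: List[Dict]) -> int:
        """Insert rows, skipping duplicates; return how many were stored."""
        before = self.conn.total_changes
        self.conn.executemany(
            "INSERT OR IGNORE INTO water_measurements "
            "VALUES (:timestamp, :station_code, :water_level)",
            rows,
        )
        self.conn.commit()
        return self.conn.total_changes - before

    def get_latest(self, limit: int = 10) -> List[Dict]:
        cur = self.conn.execute(
            "SELECT timestamp, station_code, water_level "
            "FROM water_measurements ORDER BY timestamp DESC LIMIT ?",
            (limit,),
        )
        return [dict(row) for row in cur.fetchall()]


# Registry keyed by a DB_TYPE-style string; other backends would be added here.
ADAPTERS = {"sqlite": SQLiteAdapter}


def create_adapter(db_type: str, **kwargs) -> DatabaseAdapter:
    try:
        adapter_cls = ADAPTERS[db_type.lower()]
    except KeyError:
        raise ValueError(f"Unsupported DB_TYPE: {db_type!r}") from None
    return adapter_cls(**kwargs)
```

Callers depend only on `DatabaseAdapter`, so switching backends would come down to changing the string passed to `create_adapter`.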

### **Schema Design**
```sql
-- Stations table
CREATE TABLE stations (
    id INTEGER PRIMARY KEY,
    station_code VARCHAR(10) UNIQUE,
    thai_name VARCHAR(255),
    english_name VARCHAR(255),
    latitude DECIMAL(10,8),
    longitude DECIMAL(11,8),
    geohash VARCHAR(20),
    status VARCHAR(20),
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);

-- Measurements table (one row per station per timestamp)
CREATE TABLE water_measurements (
    id BIGINT PRIMARY KEY,
    timestamp DATETIME,
    station_id INTEGER,
    water_level DECIMAL(10,3),
    discharge DECIMAL(10,2),
    discharge_percent DECIMAL(5,2),
    status VARCHAR(20),
    created_at TIMESTAMP,
    FOREIGN KEY (station_id) REFERENCES stations(id),
    UNIQUE(timestamp, station_id)
);
```
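
The two constraints on `water_measurements` can be exercised against an in-memory SQLite database. This is a sketch with a trimmed-down copy of the schema (fewer columns, and `INTEGER PRIMARY KEY` so SQLite assigns ids automatically); the helper names `open_db` and `try_insert` and the sample values are assumptions for illustration only.

```python
import sqlite3

# Trimmed-down copy of the schema above, kept small for the example.
SCHEMA = """
CREATE TABLE stations (
    id INTEGER PRIMARY KEY,
    station_code VARCHAR(10) UNIQUE,
    english_name VARCHAR(255)
);
CREATE TABLE water_measurements (
    id INTEGER PRIMARY KEY,
    timestamp DATETIME,
    station_id INTEGER,
    water_level DECIMAL(10,3),
    FOREIGN KEY (station_id) REFERENCES stations(id),
    UNIQUE(timestamp, station_id)
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn: sqlite3.Connection, timestamp: str,
               station_id: int, water_level: float) -> bool:
    """Insert one measurement; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO water_measurements "
                "(timestamp, station_id, water_level) VALUES (?, ?, ?)",
                (timestamp, station_id, water_level),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A second insert with the same `(timestamp, station_id)` pair is rejected by the `UNIQUE` constraint, and an insert that references a missing station is rejected by the foreign key.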

## 🐳 Docker Architecture

### **Multi-Stage Build**
1. **Builder Stage** - Compile dependencies and build artifacts
2. **Production Stage** - Minimal runtime environment

### **Service Composition**
- **ping-river-monitor** - Data collection service
- **ping-river-api** - Web API service
- **victoriametrics** - Time-series database
- **grafana** - Visualization dashboard

## 📊 Monitoring Architecture

### **Metrics Collection**
- **Counters** - API requests, database operations, scraping cycles
- **Gauges** - Current values, connection status, resource usage
- **Histograms** - Response times, processing durations
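
The three metric types differ in how they accumulate values, which a small in-memory registry can illustrate. This is a sketch under assumptions: the class `MetricsRegistry`, its method names, and the bucket boundaries are hypothetical and do not describe `src/metrics.py`.

```python
from bisect import bisect_left
from collections import defaultdict
from typing import Dict, List, Sequence


class MetricsRegistry:
    """Hypothetical in-memory store for counters, gauges, and histograms."""

    def __init__(self, buckets: Sequence[float] = (0.1, 0.5, 1.0, 5.0)) -> None:
        self.counters: Dict[str, int] = defaultdict(int)
        self.gauges: Dict[str, float] = {}
        self.histograms: Dict[str, List[int]] = {}
        self._buckets = tuple(sorted(buckets))  # upper bounds, in seconds

    def increment(self, name: str, amount: int = 1) -> None:
        """Counters only ever go up (e.g. API requests served)."""
        self.counters[name] += amount

    def set_gauge(self, name: str, value: float) -> None:
        """Gauges hold the latest value (e.g. connection status)."""
        self.gauges[name] = value

    def observe(self, name: str, value: float) -> None:
        """Histograms count observations per bucket (e.g. response times)."""
        counts = self.histograms.setdefault(name, [0] * (len(self._buckets) + 1))
        # Index of the first bucket whose upper bound is >= value;
        # the extra last slot collects values above every bound.
        counts[bisect_left(self._buckets, value)] += 1
```

The buckets here are non-cumulative, which keeps the example short; Prometheus-style exporters typically expose cumulative buckets instead.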

### **Health Checks**
- **Database Health** - Connection status, data freshness
- **API Health** - External API availability, response times
- **System Health** - Memory usage, disk space, CPU load
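
One way to combine such checks into a single status is to run each check independently and report the worst result. The sketch below assumes three status strings, a two-hour freshness threshold, and the function names `check_data_freshness` and `run_health_checks`; none of these come from `src/health_check.py`.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict

# Assumed status vocabulary, ordered from best to worst.
SEVERITY = {"healthy": 0, "degraded": 1, "unhealthy": 2}


def check_data_freshness(
    last_measurement: datetime,
    now: datetime,
    max_age: timedelta = timedelta(hours=2),  # assumed threshold
) -> str:
    """Classify how stale the newest stored measurement is."""
    age = now - last_measurement
    if age <= max_age:
        return "healthy"
    if age <= 2 * max_age:
        return "degraded"
    return "unhealthy"


def run_health_checks(checks: Dict[str, Callable[[], str]]) -> Dict:
    """Run every check; a check that raises is reported as unhealthy."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            results[name] = check()
        except Exception:
            results[name] = "unhealthy"
    # The overall status is the worst individual status.
    overall = max(results.values(), key=SEVERITY.__getitem__, default="healthy")
    return {"status": overall, "checks": results}
```

Catching exceptions per check means one failing dependency cannot prevent the others from being reported.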

### **Logging Levels**
- **DEBUG** - Detailed execution information
- **INFO** - General operational messages
- **WARNING** - Potential issues and recoverable errors
- **ERROR** - Serious problems requiring attention
- **CRITICAL** - System-threatening issues

## 🔧 Configuration Management

### **Environment Variables**
```bash
# Database
DB_TYPE=victoriametrics
VM_HOST=localhost
VM_PORT=8428

# Application
SCRAPING_INTERVAL_HOURS=1
LOG_LEVEL=INFO
DATA_RETENTION_DAYS=365

# Security
SECRET_KEY=your-secret-key
API_KEY=your-api-key
```

### **Configuration Hierarchy**
1. Environment variables (highest priority)
2. .env file
3. Default values in config.py (lowest priority)
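
The priority order above can be sketched as three layers applied from lowest to highest priority. The function names `parse_dotenv` and `load_config` and the values in `DEFAULTS` are assumptions for illustration; the actual defaults live in `config.py` and may differ.

```python
import os
from typing import Dict, Mapping, Optional

# Assumed defaults for the sketch; not the real values from config.py.
DEFAULTS = {
    "DB_TYPE": "sqlite",
    "LOG_LEVEL": "INFO",
    "SCRAPING_INTERVAL_HOURS": "1",
}


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_config(
    dotenv_text: str = "",
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Merge defaults, then .env values, then environment variables."""
    environ = os.environ if environ is None else environ
    config = dict(DEFAULTS)                    # 3. lowest priority
    config.update(parse_dotenv(dotenv_text))   # 2. .env file
    # 1. Environment variables win, but only for keys the app knows about.
    config.update({key: environ[key] for key in config if key in environ})
    return config
```

For example, with `LOG_LEVEL=DEBUG` in the `.env` text and `LOG_LEVEL=WARNING` in the environment, `load_config` returns `WARNING` for that key.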

## 🧪 Testing Architecture

### **Test Categories**
- **Unit Tests** - Individual component testing
- **Integration Tests** - System component interaction
- **API Tests** - Endpoint functionality and responses
- **Performance Tests** - Load and stress testing

### **Test Data**
- **Mock Data** - Simulated API responses
- **Test Database** - Isolated test environment
- **Fixtures** - Reusable test data sets

## 📦 Deployment Architecture

### **Development**
```bash
python run.py --web-api  # Local development server
```

### **Production**
```bash
# The repository's compose file has a non-default name, so pass it with -f
docker-compose -f docker-compose.victoriametrics.yml up -d  # Full stack deployment
```

### **CI/CD Pipeline**
1. **Test Stage** - Run all tests and quality checks
2. **Build Stage** - Create Docker images
3. **Deploy Stage** - Deploy to staging/production
4. **Health Check** - Verify deployment success

## 🔒 Security Architecture

### **Input Validation**
- Pydantic models for API requests
- Data range validation for measurements
- SQL injection prevention through ORM
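
Range validation for measurements can be sketched as a function that returns a list of problems instead of raising on the first one. The project uses Pydantic models for this; the example below uses a standard-library dataclass as a stand-in so it runs without extra dependencies. The `Measurement` fields, the bounds in `WATER_LEVEL_RANGE_M`, and the function name `validate_measurement` are assumptions, not values from `src/validators.py`.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed plausibility bounds in metres; the real limits may differ.
WATER_LEVEL_RANGE_M = (-10.0, 30.0)


@dataclass(frozen=True)
class Measurement:
    station_code: str
    water_level: float
    discharge: Optional[float] = None


def validate_measurement(m: Measurement) -> List[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors: List[str] = []
    if not m.station_code.strip():
        errors.append("station_code is empty")
    low, high = WATER_LEVEL_RANGE_M
    # NaN fails both comparisons, so it is rejected as out of range too.
    if not (low <= m.water_level <= high):
        errors.append(f"water_level {m.water_level} outside [{low}, {high}]")
    if m.discharge is not None and m.discharge < 0:
        errors.append("discharge must be non-negative")
    return errors
```

Collecting every error in one pass makes it easier to log exactly why a scraped record was dropped.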

### **Authentication** (Future)
- API key authentication
- JWT token support
- Role-based access control

### **Data Protection**
- Environment variable configuration
- Sensitive data masking in logs
- HTTPS support for production

## 📈 Performance Architecture

### **Optimization Strategies**
- Database connection pooling
- Query optimization and indexing
- Response caching for static data
- Async processing for I/O operations

### **Scalability Considerations**
- Horizontal scaling with load balancers
- Database read replicas
- Microservice architecture readiness
- Container orchestration support

## 🔄 Data Flow Architecture

### **Collection Flow**
```
External API → Rate Limiter → Data Validator → Database Adapter → Database
```

### **API Flow**
```
HTTP Request → FastAPI → Business Logic → Database Adapter → HTTP Response
```

### **Monitoring Flow**
```
Application Events → Metrics Collector → Health Checks → Monitoring Dashboard
```

This architecture provides a solid foundation for a production-ready water monitoring system with excellent maintainability, scalability, and observability.