Major refactor: Migrate to uv, add PostgreSQL support, and comprehensive tooling
- **Migration to uv package manager**: Replace pip/requirements with modern pyproject.toml - Add pyproject.toml with complete dependency management - Update all scripts and Makefile to use uv commands - Maintain backward compatibility with existing workflows - **PostgreSQL integration and migration tools**: - Enhanced config.py with automatic password URL encoding - Complete PostgreSQL setup scripts and documentation - High-performance SQLite to PostgreSQL migration tool (91x speed improvement) - Support for both connection strings and individual components - **Executable distribution system**: - PyInstaller integration for standalone .exe creation - Automated build scripts with batch file generation - Complete packaging system for end-user distribution - **Enhanced data management**: - Fix --fill-gaps command with proper method implementation - Add gap detection and historical data backfill capabilities - Implement data update functionality for existing records - Add comprehensive database adapter methods - **Developer experience improvements**: - Password encoding tools for special characters - Interactive setup wizards for PostgreSQL configuration - Comprehensive documentation and migration guides - Automated testing and validation tools 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
287
POSTGRESQL_SETUP.md
Normal file
287
POSTGRESQL_SETUP.md
Normal file
@@ -0,0 +1,287 @@
|
||||
# PostgreSQL Setup for Northern Thailand Ping River Monitor
|
||||
|
||||
This guide helps you configure PostgreSQL as the database backend for the water monitoring system.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- PostgreSQL server running on a remote machine (already available)
|
||||
- Network connectivity to the PostgreSQL server
|
||||
- Database credentials (username, password, host, port)
|
||||
|
||||
## Quick Setup
|
||||
|
||||
### 1. Configure Environment
|
||||
|
||||
Copy the example environment file and configure it:
|
||||
|
||||
```bash
|
||||
cp .env.example .env
|
||||
```
|
||||
|
||||
Edit `.env` and update the PostgreSQL configuration:
|
||||
|
||||
```bash
|
||||
# Database Configuration
|
||||
DB_TYPE=postgresql
|
||||
|
||||
# PostgreSQL Configuration (Remote Server)
|
||||
POSTGRES_CONNECTION_STRING=postgresql://username:password@your-postgres-host:5432/water_monitoring
|
||||
```
|
||||
|
||||
### 2. Run Setup Script
|
||||
|
||||
Use the interactive setup script:
|
||||
|
||||
```bash
|
||||
# Using uv
|
||||
uv run python scripts/setup_postgres.py
|
||||
|
||||
# Or using make
|
||||
make setup-postgres
|
||||
```
|
||||
|
||||
The script will:
|
||||
- Test your database connection
|
||||
- Create the database if it doesn't exist
|
||||
- Initialize the required tables and indexes
|
||||
- Set up sample monitoring stations
|
||||
|
||||
### 3. Test Connection
|
||||
|
||||
Test your PostgreSQL connection:
|
||||
|
||||
```bash
|
||||
make test-postgres
|
||||
```
|
||||
|
||||
### 4. Run the Application
|
||||
|
||||
Start collecting data:
|
||||
|
||||
```bash
|
||||
# Run a test cycle
|
||||
make run-test
|
||||
|
||||
# Start the web API
|
||||
make run-api
|
||||
```
|
||||
|
||||
## Manual Configuration
|
||||
|
||||
If you prefer manual setup, here's what you need:
|
||||
|
||||
### Connection String Format
|
||||
|
||||
```
|
||||
postgresql://username:password@host:port/database
|
||||
```
|
||||
|
||||
**Examples:**
|
||||
- Basic: `postgresql://postgres:mypassword@192.168.1.100:5432/water_monitoring`
|
||||
- With SSL: `postgresql://user:pass@host:5432/db?sslmode=require`
|
||||
- With connection pooling: `postgresql://user:pass@host:5432/db?pool_size=20&max_overflow=0`
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Description | Example |
|
||||
|----------|-------------|---------|
|
||||
| `DB_TYPE` | Database type | `postgresql` |
|
||||
| `POSTGRES_CONNECTION_STRING` | Full connection string | See above |
|
||||
|
||||
### Database Schema
|
||||
|
||||
The application uses these main tables:
|
||||
|
||||
1. **stations** - Monitoring station information
|
||||
2. **water_measurements** - Time series water level data
|
||||
3. **alert_thresholds** - Warning/danger level definitions
|
||||
4. **data_quality_log** - Data collection issue tracking
|
||||
|
||||
See `sql/init_postgres.sql` for the complete schema.
|
||||
|
||||
## Connection Options
|
||||
|
||||
### SSL Connection
|
||||
|
||||
For secure connections, add SSL parameters:
|
||||
|
||||
```bash
|
||||
POSTGRES_CONNECTION_STRING=postgresql://user:pass@host:5432/db?sslmode=require
|
||||
```
|
||||
|
||||
SSL modes:
|
||||
- `disable` - No SSL
|
||||
- `require` - Require SSL
|
||||
- `prefer` - Use SSL if available
|
||||
- `verify-ca` - Verify certificate authority
|
||||
- `verify-full` - Full certificate verification
|
||||
|
||||
### Connection Pooling
|
||||
|
||||
For high-performance applications, configure connection pooling:
|
||||
|
||||
```bash
|
||||
POSTGRES_CONNECTION_STRING=postgresql://user:pass@host:5432/db?pool_size=20&max_overflow=0
|
||||
```
|
||||
|
||||
Parameters:
|
||||
- `pool_size` - Number of connections to maintain
|
||||
- `max_overflow` - Additional connections allowed
|
||||
- `pool_timeout` - Seconds to wait for connection
|
||||
- `pool_recycle` - Seconds before connection refresh
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Common Issues
|
||||
|
||||
**1. Connection Refused**
|
||||
```
|
||||
psycopg2.OperationalError: could not connect to server
|
||||
```
|
||||
- Check if PostgreSQL server is running
|
||||
- Verify host/port in connection string
|
||||
- Check firewall settings
|
||||
|
||||
**2. Authentication Failed**
|
||||
```
|
||||
psycopg2.OperationalError: FATAL: password authentication failed
|
||||
```
|
||||
- Verify username/password in connection string
|
||||
- Check PostgreSQL pg_hba.conf configuration
|
||||
- Ensure user has database access permissions
|
||||
|
||||
**3. Database Does Not Exist**
|
||||
```
|
||||
psycopg2.OperationalError: FATAL: database "water_monitoring" does not exist
|
||||
```
|
||||
- Run the setup script to create the database
|
||||
- Or manually create: `CREATE DATABASE water_monitoring;`
|
||||
|
||||
**4. Permission Denied**
|
||||
```
|
||||
psycopg2.ProgrammingError: permission denied for table
|
||||
```
|
||||
- Ensure user has appropriate permissions
|
||||
- Grant access: `GRANT ALL PRIVILEGES ON DATABASE water_monitoring TO username;`
|
||||
|
||||
### Network Configuration
|
||||
|
||||
For remote PostgreSQL servers, ensure:
|
||||
|
||||
1. **PostgreSQL allows remote connections** (`postgresql.conf`):
|
||||
```
|
||||
listen_addresses = '*'
|
||||
port = 5432
|
||||
```
|
||||
|
||||
2. **Client authentication is configured** (`pg_hba.conf`):
|
||||
```
|
||||
# Allow connections from your application server
|
||||
host water_monitoring username your.app.ip/32 md5
|
||||
```
|
||||
|
||||
3. **Firewall allows PostgreSQL port**:
|
||||
```bash
|
||||
# On PostgreSQL server
|
||||
sudo ufw allow 5432/tcp
|
||||
```
|
||||
|
||||
### Performance Tuning
|
||||
|
||||
For optimal performance with time series data:
|
||||
|
||||
1. **Increase work_mem** for sorting operations
|
||||
2. **Tune shared_buffers** for caching
|
||||
3. **Configure maintenance_work_mem** for indexing
|
||||
4. **Set up regular VACUUM and ANALYZE** for statistics
|
||||
|
||||
Example PostgreSQL configuration additions:
|
||||
```
|
||||
# postgresql.conf
|
||||
shared_buffers = 256MB
|
||||
work_mem = 16MB
|
||||
maintenance_work_mem = 256MB
|
||||
effective_cache_size = 1GB
|
||||
```
|
||||
|
||||
## Monitoring
|
||||
|
||||
### Check Application Status
|
||||
|
||||
```bash
|
||||
# View current configuration
|
||||
uv run python -c "from src.config import Config; Config.print_settings()"
|
||||
|
||||
# Test database connection
|
||||
make test-postgres
|
||||
|
||||
# Check latest data
|
||||
psql "postgresql://user:pass@host:5432/water_monitoring" -c "SELECT COUNT(*) FROM water_measurements;"
|
||||
```
|
||||
|
||||
### PostgreSQL Monitoring
|
||||
|
||||
Connect directly to check database status:
|
||||
|
||||
```bash
|
||||
# Connect to database
|
||||
psql "postgresql://username:password@host:5432/water_monitoring"
|
||||
|
||||
# Check table sizes
|
||||
\dt+
|
||||
|
||||
# View latest measurements
|
||||
SELECT * FROM latest_measurements LIMIT 10;
|
||||
|
||||
# Check data quality
|
||||
SELECT issue_type, COUNT(*) FROM data_quality_log
|
||||
WHERE created_at > NOW() - INTERVAL '24 hours'
|
||||
GROUP BY issue_type;
|
||||
```
|
||||
|
||||
## Backup and Maintenance
|
||||
|
||||
### Backup Database
|
||||
|
||||
```bash
|
||||
# Full backup
|
||||
pg_dump "postgresql://user:pass@host:5432/water_monitoring" > backup.sql
|
||||
|
||||
# Data only
|
||||
pg_dump --data-only "postgresql://user:pass@host:5432/water_monitoring" > data_backup.sql
|
||||
```
|
||||
|
||||
### Restore Database
|
||||
|
||||
```bash
|
||||
# Restore full backup
|
||||
psql "postgresql://user:pass@host:5432/water_monitoring" < backup.sql
|
||||
|
||||
# Restore data only
|
||||
psql "postgresql://user:pass@host:5432/water_monitoring" < data_backup.sql
|
||||
```
|
||||
|
||||
### Regular Maintenance
|
||||
|
||||
Set up regular maintenance tasks:
|
||||
|
||||
```sql
|
||||
-- Update table statistics (run weekly)
|
||||
ANALYZE;
|
||||
|
||||
-- Reclaim disk space (run monthly)
|
||||
VACUUM;
|
||||
|
||||
-- Reindex tables (run quarterly)
|
||||
REINDEX DATABASE water_monitoring;
|
||||
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. Set up monitoring and alerting
|
||||
2. Configure data retention policies
|
||||
3. Set up automated backups
|
||||
4. Implement connection pooling if needed
|
||||
5. Configure SSL for production use
|
||||
|
||||
For more advanced configuration, see the [PostgreSQL documentation](https://www.postgresql.org/docs/).
|
Reference in New Issue
Block a user