Files
Northern-Thailand-Ping-Rive…/POSTGRESQL_SETUP.md
grabowski 6c7c128b4d Major refactor: Migrate to uv, add PostgreSQL support, and comprehensive tooling
- **Migration to uv package manager**: Replace pip/requirements with modern pyproject.toml
  - Add pyproject.toml with complete dependency management
  - Update all scripts and Makefile to use uv commands
  - Maintain backward compatibility with existing workflows

- **PostgreSQL integration and migration tools**:
  - Enhanced config.py with automatic password URL encoding
  - Complete PostgreSQL setup scripts and documentation
  - High-performance SQLite to PostgreSQL migration tool (91x speed improvement)
  - Support for both connection strings and individual components

- **Executable distribution system**:
  - PyInstaller integration for standalone .exe creation
  - Automated build scripts with batch file generation
  - Complete packaging system for end-user distribution

- **Enhanced data management**:
  - Fix --fill-gaps command with proper method implementation
  - Add gap detection and historical data backfill capabilities
  - Implement data update functionality for existing records
  - Add comprehensive database adapter methods

- **Developer experience improvements**:
  - Password encoding tools for special characters
  - Interactive setup wizards for PostgreSQL configuration
  - Comprehensive documentation and migration guides
  - Automated testing and validation tools

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-26 15:10:10 +07:00

6.5 KiB

PostgreSQL Setup for Northern Thailand Ping River Monitor

This guide helps you configure PostgreSQL as the database backend for the water monitoring system.

Prerequisites

  • PostgreSQL server running on a remote machine (already available)
  • Network connectivity to the PostgreSQL server
  • Database credentials (username, password, host, port)

Quick Setup

1. Configure Environment

Copy the example environment file and configure it:

cp .env.example .env

Edit .env and update the PostgreSQL configuration:

# Database Configuration
DB_TYPE=postgresql

# PostgreSQL Configuration (Remote Server)
POSTGRES_CONNECTION_STRING=postgresql://username:password@your-postgres-host:5432/water_monitoring

2. Run Setup Script

Use the interactive setup script:

# Using uv
uv run python scripts/setup_postgres.py

# Or using make
make setup-postgres

The script will:

  • Test your database connection
  • Create the database if it doesn't exist
  • Initialize the required tables and indexes
  • Set up sample monitoring stations

3. Test Connection

Test your PostgreSQL connection:

make test-postgres

4. Run the Application

Start collecting data:

# Run a test cycle
make run-test

# Start the web API
make run-api

Manual Configuration

If you prefer manual setup, here's what you need:

Connection String Format

postgresql://username:password@host:port/database

Examples:

  • Basic: postgresql://postgres:mypassword@192.168.1.100:5432/water_monitoring
  • With SSL: postgresql://user:pass@host:5432/db?sslmode=require
  • With connection pooling: postgresql://user:pass@host:5432/db?pool_size=20&max_overflow=0

Environment Variables

Variable Description Example
DB_TYPE Database type postgresql
POSTGRES_CONNECTION_STRING Full connection string See above

Database Schema

The application uses these main tables:

  1. stations - Monitoring station information
  2. water_measurements - Time series water level data
  3. alert_thresholds - Warning/danger level definitions
  4. data_quality_log - Data collection issue tracking

See sql/init_postgres.sql for the complete schema.

Connection Options

SSL Connection

For secure connections, add SSL parameters:

POSTGRES_CONNECTION_STRING=postgresql://user:pass@host:5432/db?sslmode=require

SSL modes:

  • disable - No SSL
  • require - Require SSL
  • prefer - Use SSL if available
  • verify-ca - Verify certificate authority
  • verify-full - Full certificate verification

Connection Pooling

For high-performance applications, configure connection pooling:

POSTGRES_CONNECTION_STRING=postgresql://user:pass@host:5432/db?pool_size=20&max_overflow=0

Parameters:

  • pool_size - Number of connections to maintain
  • max_overflow - Additional connections allowed
  • pool_timeout - Seconds to wait for connection
  • pool_recycle - Seconds before connection refresh

Troubleshooting

Common Issues

1. Connection Refused

psycopg2.OperationalError: could not connect to server
  • Check if PostgreSQL server is running
  • Verify host/port in connection string
  • Check firewall settings

2. Authentication Failed

psycopg2.OperationalError: FATAL: password authentication failed
  • Verify username/password in connection string
  • Check PostgreSQL pg_hba.conf configuration
  • Ensure user has database access permissions

3. Database Does Not Exist

psycopg2.OperationalError: FATAL: database "water_monitoring" does not exist
  • Run the setup script to create the database
  • Or manually create: CREATE DATABASE water_monitoring;

4. Permission Denied

psycopg2.ProgrammingError: permission denied for table
  • Ensure user has appropriate permissions
  • Grant access: GRANT ALL PRIVILEGES ON DATABASE water_monitoring TO username;

Network Configuration

For remote PostgreSQL servers, ensure:

  1. PostgreSQL allows remote connections (postgresql.conf):

    listen_addresses = '*'
    port = 5432
    
  2. Client authentication is configured (pg_hba.conf):

    # Allow connections from your application server
    host    water_monitoring    username    your.app.ip/32    md5
    
  3. Firewall allows PostgreSQL port:

    # On PostgreSQL server
    sudo ufw allow 5432/tcp
    

Performance Tuning

For optimal performance with time series data:

  1. Increase work_mem for sorting operations
  2. Tune shared_buffers for caching
  3. Configure maintenance_work_mem for indexing
  4. Set up regular VACUUM and ANALYZE for statistics

Example PostgreSQL configuration additions:

# postgresql.conf
shared_buffers = 256MB
work_mem = 16MB
maintenance_work_mem = 256MB
effective_cache_size = 1GB

Monitoring

Check Application Status

# View current configuration
uv run python -c "from src.config import Config; Config.print_settings()"

# Test database connection
make test-postgres

# Check latest data
psql "postgresql://user:pass@host:5432/water_monitoring" -c "SELECT COUNT(*) FROM water_measurements;"

PostgreSQL Monitoring

Connect directly to check database status:

# Connect to database
psql "postgresql://username:password@host:5432/water_monitoring"

# Check table sizes
\dt+

# View latest measurements
SELECT * FROM latest_measurements LIMIT 10;

# Check data quality
SELECT issue_type, COUNT(*) FROM data_quality_log
WHERE created_at > NOW() - INTERVAL '24 hours'
GROUP BY issue_type;

Backup and Maintenance

Backup Database

# Full backup
pg_dump "postgresql://user:pass@host:5432/water_monitoring" > backup.sql

# Data only
pg_dump --data-only "postgresql://user:pass@host:5432/water_monitoring" > data_backup.sql

Restore Database

# Restore full backup
psql "postgresql://user:pass@host:5432/water_monitoring" < backup.sql

# Restore data only
psql "postgresql://user:pass@host:5432/water_monitoring" < data_backup.sql

Regular Maintenance

Set up regular maintenance tasks:

-- Update table statistics (run weekly)
ANALYZE;

-- Reclaim disk space (run monthly)
VACUUM;

-- Reindex tables (run quarterly)
REINDEX DATABASE water_monitoring;

Next Steps

  1. Set up monitoring and alerting
  2. Configure data retention policies
  3. Set up automated backups
  4. Implement connection pooling if needed
  5. Configure SSL for production use

For more advanced configuration, see the PostgreSQL documentation.