Implement strict freshness detection without grace periods
Some checks failed
CI/CD Pipeline - Northern Thailand Ping River Monitor / Test Suite (3.10) (push) Has been cancelled
CI/CD Pipeline - Northern Thailand Ping River Monitor / Test Suite (3.11) (push) Has been cancelled
CI/CD Pipeline - Northern Thailand Ping River Monitor / Test Suite (3.12) (push) Has been cancelled
CI/CD Pipeline - Northern Thailand Ping River Monitor / Test Suite (3.9) (push) Has been cancelled
CI/CD Pipeline - Northern Thailand Ping River Monitor / Code Quality (push) Has been cancelled
CI/CD Pipeline - Northern Thailand Ping River Monitor / Build Docker Image (push) Has been cancelled
CI/CD Pipeline - Northern Thailand Ping River Monitor / Integration Test with Services (push) Has been cancelled
CI/CD Pipeline - Northern Thailand Ping River Monitor / Deploy to Staging (push) Has been cancelled
CI/CD Pipeline - Northern Thailand Ping River Monitor / Deploy to Production (push) Has been cancelled
CI/CD Pipeline - Northern Thailand Ping River Monitor / Performance Test (push) Has been cancelled
CI/CD Pipeline - Northern Thailand Ping River Monitor / Cleanup (push) Has been cancelled
Security & Dependency Updates / Check for Dependency Updates (push) Has been cancelled
Security & Dependency Updates / Code Quality Metrics (push) Has been cancelled
Security & Dependency Updates / Dependency Security Scan (push) Has been cancelled
Security & Dependency Updates / License Compliance (push) Has been cancelled
Security & Dependency Updates / Security Summary (push) Has been cancelled
- Remove tolerance windows and grace periods from data freshness checks
- Require data from current hour only - no exceptions or fallbacks
- If hourly check runs at 21:xx but only has data up to 20:xx, immediately switch to retry mode
- Simplify logic: latest_hour >= current_hour for fresh data
- Remove complex age calculations and tolerance conditions

This ensures the scheduler immediately detects when new hourly data is not yet available and switches to minute-based retries without delay.

Behavior:
- 21:02 with data up to 21:xx → Fresh (continue hourly)
- 21:02 with data up to 20:xx → Stale (immediate retry mode)
- No grace periods, no tolerance windows, strict hour-based detection

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
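The behavior listed above can be sketched in isolation as follows. This is a minimal illustration, not the scraper's actual method: the standalone function name `has_current_hour_data` and the example dates are assumptions made for the sketch, while the comparison itself mirrors the `latest_hour >= current_hour` rule from the commit message.

```python
from datetime import datetime


def has_current_hour_data(latest_timestamp: datetime, current_time: datetime) -> bool:
    """Strict check: fresh only if the latest reading's hour is at least
    the current hour. No grace period and no age tolerance."""
    return latest_timestamp.hour >= current_time.hour


# The two cases from the Behavior list, checked at 21:02 (dates are made up).
now = datetime(2025, 1, 15, 21, 2)
fresh_case = has_current_hour_data(datetime(2025, 1, 15, 21, 0), now)  # data up to 21:xx
stale_case = has_current_hour_data(datetime(2025, 1, 15, 20, 0), now)  # data up to 20:xx

for label, result in (("21:xx data", fresh_case), ("20:xx data", stale_case)):
    print(label, "->", "fresh (continue hourly)" if result else "stale (retry mode)")
```

With the removed 10-minute tolerance, the second case would have been accepted at 21:02; under the strict rule it goes straight to retry mode.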
@@ -500,7 +500,7 @@ class EnhancedWaterMonitorScraper:
             return []
 
     def _check_data_freshness(self, water_data: List[Dict]) -> bool:
-        """Check if the fetched data contains expected current hour data"""
+        """Check if the fetched data contains new data for the current hour"""
         if not water_data:
             return False
 
@@ -520,37 +520,22 @@ class EnhancedWaterMonitorScraper:
 
         latest_hour = latest_timestamp.hour
         time_diff = current_time - latest_timestamp
-        hours_old = time_diff.total_seconds() / 3600
+        minutes_old = time_diff.total_seconds() / 60
 
         logger.info(f"Current time: {current_time.strftime('%H:%M')}, Latest data: {latest_timestamp.strftime('%H:%M')}")
-        logger.info(f"Current hour: {current_hour}, Latest data hour: {latest_hour}, Age: {hours_old:.1f} hours")
+        logger.info(f"Current hour: {current_hour}, Latest data hour: {latest_hour}, Age: {minutes_old:.1f} minutes")
 
-        # Check if we have data for the current hour or the previous hour
-        # If it's 20:00 and we only have data up to 19:xx, that's stale
-        expected_hour = current_hour
-        has_current_hour = latest_hour >= expected_hour
+        # Strict check: we need data from the current hour
+        # If it's 20:xx and we only have data up to 19:xx, that's stale - go to retry mode
+        has_current_hour_data = latest_hour >= current_hour
 
-        # Allow some tolerance: if it's early in the hour (first 10 minutes),
-        # accept data from the previous hour
-        if current_time.minute <= 10 and latest_hour == (current_hour - 1):
-            has_current_hour = True
-            logger.info(f"Early in hour {current_hour}, accepting previous hour {latest_hour} data")
-
-        # Also check that data isn't too old (backup check)
-        not_too_old = hours_old <= 2.0
-
-        is_fresh = has_current_hour and not_too_old
-
-        if not is_fresh:
-            if not has_current_hour:
-                logger.warning(f"Missing current hour data - expected hour {expected_hour}, got {latest_hour}")
-            if not not_too_old:
-                logger.warning(f"Data is too old ({hours_old:.1f} hours)")
-            logger.warning("Data is stale, switching to retry mode")
+        if not has_current_hour_data:
+            logger.warning(f"No new data available - expected hour {current_hour}, got {latest_hour}")
+            logger.warning("Switching to retry mode until new data becomes available")
+            return False
-        else:
-            logger.info(f"Data is fresh - has current/recent hour data")
 
-        return is_fresh
+        logger.info(f"Fresh data available for current hour {current_hour}")
+        return True
 
     def run_scraping_cycle(self) -> bool:
         """Run a complete scraping cycle with freshness check"""
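One edge case worth noting in the `latest_hour >= current_hour` comparison: it looks only at the hour number, so just after midnight a reading from 23:xx compares as 23 >= 0 and would count as fresh. Removing the age check also drops the one condition that looked at elapsed time rather than the hour number. The sketch below shows one possible date-aware variant that truncates both timestamps to the hour before comparing. The helper names `hour_bucket` and `has_current_hour_data_dated` are hypothetical and not part of this commit, and the sketch assumes both timestamps are naive datetimes in the same timezone.

```python
from datetime import datetime


def hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour, keeping the date."""
    return ts.replace(minute=0, second=0, microsecond=0)


def has_current_hour_data_dated(latest_timestamp: datetime, current_time: datetime) -> bool:
    """Hypothetical variant: fresh only if the latest reading falls in the
    current clock hour (or later), taking the date into account."""
    return hour_bucket(latest_timestamp) >= hour_bucket(current_time)


# Just after midnight, with data only up to 23:00 on the previous day.
now = datetime(2025, 1, 16, 0, 2)
latest = datetime(2025, 1, 15, 23, 0)

hour_only = latest.hour >= now.hour                # comparison as written in the diff
dated = has_current_hour_data_dated(latest, now)   # date-aware comparison

print(f"hour-only check: {hour_only}, date-aware check: {dated}")
```

For timestamps on the same day the two checks agree, so the 21:02 cases from the commit message behave identically under either form.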