State Persistence and System Monitoring
Overview
The Pool Controller now includes comprehensive state persistence and system health monitoring to ensure reliable 24/7 operation. ESP32-only since v3.2.0.
State Persistence
What Gets Persisted
All controller states are automatically saved to non-volatile storage (NVS) and restored after reboots or power failures:
Operation Settings
- Operation mode: auto, manual, boost, timer
- Pool maximum temperature: Target pool temperature
- Solar minimum temperature: Minimum solar temperature for activation
- Temperature hysteresis: Temperature difference for control
- Timer settings: Start and end times for timer mode
Relay States
- Pool pump: On/Off state
- Solar pump: On/Off state
How It Works
The controller uses the Preferences library for persistent storage in NVS (Non-Volatile Storage). Each value is stored with a type-specific key.
Relay states are persisted individually: each relay has its own Preferences
namespace named after its node ID (e.g., pool-pump, solar-pump), and its
state is stored under the key "switch".
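The per-relay layout described above can be sketched off-device as follows. This is a minimal model, not the controller's code: `FakeNvs`, `saveRelayState`, and `loadRelayState` are hypothetical names, and the in-memory map stands in for the Preferences library so the example compiles on a desktop. The "off" default for a relay with no stored value is also an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Host-side stand-in for NVS so the sketch compiles off-device.
// On the ESP32, the Preferences library would play this role.
class FakeNvs {
public:
    void putBool(const std::string& ns, const std::string& key, bool value) {
        store_[std::make_pair(ns, key)] = value;
    }
    // Returns the stored value, or `fallback` if the key was never written.
    bool getBool(const std::string& ns, const std::string& key, bool fallback) const {
        auto it = store_.find(std::make_pair(ns, key));
        return it == store_.end() ? fallback : it->second;
    }
private:
    std::map<std::pair<std::string, std::string>, bool> store_;
};

// Hypothetical helper: one namespace per relay node ID, key "switch".
inline void saveRelayState(FakeNvs& nvs, const std::string& nodeId, bool on) {
    assert(nodeId.size() <= 15);  // NVS namespace names are limited to 15 characters
    nvs.putBool(nodeId, "switch", on);
}

// Hypothetical helper: assumes "off" when nothing has been stored yet.
inline bool loadRelayState(const FakeNvs& nvs, const std::string& nodeId) {
    return nvs.getBool(nodeId, "switch", false);
}
```

Because each relay writes to its own namespace, saving the pool pump state cannot overwrite the solar pump state even though both use the same key.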
Automatic Restoration
When the controller reboots:
- State Manager loads all persisted values
- Configuration settings from LittleFS config.json can override persisted state
- Last known state is used if no config override exists
This ensures that:
- After a power failure, pumps return to their previous state
- User-configured temperatures and timers are preserved
- Operation mode is maintained across reboots
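The restore order described above (config.json override first, then last known state) can be illustrated with a small sketch. `resolveSetting` and `demoRestore` are hypothetical names, and the compiled-in fallback for a first boot with empty storage is an assumption that the text does not describe.

```cpp
#include <cassert>

// Pick the value to apply at boot. A null pointer means "not present".
// Order follows the text: config.json override, then persisted state.
// The fallback is an assumption for a first boot with nothing stored.
inline float resolveSetting(const float* configOverride,
                            const float* persisted,
                            float fallback) {
    if (configOverride != nullptr) return *configOverride;
    if (persisted != nullptr) return *persisted;
    return fallback;
}

// Usage mirroring a reboot with a saved pool maximum of 28.5 °C.
inline void demoRestore() {
    const float saved = 28.5f;
    const float fromConfig = 30.0f;
    assert(resolveSetting(nullptr, &saved, 26.0f) == 28.5f);      // no override: saved value
    assert(resolveSetting(&fromConfig, &saved, 26.0f) == 30.0f);  // override wins
}
```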
Example Scenario
User sets:
- Operation mode: auto
- Pool max temp: 28.5°C
- Timer: 10:30 - 17:30
Power failure occurs at 14:00
Controller reboots:
- Loads saved state
- Restores operation mode: auto
- Restores temperatures and timers
- Continues operation seamlessly
System Health Monitoring
Memory Monitoring
The system continuously monitors free heap memory to prevent crashes from memory exhaustion.
Thresholds
ESP32:
- Low Memory Warning: < 16 KB (16,384 bytes)
- Critical Memory: < 8 KB (8,192 bytes) → Auto-reboot
Behavior
- Every 10 seconds: Memory check performed
- Low memory: Warning logged to serial and MQTT
- Critical memory: Controller automatically reboots to recover
- Minimum tracking: Tracks lowest memory point since boot
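As a rough illustration of the behavior listed above, the following host-side sketch classifies each heap sample against the two thresholds and tracks the minimum. `HeapTracker` and `HeapStatus` are hypothetical names; the real SystemMonitor may be structured differently, and on the device the sample would come from the free-heap reading rather than a function argument.

```cpp
#include <cassert>
#include <cstdint>

enum class HeapStatus { Ok, Low, Critical };

// Hypothetical tracker: classifies samples and remembers the lowest one.
class HeapTracker {
public:
    HeapTracker(uint32_t lowThreshold, uint32_t criticalThreshold)
        : low_(lowThreshold), critical_(criticalThreshold) {
        assert(critical_ < low_);  // critical must sit below the warning level
    }

    // Record one sample (taken every 10 seconds per the text) and classify it.
    HeapStatus update(uint32_t freeHeap) {
        if (freeHeap < minFreeHeap_) minFreeHeap_ = freeHeap;
        if (freeHeap < critical_) return HeapStatus::Critical;  // caller would reboot
        if (freeHeap < low_) return HeapStatus::Low;            // caller would log a warning
        return HeapStatus::Ok;
    }

    // Lowest sample seen so far (UINT32_MAX before the first sample).
    uint32_t minFreeHeap() const { return minFreeHeap_; }

private:
    uint32_t low_;
    uint32_t critical_;
    uint32_t minFreeHeap_ = UINT32_MAX;
};
```

With the ESP32 thresholds quoted above, the tracker would be constructed as `HeapTracker tracker(16384, 8192);`. Both comparisons are strict, so a reading of exactly 16,384 bytes still counts as healthy.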
Watchdog Timer
Prevents system hangs and ensures recovery from software failures.
ESP32
- Hardware watchdog: 30-second timeout
- Automatic panic: Reboots if watchdog not fed
- Fed in main loop: Every cycle
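The feed-or-reboot contract can be modeled on a desktop machine as below. This is only a model of the timing logic, not the ESP32 hardware watchdog API; `WatchdogModel` is a hypothetical name, and timestamps are passed in explicitly so the sketch has no hardware dependency.

```cpp
#include <cassert>
#include <cstdint>

// Host-side model of a watchdog; not the ESP32 hardware watchdog API.
class WatchdogModel {
public:
    explicit WatchdogModel(uint32_t timeoutMs) : timeoutMs_(timeoutMs) {
        assert(timeoutMs > 0);
    }

    // Called from the main loop: remember when the watchdog was last fed.
    void feed(uint32_t nowMs) { lastFedMs_ = nowMs; }

    // True once the timeout has elapsed without a feed. Unsigned subtraction
    // keeps the comparison valid when a millisecond counter wraps around.
    bool expired(uint32_t nowMs) const {
        return static_cast<uint32_t>(nowMs - lastFedMs_) >= timeoutMs_;
    }

private:
    uint32_t timeoutMs_;
    uint32_t lastFedMs_ = 0;
};
```

With a 30-second timeout (`WatchdogModel wd(30000);`), any loop iteration that blocks for 30 seconds or longer without a feed would be reported as expired, which corresponds to the point where the hardware watchdog reboots the controller.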
Health Status API
The SystemMonitor provides methods to check system health:
// Get current free heap
uint32_t heap = SystemMonitor::getFreeHeap();
// Get minimum heap since boot
uint32_t minHeap = SystemMonitor::getMinFreeHeap();
// Check if system is healthy
bool healthy = SystemMonitor::isHealthy();
// Get uptime in seconds
uint32_t uptime = SystemMonitor::getUptimeSeconds();
// Get heap fragmentation
uint8_t fragmentation = SystemMonitor::getHeapFragmentation();
Configuration
Enabling Features
Both state persistence and system monitoring are automatically enabled in version 3.1.0+. No configuration required.
Customizing Thresholds
To customize memory thresholds, modify src/SystemMonitor.hpp:
// Low memory threshold (warning only)
static constexpr uint32_t LOW_MEMORY_THRESHOLD = 8192; // 8 KB
// Critical memory threshold (auto-reboot)
static constexpr uint32_t CRITICAL_MEMORY_THRESHOLD = 4096; // 4 KB
Disabling Auto-Reboot
If you prefer to handle low memory manually (not recommended for 24/7 operation):
Comment out the auto-reboot section in src/SystemMonitor.hpp:
// Critical memory - reboot immediately
if (freeHeap < criticalThreshold) {
    Serial.printf("CRITICAL: Free heap %d bytes < %d bytes. Rebooting...\n",
                  freeHeap, criticalThreshold);
    // Serial.flush();
    // delay(1000);
    // ESP.restart(); // Comment this to disable auto-reboot
}
Monitoring and Logs
Serial Output
Normal operation:
✓ State loaded from persistent storage
State persistence and system monitoring initialized
Free heap: 28,456 bytes
Low memory warning:
WARNING: Low memory detected. Free heap: 7,892 bytes (min: 7,456)
Critical memory (before reboot):
CRITICAL: Free heap 3,842 bytes < 4,096 bytes. Rebooting...
MQTT Logs
System status is published via the LoggerNode to the MQTT topic:
pool-controller/log
Example messages:
"State persistence and system monitoring initialized"
"WARNING: Low memory detected. Free heap: 7892 bytes (min: 7456)"
Benefits for 24/7 Operation
Reliability
- ✅ Survives power failures
- ✅ Recovers from memory issues automatically
- ✅ Detects and recovers from system hangs
- ✅ No manual intervention needed
User Experience
- ✅ Settings preserved across reboots
- ✅ No reconfiguration after power loss
- ✅ Seamless operation continuity
- ✅ Predictable behavior
Maintenance
- ✅ Memory leak detection
- ✅ Automatic problem recovery
- ✅ Health status monitoring
- ✅ Diagnostic information available
Troubleshooting
States Not Persisting
- Check serial output for “State loaded from persistent storage”
- Verify NVS partition is available
- Check for Preferences errors in logs
Frequent Reboots
If the controller reboots frequently:
Check memory usage: Review logs for low memory warnings
Identify memory leak: Look for pattern in when reboots occur
Reduce memory usage:
- Increase measurement intervals
- Reduce MQTT message frequency
- Disable features if possible
Lower threshold: Temporarily lower the critical threshold to delay reboots
(the controller will still reboot once the lower threshold is reached, but you may extend uptime while debugging)
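When looking for a leak pattern, it can help to estimate how long the controller will run before the next automatic reboot. The sketch below assumes a steady (linear) leak measured between two free-heap readings taken from the logs; `secondsUntilCritical` is a hypothetical helper, not part of SystemMonitor, and real leaks are often less regular than this.

```cpp
#include <cassert>
#include <cstdint>

// Rough time until free heap falls below `criticalBytes`, assuming a linear
// leak observed between two samples. Returns 0 if the later sample is already
// below the threshold and UINT32_MAX if the heap is not shrinking.
inline uint32_t secondsUntilCritical(uint32_t earlierFree, uint32_t laterFree,
                                     uint32_t secondsBetween, uint32_t criticalBytes) {
    assert(secondsBetween > 0);
    if (laterFree < criticalBytes) return 0;
    if (laterFree >= earlierFree) return UINT32_MAX;      // no downward trend
    const uint64_t lost = earlierFree - laterFree;        // bytes leaked in the window
    const uint64_t headroom = laterFree - criticalBytes;  // bytes left before reboot
    const uint64_t secs = headroom * secondsBetween / lost;
    return secs > UINT32_MAX ? UINT32_MAX : static_cast<uint32_t>(secs);
}
```

For example, a drop from 28,456 to 27,456 bytes over 10 minutes with an 8,192-byte critical threshold gives 19,264 × 600 / 1,000 ≈ 11,558 seconds, a little over three hours. If reboots recur at roughly that interval, a steady leak is a plausible cause.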
Watchdog Timeouts
If watchdog triggers (ESP32):
- Long-blocking operations: Check for delays or long operations in code
- Increase timeout: Modify the timeout in SystemMonitor::begin()
- Feed more frequently: Add SystemMonitor::feedWatchdog() in long operations
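One way to feed more frequently is to split a long operation into chunks and feed the watchdog after each chunk. The following is a minimal host-side sketch of that pattern: `runInChunks` is a hypothetical helper, and the `feed` callback stands in for `SystemMonitor::feedWatchdog()` on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Split a long job into chunks and call `feed` after each one, so that no
// single stretch of work outlasts the watchdog timeout.
// Returns the number of times `feed` was called.
inline std::size_t runInChunks(
        std::size_t totalItems, std::size_t chunkSize,
        const std::function<void(std::size_t, std::size_t)>& work,
        const std::function<void()>& feed) {
    assert(chunkSize > 0);
    std::size_t feeds = 0;
    std::size_t start = 0;
    while (start < totalItems) {
        const std::size_t remaining = totalItems - start;
        const std::size_t end = start + (remaining < chunkSize ? remaining : chunkSize);
        work(start, end);  // process items in the half-open range [start, end)
        feed();            // on the device: SystemMonitor::feedWatchdog()
        ++feeds;
        start = end;
    }
    return feeds;
}
```

The chunk size should be chosen so that one chunk finishes well inside the watchdog timeout; feeding between chunks does not help if a single chunk can block for longer than that.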
Technical Details
Storage Usage
ESP32 NVS:
- Operation mode: ~10 bytes
- Float values (3): 12 bytes
- Integer values (4): 16 bytes
- Total: ~40 bytes
Performance Impact
- State save: < 10ms (occurs only on changes)
- State load: < 20ms (once at boot)
- Memory check: < 1ms (every 10 seconds)
- Watchdog feed: < 0.1ms (every loop)
Total impact: Negligible (< 0.1% CPU usage)
Future Enhancements
Completed in v3.2.0:
- ✅ Degradation Manager: Central health state (NORMAL → CRITICAL)
- ✅ MQTT Reconnect-Refresh: Full state republish on reconnection
- ✅ NTP Graceful Degradation: Three-stage time degradation (GREEN/YELLOW/RED)
Planned:
- 🔜 Configurable thresholds: MQTT-based threshold configuration
- 🔜 Memory stats: Historical memory usage tracking
- 🔜 Remote reboot: MQTT command to trigger reboot
- 🔜 Health dashboard: Web UI for health monitoring
Version: 3.2.0
Status: Production Ready
Platform: ESP32