docker-compose syntax~/.docker/cli-plugins/docker compose syntax/health, /stats)deploy.sh):
/home/elfege/UBIQUITI_NVR/ to /home/elfege/0_NVR/device_manager.pyeufy_bridge.py)stream_manager.py)services/unifi_service.py)services/eufy_service.py)stream_proxy.py)static/js/streaming/)templates/streams.html)eufy_bridge.sh - Node.js server startup scripteufy_bridge.py - Python WebSocket clienteufy_bridge_watchdog.py - Health monitoring and auto-restartconfig/config.jsonNOTE: DEPRECATED: found out this model doesn’t have any motor… huge waste of time lol - but keeping these for future Unifi PTZ capable (pricey)
G5-Flex_Motor_Command_Trigger.pyG5-Flex_Motor_Control_Discovery_Script.pyG5-Flex_Motor_Initialization.pyPTZ_Discovery.pyg5flex_ptz_http.py for potential motor controldeploy.sh and container configurations exist but unusedpull_NVR.sh for deployment automation/home/elfege/0_NVR/services/camera_base.py) for vendor-agnostic implementationCameraService base class with standardized methods:
authenticate() - Session management across camera typesget_snapshot() - JPEG image retrievalget_stream_url() - Streaming endpoint provisionptz_move() - PTZ control for capable camerasservices/unifi_service.py): Extracted proven session-based authentication logic from stream_proxy.pyservices/eufy_service.py): WebSocket bridge integration for PTZ control and streamingconfig/cameras.json): Consolidated camera definitions supporting both camera ecosystemsservices/camera_manager.py): Dynamic camera service instantiation based on typeapp.py): Single application serving both camera ecosystemsstream_proxy.py session management into modular services/unifi_service.pyEufyCameraService class using shared bridge processCameraManager loading both camera types from single configuration fileconfig/cameras.json with 6 total cameras (1 UniFi G5-Flex + 5 Eufy T8416 PTZ models)/api/stream/start/<camera_id> for HLS streaming/api/unifi/<camera_id>/stream/mjpeg for MJPEG streaming/api/status endpointmjpeg-stream.js - UniFi MJPEG stream handlinghls-stream.js - Eufy HLS stream managementstream.js - Main hub coordinating both streaming typesstreams.html to handle both camera types in unified grid interfacehttp://192.168.10.17:5000/api/unifi/g5flex_living/stream/mjpegself.process.poll() called on None object when bridge dies during monitoringtraceback, subprocess, and socket modules not imported in watchdogsubprocess, socket, and traceback imports to prevent NameError exceptions_monitor_bridge() before calling process.poll()eufy-security-server processes via pkill_running flag updates.m3u8 files despite bridge failureapp.py with proper subprocess terminationeufy_bridge_watchdog.py: Complete rewrite with proper imports, bounded counters, and zombie cleanupeufy_bridge.py: Added null checks in monitoring thread and proper state managementDeviceManager expecting devices.json structure while attempting to use config/cameras.json formatCameraManager class only used in experimental app_unified_attempt.py, not in active app.py - completely removed from architectureDeviceManager remains generic, using existing services/unifi_service.py rather than redefining camera-specific logicconfig/cameras.json using devices.json compatible structure:
devices section containing all 10 cameras (1 UniFi + 9 EUFY including non-PTZ models)ptz_cameras section for PTZ-capable cameras onlysettings section preserved for bridge configurationDeviceManager to remain vendor-agnostic while adding useful methods from CameraManager:
get_cameras_by_type() - Filter by camera vendorget_unifi_cameras() / get_eufy_cameras() - Vendor-specific filteringis_unifi_camera() / is_eufy_camera() - Type checking methodsget_streaming_cameras() - Cameras with streaming capabilityDeviceManager provides metadata and discovery, actual camera operations handled by existing service classes in services/ directoryDeviceManager - preserved separation of concernsconfig/cameras.json with consistent structureDeviceManagerservices/unifi_service.py session management and streaming logicDeviceManager enhanced with batch operations while maintaining generic design principlesdevice_manager.get_ptz_cameras() instead of all streaming camerasDeviceManager.get_streaming_cameras(), streams interface should now display all 9 streaming cameras (excluding doorbell)devices.json format expected by DeviceManager and cameras.json format attempted in unified approachDeviceManager hardcoded to expect specific structure with devices and ptz_cameras sections, incompatible with cameras section formatDeviceManager default path from "devices.json" to "./config/cameras.json" for unified configuration locationCameraManager class only referenced in unused app_unified_attempt.py experimental file, completely removable from production architectureDeviceManager must remain generic, camera-specific logic belongs in services/ directoryservices/unifi_service.py with proven session management rather than duplicating functionalityauthenticate_all() - Batch authentication operationsget_status_all() - Health monitoring across all camerasget_cameras_by_type() - Type-based filtering (unifi/eufy)DeviceManagerstream_proxy.py implementationDeviceManager with useful batch operations while preserving vendor-agnostic designget_cameras_by_type(), get_unifi_cameras(), get_eufy_cameras(), is_unifi_camera(), is_eufy_camera()get_streaming_cameras() method for cameras with streaming capabilitiesservices/unifi_service.py rather than redefining camera-specific functionalitydevices/ptz_cameras/settings JSON structuredevices.json format structure in config/cameras.json location for single-file managementptz_capable boolean in favor of standardized capabilities array formatstream_type field for all cameras (“hls_transcode” for EUFY, “mjpeg_proxy” for UniFi)credentials object with username/password fields across all camera typesip field to all EUFY cameras extracted from RTSP URLs for unified network informationget_streaming_cameras() method to use capability-based filtering: 'streaming' in device_info.get('capabilities', [])'ptz' in capabilities instead of deprecated ptz_capable booleandevice_manager.get_ptz_cameras() (5 PTZ) + 1 UniFi instead of device_manager.get_streaming_cameras() (should return 9 total)camera_config.get('cameras', {}) to camera_config.get('devices', {}) matching unified structure["streaming", "ptz"] - T8416 models["streaming"] - T8419/T8441 models["streaming"] - G5-Flex with MJPEG proxy["doorbell"] - T8214 with null RTSP excluded from streamingconfig/cameras.json with consistent field structureptz_capable boolean and numeric type codes in favor of capability arrays and string typeseufy-security-server bridge at ws://127.0.0.1:3000 for PTZ control onlyrtsp://username:password@IP/live0)http://192.168.10.17:5000/api/streams/T8416P0023390DE9/playlist.m3u8is_valid_ptz_camera()) instead of streaming capabilitystream_manager.py correctly uses RTSP URLs from camera configuration (camera_info['rtsp']['url'])process.poll() checks but may be stuck in buffering or connection timeout statesis_valid_ptz_camera() with capability-based validation for streaming endpointsis_valid_streaming_camera() method to device managerpull_NVR.sh causing empty tree output
include_patterns_joined_for_tree but used undefined $inc_pat variableapp.py that incorrectly used is_valid_ptz_camera() instead of is_valid_streaming_camera()is_valid_streaming_camera() method to properly validate cameras with streaming capability regardless of PTZ support["streaming"] capability, not just PTZ-capable camerasstimeout not available)reconnect, reconnect_at_eof, reconnect_streamed, and timeout parametersI’ll add a new section to document today’s troubleshooting efforts:I’ve added the afternoon session to project history. Here’s a summary of what we accomplished and where we stand:
app.py to use streaming capability instead of PTZ capabilityThe system works for Blue Iris (sor of…) but the web interface streaming has persistent stability issues. The problem is definitively at the FFmpeg/RTSP layer, not Flask integration.
Your methodical approach of isolating components was exactly right - it eliminated multiple potential causes and pinpointed the real issue. Sometimes the most valuable troubleshooting sessions are the ones that definitively rule out possibilities, even when they don’t achieve the final solution.
/static/streams/T8419P0024110C6A/ instead of intended /streams/ location-c:v copy -c:a copy commands consistently triggered /static/streams/ creation-c:v libx264 initially appeared to avoid the issue-master_pl_name flag as another trigger for unwanted directory creation'stream_dir': self.hls_dir storing wrong directory reference in active streams_start_ffmpeg_process_noaudio method referencing undefined class attributes/static/streams/ directory creation continues despite all fixesThe phantom /static/streams/ directory creation remains an unresolved technical mystery despite comprehensive debugging efforts, though it does not prevent system functionality.
-c:v copy -c:a copy approach consistently failed to create playlists within 30-second timeout/static/streams/ instead of intended /streams/ directory-c:v libx264 -preset ultrafast reliably created streams with 2-4 second latencyprocess.poll() checks while actually hung in RTSP connection timeouts-master_pl_name flag identified as trigger for unwanted /static/streams/ directory creationindependent_segments flag caused stream loading failures, requiring simpler flag approachSimplified FFmpeg Command: Reduced to essential parameters for reliability:
ffmpeg -i rtsp_url -reconnect 1 -c:v libx264 -preset ultrafast -tune zerolatency
-c:a aac -f hls -hls_time 2 -hls_list_size 10 -hls_flags delete_segments+split_by_time
HTTPConnectionPool error with errno 24: Too many open files after several hours of operationget_snapshot() every 500ms, with multiple concurrent streams from Blue Iris + web UIrequests.Session() objects accumulating HTTP connections without proper cleanup, reaching system file descriptor limit (1024 per process)urllib3 adapter with limited connection pools (pool_connections=2, pool_maxsize=5) and connection blockingConnection: close headers and explicit response.close() callsservices/unifi_service_resource_monitor.py for clean separation of concernsservices/app_restart_handler.py for coordinated cleanup of streams, bridge services, and resources/api/status/unifi-monitor for detailed monitoring status and /api/status/unifi-monitor/summary for health checks/api/maintenance/recycle-unifi-sessions endpoint for manual session recycling during troubleshootingcleanup_handler() to include resource monitor shutdown and explicit UniFi camera session cleanup/static/streams/ directory creation was NOT caused by FFmpeg, copy codec behavior, or any application codesync_wsl.sh script running via cron every 4 minutes, synchronizing files across networked machines without --delete flagstatic/streams creationremove /home/elfege/0_NVR/static/streams command to delete directory from all synchronized machines-c:v copy -c:a copy behavior should be updatedsync_wsl.sh behavior and exclusion patterns to prevent similar confusionTechnical Note: This investigation demonstrates the importance of considering system-level factors before deep-diving into application code. The methodical hypothesis testing approach was sound but initially focused too narrowly on application behavior rather than environmental factors.
/api/unifi/<camera_id>/stream/mjpeg created separate generator functions, each calling camera.get_snapshot() independentlyservices/unifi_mjpeg_capture_service.py following stream_manager.py patterns for architectural consistency/api/unifi/<camera_id>/stream/mjpeg to use capture service instead of direct camera calls/api/status/mjpeg-captures for service monitoring and debuggingstream_manager.py, unifi_service.py, and other service componentsRe-onboarding into a previously containerized UniFi G5-Flex camera proxy that serves as a prelude to the unified NVR project. Goal was to run the UniFi camera independently while the main unified NVR system (~/0_NVR) remains unstable with Eufy camera integration.
Problem Encountered:
stream_proxy.py directly on host system/app/logs)Solution Process:
deploy.sh scriptFinal Result:
http://192.168.10.8:8080/g5flex.mjpegSometimes the “return to source” approach (proven, stable container) is more valuable than wrestling with complex, unstable unified systems - especially when new hardware (U Protect) offers better integration paths forward.
User needed to integrate newly installed UCKG2 Plus (192.168.10.3) with existing containerized UniFi G5-Flex proxy to access LL-HLS streams instead of current MJPEG approach. Goal was adding UniFi Protect API as alternative streaming method alongside existing working MJPEG proxy.
Authentication Script Development: Created comprehensive bash script (get_token.sh) for UniFi Protect API authentication with automatic 2FA handling, including:
~/0_UNIFI_NVR/cookies/)2FA Implementation Challenges: Systematic troubleshooting revealed multiple technical issues:
(see /0_UNIFI_NVR/LL-HLS/get_token.sh)
/api/auth/mfa/challenge requestsRoot Cause Analysis: Extended debugging confirmed MFA cookie extraction and formatting worked correctly, but fundamental authentication flow remained blocked. Multiple attempts to resolve curl syntax issues, cookie handling, and endpoint variations failed to achieve successful 2FA challenge completion.
Forum Investigation: Comprehensive research documented in 0_UNIFI_NVR/DOCS/UniFi_Protect_2FA_Authentication.md revealed critical industry context:
Local Account Solution: Research confirmed local admin account creation eliminates 2FA complexity completely:
Implementation Plan: Create local admin account on UCKG2 Plus (192.168.10.3) with disabled remote access, then modify existing authentication scripts to use local credentials instead of cloud account. This approach eliminates the entire 2FA implementation challenge while maintaining security appropriate for local network access.
Project Status: 2FA script development suspended in favor of simpler, more reliable local account approach. Existing containerized G5-Flex proxy remains operational as fallback streaming method.
User needed secure credential storage for the UniFi NVR project, moving away from storing passwords in GitHub repositories. Initial consideration of GitHub’s secrets API revealed it’s write-only, prompting exploration of AWS Secrets Manager as an alternative.
Current credential management issues:
.env files (same fundamental problem as storing passwords directly)Cost analysis confirmed feasibility:
Architecture decisions:
~/.aws/credentials as “root” credentialAWS CLI integration into .bash_utils:
install_aws_cli() function with update handlingAWS_PROFILE=personal for default operationsKey functions updated:
configure_aws_cli() - Personal account setup with installation handlingaws_auth() - Authentication with SSO fallback optionspull_secrets_from_aws() - Fixed hardcoded secret name overridepush_secret_to_aws() - Secret creation/update with authenticationtest_secrets_manager_access() - Permission validationPersonal AWS account setup:
aws configure --profile personalInstallation dependency on dellserver:
unzip package prevented AWS CLI installationsudo apt install unzip -y before running installationAuthentication flow confirmed working:
Threat model analysis confirmed AWS approach is superior:
~/.aws/credentials less risky than scattered encryption keys.env files to AWS Secrets Managerpull_secrets_from_aws() functionAWS_PROFILE=personal in environment configurationStatus: AWS Secrets Manager integration complete and tested. Personal account configured with proper permissions. Ready for production credential migration.
Problem Identified: The list_aws_secrets function was failing with an AccessDeniedException, showing the wrong IAM user (ECRAccess2) was being used instead of the intended “personal” profile.
Root Cause:
aws_auth() function: local profile="${1:personal}" should be local profile="${1:-personal}"~/.aws/config had no actual credential configuration (only region/output settings)ECRAccess2 IAM userDiagnostic Process:
elfege-PowerUserAccess-394153487506, ecr_poweraccess_set-394153487506)[profile personal] entry lacks SSO session configurationStatus: Issue identified but not yet resolved. User needs to either:
Context Change: User removed Blue Iris and wiped the Windows PC. The Dell server will now be the sole NVR system managing all camera types.
Key Discovery: UniFi Protect RTSPS streams work without complex token authentication on the local network.
Working Stream Format:
rtsps://192.168.10.3:7441/{rtspAlias}?enableSrtp
Architecture Decisions:
/proxy/protect/api/bootstrap) only needed for discovering rtspAlias values programmaticallyPlanned Implementation:
# services/unifi_protect_service.py
class UniFiProtectService(CameraService):
"""
Provides RTSPS stream URLs from UniFi Protect
No authentication needed - streams accessible on local network
"""
def authenticate(self) -> bool:
return True # No auth required for RTSPS
def get_stream_url(self) -> str:
rtsp_alias = self.config.get('rtsp_alias')
protect_ip = self.config.get('protect_ip', '192.168.10.3')
return f"rtsps://{protect_ip}:7441/{rtsp_alias}?enableSrtp"
def get_snapshot(self) -> bytes:
# Can extract from RTSPS stream via FFmpeg if needed
pass
Integration Pattern:
# In unified_nvr_server.py
protect_service = UniFiProtectService(camera_config)
stream_manager.start_stream(
camera_id='g5flex',
source_url=protect_service.get_stream_url(),
output_format='ll-hls'
)
Goal: Create unified NVR system in 0_NVR/ directory that handles:
Legacy Code Status:
services/unifi_service.py (MJPEG direct camera access) - Keep with comments noting it’s deprecatedstream_proxy.py - Original G5 Flex MJPEG proxy - May be archivedUniFiProtectService class per the simplified architecture abovestream_manager.py using real Protect streamrtspAlias discovery method (manual config vs. bootstrap API)rtsp_alias, protect_ip)unified_nvr_server.py frameworkrtsps:// protocol natively?enableSrtp enables Secure Real-time Transport ProtocolrtspAlias from bootstrap data (e.g., zQvCrKqH0Yj5aslR).bash_utils - Identified syntax error in aws_auth() function (line 1877)user-api) created on UCKG2 Plus, bypassing 2FA complexity entirelyUniFi-Camera-Credentials), loaded via existing .bash_utils functionsDocker Infrastructure Created:
./config:/app/config:ro - Read-only configuration./streams:/app/streams - HLS segment output./logs:/app/logs - Persistent loggingDeployment Automation Scripts:
.bash_utils, automatic environment variable exportsource .bash_utils → pull_secrets_from_aws → export PROTECT_USERNAME/PASSWORD → docker-compose up (environment variables passed into container)cameras.json JSON Syntax Fix:
"devices" instead of children"devices" objectpython3 -m json.tool config/cameras.json used to identify line 246 syntax errorUniFi Camera Configuration Update:
{
"68d49398005cf203e400043f": {
"type": "unifi",
"name": "G5 Flex",
"protect_host": "192.168.10.3",
"camera_id": "68d49398005cf203e400043f",
"rtsp_alias": "zQvCrKqH0Yj5aslR",
"stream_mode": "rtsps_transcode",
"capabilities": ["streaming"],
"stream_type": "ll_hls"
}
}
From: services/unifi_service.py (Direct Camera Access)
# OLD - Broken after Protect adoption
camera_ip = "192.168.10.104"
login_url = f"http://{camera_ip}/api/1.1/login"
snapshot_url = f"http://{camera_ip}/snap.jpeg"
To: services/unifi_protect_service.py (Protect API Access)
# NEW - Works through Protect console
protect_host = "192.168.10.3"
login_url = f"https://{protect_host}/api/auth/login"
snapshot_url = f"https://{protect_host}/proxy/protect/api/cameras/{camera_id}/snapshot"
Initial Assumptions (INCORRECT):
rtsps://192.168.10.3:7441/{rtsp_alias}?enableSrtprtsp://username:password@host:port/aliasVLC Testing Revealed Truth:
rtsp://192.168.10.3:7447/zQvCrKqH0Yj5aslR?enableSrtp neededArchitecture Simplification:
def get_rtsps_url(self) -> str:
"""
Get RTSP URL for FFmpeg transcoding
Simple format works on local network - no auth, no encryption
"""
return f"rtsp://{self.protect_host}:7447/{self.rtsp_alias}"
Initial Issue: Wrong password being used from AWS secrets due to misconfiguration
UniFi-Camera-Credentials (corrected from initial confusion)PROTECT_USERNAME and PROTECT_SERVER_PASSWORD.bash_utils → pull_secrets_from_aws() → environment export → Docker containerDeployment Workflow:
# Load credentials from AWS
source ~/.bash_utils --no-exec
pull_secrets_from_aws UniFi-Camera-Credentials
export PROTECT_USERNAME
export PROTECT_SERVER_PASSWORD
# Deploy container
./start.sh # Automatically uses exported environment variables
app.py from UniFiCameraService to UniFiProtectService importsip field, new service uses protect_host, camera_id, rtsp_aliasCurrent Blocker: stream_manager.py expects Eufy-style RTSP structure:
# What stream_manager expects (Eufy cameras)
camera_info['rtsp']['url'] # "rtsp://user:pass@ip/live0"
# What UniFi Protect has
camera_info['rtsp_alias'] # ""
camera_info['protect_host'] # "192.168.10.3"
Next Steps:
UniFiProtectService.get_rtsps_url() to return correct RTSP URL formatstream_manager.py to detect UniFi camera type and construct URL accordinglyrtsp://192.168.10.3:7447/{rtsp_alias} → HLS output/api/streams/{camera_id}/playlist.m3u8deploy.sh, start.sh, stop.sh) with AWS integrationKey Finding: UniFi Protect RTSP streams work without authentication on local network:
rtsp://192.168.10.3:7447/{rtsp_alias} (no credentials, no query parameters)Blocker Discovered: FFmpeg cannot parse UniFi Protect’s RTSP stream format
access_realrtsp module with warning “only real/helix rtsp servers supported for now”1. stream_manager.py - UniFi RTSP URL Construction
# Added logic to construct UniFi RTSP URLs differently from Eufy
if stream_type == "ll_hls" and camera_type == "unifi":
rtsp_alias = camera_info.get('rtsp_alias')
protect_host = camera_info.get('protect_host', '192.168.10.3')
protect_port = camera_info.get('protect_port', 7447)
rtsp_url = f"rtsp://{protect_host}:{protect_port}/{rtsp_alias}"
elif camera_type == "eufy":
rtsp_url = camera_info['rtsp']['url']
2. Frontend Template Fixes (templates/streams.html)
data-stream-type="{{ info.stream_type }}" now rendered in DOM{% if info.type == 'unifi' %} to {% if info.stream_type == 'MJPEG' or info.stream_type == 'mjpeg_proxy' %}
static/css/streams.css with proper grid defaults3. JavaScript Refactoring (static/js/streaming/stream.js)
cameraType and streamTypestreamType determines which manager (mjpegManager vs hlsManager)cameraType available for vendor-specific logic (PTZ, etc.)4. Configuration Update (config/cameras.json)
{
"68d49398005cf203e400043f": {
"type": "unifi",
"stream_type": "ll_hls", // Changed from "mjpeg_proxy"
"rtsp_alias": "zQvCrKqH0Yj5aslR",
"protect_host": "192.168.10.3",
"protect_port": "7447"
}
}
Resolved:
Unresolved - Critical Blocker:
Option A: Use Protect’s Native HLS Streams
https://192.168.10.3/proxy/protect/hls/{camera_id}/playlist.m3u8?token={auth_token}Option B: GStreamer Instead of FFmpeg
Option C: Keep G5 Flex on MJPEG
/proxy/protect/api/cameras/{id}/snapshotstream_manager.py - UniFi RTSP URL construction logictemplates/streams.html - Added stream_type attribute, fixed element logicstatic/css/streams.css - New file with extracted stylesstatic/js/streaming/stream.js - Refactored for dual parameter supportconfig/cameras.json - Changed G5 Flex to ll_hls mode (currently non-functional)Critical Discovery: UniFi Protect RTSP streams require different FFmpeg parameters than Eufy cameras
rtsp://192.168.10.3:7447/zmUKsRyrMpDGSThn (no authentication, simple alias)-timeout parameter chosen for reliabilityRoot Cause: FFmpeg 5.1.6 (Debian 12) does not support advanced LL-HLS parameters
-hls_partial_duration, -hls_segment_type, -hls_playlist_type, advanced x264 options-reconnect flag is built-in to modern FFmpeg, explicitly adding it causes crashes-rtsp_transport tcp -timeout 30000000 (30-second timeout critical)-rtsp_transport tcp (no additional flags needed)Problem: FFmpeg processes dying immediately on startup created zombie processes
Solution: Added startup validation with 0.5s delay and process.poll() check before tracking
time.sleep(0.5)
if process.poll() is not None:
raise Exception(f"FFmpeg died immediately with code {process.returncode}")
Finalized Parameters (simple, reliable, works for all camera types):
# UniFi Protect
ffmpeg -rtsp_transport tcp -timeout 30000000 -i rtsp://... \
-c:v libx264 -preset ultrafast -tune zerolatency -c:a aac \
-f hls -hls_time 2 -hls_list_size 10 \
-hls_flags delete_segments+split_by_time \
-hls_segment_filename segment_%03d.ts -y playlist.m3u8
# Eufy Cameras
ffmpeg -rtsp_transport tcp -i rtsp://... \
-c:v libx264 -preset ultrafast -tune zerolatency -c:a aac \
-f hls -hls_time 2 -hls_list_size 10 \
-hls_flags delete_segments+split_by_time \
-hls_segment_filename segment_%03d.ts -y playlist.m3u8
stream_manager.py: Added camera-type detection, dynamic FFmpeg parameter selection, zombie process preventioncameras.json: Updated G5 Flex with correct rtsp_alias (zmUKsRyrMpDGSThn)see: OCT_2025_Architecture_Refactoring_Migration.md
This refactoring transforms the monolithic, tightly-coupled NVR codebase into a clean, modular, testable architecture following SOLID principles.
config/unifi_protect.json - UniFi Protect console settingsconfig/eufy_bridge.json - Eufy bridge and RTSP settingsconfig/reolink.json - Reolink NVR settings (future)config/cameras.json - Cleaned camera configs (no credentials)services/credentials/credential_provider.py - Abstract interfaceservices/credentials/aws_credential_provider.py - AWS implementationservices/camera_repository.py - Data access layerservices/ptz_validator.py - Business logic for PTZstreaming/stream_handler.py - Abstract base classstreaming/handlers/eufy_stream_handler.py - Eufy implementationstreaming/handlers/unifi_stream_handler.py - UniFi implementationstreaming/handlers/reolink_stream_handler.py - Reolink implementationstreaming/stream_manager.py - Orchestrator using Strategy Patternapp.py - Refactored with dependency injectionOCT_2025_Architecture_Refactoring_Migration.md.md - Step-by-step migration instructionsEach camera vendor has its own stream handler implementing a common interface:
handler = handlers[camera_type] # Get appropriate handler
rtsp_url = handler.build_rtsp_url(camera, stream_type=stream_type)
ffmpeg_params = handler.get_ffmpeg_params()
Data access separated from business logic:
camera_repo = CameraRepository('./config')
camera = camera_repo.get_camera(serial)
Services receive dependencies via constructor:
stream_manager = StreamManager(
camera_repo=camera_repo,
credential_provider=credential_provider
)
Each class has one reason to change:
CameraRepository - only changes when data storage changesPTZValidator - only changes when PTZ logic changesEufyStreamHandler - only changes when Eufy streaming changesBefore:
# Edit stream_manager.py (200+ lines)
if camera_type == "eufy":
# ... existing code
elif camera_type == "unifi":
# ... existing code
elif camera_type == "reolink": # Add here
# ... write 50 lines of new code mixed with old
After:
# Create new file: streaming/handlers/reolink_stream_handler.py
class ReolinkStreamHandler(StreamHandler):
def build_rtsp_url(self, camera): ...
def get_ffmpeg_params(self): ...
# Register in stream_manager.__init__ (1 line)
'reolink': ReolinkStreamHandler(credential_provider, reolink_config)
Before:
# Find/replace in 5+ files
username = os.getenv(f'EUFY_CAMERA_{serial}_USERNAME')
# Scattered throughout codebase
After:
# Swap one class in app.py
credential_provider = VaultCredentialProvider() # Changed from AWS
# Everything else works unchanged
Before:
# Must mock entire device_manager + stream_manager
# Hundreds of lines of mock setup
After:
# Test single handler in isolation
handler = EufyStreamHandler(mock_creds, eufy_config)
rtsp_url = handler.build_rtsp_url(camera, stream_type=stream_type)
assert rtsp_url == "rtsp://user:pass@192.168.10.84:554/live0"
| Component | Before | After | Change |
|---|---|---|---|
| Stream Manager | ~600 | ~250 | -58% |
| Device Manager | ~400 | Eliminated | -100% |
| Camera Repository | 0 | ~200 | +200 |
| PTZ Validator | 0 | ~100 | +100 |
| Stream Handlers | 0 | ~300 | +300 |
| Total | ~1000 | ~850 | -15% |
Fewer total lines with better organization and testability
| Component | Before | After |
|---|---|---|
| stream_manager.start_stream() | 15+ | 8 |
| device_manager.refresh_devices() | 20+ | Eliminated |
| Handler classes | N/A | 3-5 each |
Lower complexity = easier to understand and maintain
Before:
// Everything mixed together
{
"68d49398005cf203e400043f": {
"protect_host": "192.168.10.3", // Repeated 10x
"credentials": {
"username": "exposed_in_git",
"password": "exposed_in_git"
}
}
}
After:
// Separated by concern
// config/unifi_protect.json (infrastructure)
{
"console": {
"host": "192.168.10.3" // Once, shared by all cameras
}
}
// config/cameras.json (entities)
{
"68d49398005cf203e400043f": {
"rtsp_alias": "xyz123" // No credentials
}
}
Before:
# Hardcoded environment variable names
username = os.getenv('EUFY_CAMERA_T8416P0023352DA9_USERNAME')
password = os.getenv('EUFY_CAMERA_T8416P0023352DA9_PASSWORD')
After:
# Abstracted through provider
username, password = credential_provider.get_credentials('eufy', serial)
Before:
# Hardcoded in JSON with credentials
rtsp_url = camera_info['rtsp']['url']
# "rtsp://user:pass@192.168.10.84:554/live0"
After:
# Built dynamically from components + env vars
handler = handlers[camera_type]
rtsp_url = handler.build_rtsp_url(camera, stream_type=stream_type)
# Add database backend
class DatabaseCameraRepository(CameraRepository):
def get_camera(self, serial):
return db.query(Camera).filter_by(serial=serial).first()
# Add HashiCorp Vault
class VaultCredentialProvider(CredentialProvider):
def get_credentials(self, vendor, identifier):
return vault.read(f'cameras/{vendor}/{identifier}')
# Add recording capability
class RecordingStreamHandler(StreamHandler):
def get_ffmpeg_output_params(self):
# Add recording output in addition to HLS
return [*super().get_ffmpeg_output_params(), '-c', 'copy', 'recording.mp4']
git checkout -b backup_old_archgit checkout -b refactor_architecturestreaming/, services/credentials/cameras.json (remove credentials)app.py initialization__init__.py files*.py.oldgit commit -m "refactor: modular architecture"git checkout main && git merge refactor_architectureOnce migration is verified working:
# Deprecated files
rm device_manager.py # Replaced by camera_repository.py + ptz_validator.py
rm stream_manager.py # Replaced by stream_manager.py
# Or keep as backup
mv device_manager.py device_manager.py.deprecated
mv stream_manager.py stream_manager.py.deprecated
Status: Not fully implemented in new architecture Workaround: Manual camera configuration in cameras.json TODO: Add DeviceDiscoveryService
Status: Still uses old UniFiProtectService Workaround: Works fine for now, not a blocker TODO: Consider migrating to handler pattern
✅ Modularity: Each vendor in separate handler ✅ Testability: Components testable in isolation ✅ Maintainability: Clear separation of concerns ✅ Extensibility: Adding Reolink takes <1 hour ✅ Security: Credentials centralized and abstracted ✅ Performance: No regression in streaming ✅ Compatibility: PTZ and web UI still work
Refactoring completed by: Claude (Anthropic) Date: October 1, 2025 Architecture: Strategy Pattern + Repository Pattern + Dependency Injection Result: Clean, modular, testable, maintainable codebase ready for growth 🚀
Original refactoring attempt used monolithic AWSCredentialProvider with inconsistent interface:
get_credentials('eufy', serial)Implemented separate credential provider for each vendor based on their actual auth model:
Files Created:
services/credentials/credential_provider.py - Abstract base interfaceservices/credentials/eufy_credential_provider.py - Per-camera credentials (9 cameras)services/credentials/unifi_credential_provider.py - Console-level credentialsservices/credentials/reolink_credential_provider.py - NVR-level credentialsArchitecture Benefits:
Updated streaming/stream_manager.py to instantiate vendor-specific providers internally:
def __init__(self, camera_repo: CameraRepository):
# Create vendor-specific providers
eufy_cred = EufyCredentialProvider()
unifi_cred = UniFiCredentialProvider()
reolink_cred = ReolinkCredentialProvider()
# Initialize handlers with their specific providers
self.handlers = {
'eufy': EufyStreamHandler(eufy_cred, ...),
'unifi': UniFiStreamHandler(unifi_cred, ...),
'reolink': ReolinkStreamHandler(reolink_cred, ...)
}
Eufy (per-camera):
EUFY_CAMERA_T8416P0023352DA9_USERNAME
EUFY_CAMERA_T8416P0023352DA9_PASSWORD
EUFY_BRIDGE_USERNAME (for PTZ)
EUFY_BRIDGE_PASSWORD (for PTZ)
UniFi (console-level):
PROTECT_USERNAME
PROTECT_SERVER_PASSWORD
Reolink (NVR-level):
REOLINK_USERNAME
REOLINK_PASSWORD
Created final merged app.py combining:
Critical Routes Restored:
/api/streams/<serial>/playlist.m3u8 - HLS playlist serving/api/streams/<serial>/<segment> - HLS segment serving/api/unifi/<id>/stream/mjpeg - UniFi MJPEG streaming/api/status/mjpeg-captures - MJPEG service monitoring/api/status/unifi-monitor - Resource monitor status/api/maintenance/recycle-unifi-sessions - Session managementservices/credentials/aws_credential_provider.py - Replaced by vendor-specific providersdevice_manager.py - Replaced by camera_repository.py + ptz_validator.pystream_manager.py.old - Original monolithic version preservedHere’s a ready-to-paste continuation for DOCS/README_project_history.md, picking up from last “Next Session Priority” and covering this whole block of work.
Summary
Resolved startup and dev-reload instability by asserting streams/ ownership at app init and purging a legacy UniFi stream dir that a sync script kept recreating as root. UniFi G5 Flex now resolves its RTSP alias from env (AWS secrets) when cameras.json uses "PLACEHOLDER". Identified that the watchdog was prematurely killing legitimate streams on slow start; temporarily bypassed while we redesign health checks. Trialed FFmpeg profiles for Eufy (LL-HLS transcode vs. copy+Annex-B); will finalize after isolated probes.
Changes / Decisions
App init & ownership
app.py: stream_manager._remove_recreate_stream_dir() (leftover from pre-refactor).stream_manager._ensure_streams_directory_ownership() immediately after constructing StreamManager (guards Flask debug reloads).elfege:elfege and fail fast if root-owned.UniFi (G5 Flex) alias from env
unifi_stream_handler.build_rtsp_url(), when cameras.json has "rtsp_alias": "PLACEHOLDER", resolve via env (AWS-loaded by nvrdev), e.g. CAMERA_68d49398005cf203e400043f_TOKEN_ALIAS. Logged protect host/port/name/alias and final URL.libx264/aac), 30s timeout (µs), and added low-latency input flags where helpful.Legacy dir & sync script
streams/unifi_g5flex_1 (pre-refactor naming) kept reappearing as root; root cause: sync_wsl.sh created it.sync_wsl.sh for streams/unifi_g5flex_1 and HLS artifacts (*.ts, index.m3u8). Removed the dir; normalized perms (chown -R elfege:elfege streams && chmod -R 755 streams).Watchdog triage
continue / or ENABLE_WATCHDOG=0).RuntimeError: cannot join current thread (watchdog calling stop_stream() then attempting to join() itself). Plan: during restarts, call stop_stream(serial, stop_watchdog=False) and guard join() to never self-join.Eufy FFmpeg profiles
Proposed selectable profile via env:
EUFY_HLS_MODE=transcode: libx264 + forced keyframes every 2s (-sc_threshold 0 -force_key_frames expr:gte(t,n_forced*2)).EUFY_HLS_MODE=copy: -c:v copy -bsf:v h264_mp4toannexb (fastest; often fixes HLS black when copy is used).Tabs vs spaces hiccup
TabError (mixed indentation) in app.py after adding the ownership call. Converted leading tabs to 4 spaces and enforced .editorconfig.Known Issues
Concrete Next Steps
nvrdev AWS secrets load covers all UniFi aliases needed (and any Reolink creds).Eufy probe (outside app, watchdog OFF):
h264_mp4toannexb.
Adopt the one that yields stable, non-black playback; set EUFY_HLS_MODE accordingly.Watchdog redesign:
segment_*.ts.in_progress flag; exponential backoff (5→10→20→…≤60s).stop_watchdog=False); clear stale HLS before respawn.ENABLE_WATCHDOG (default on in prod; off in dev).Command snippets logged / used
sudo rm -rf streams/unifi_g5flex_1 && chown -R "$USER:$USER" streams && chmod -R 755 streamsexport ENABLE_WATCHDOG=0ffmpeg -rtsp_transport tcp -timeout 30000000 -i 'rtsp://192.168.10.3:7447/<alias>' -frames:v 1 -y /tmp/kitchen_probe.jpgCode notes (for traceability)
app.py: after StreamManager(...) → call _ensure_streams_directory_ownership().unifi_stream_handler.build_rtsp_url(): if rtsp_alias == "PLACEHOLDER", read CAMERA_68d49398005cf203e400043f_TOKEN_ALIAS (from AWS-loaded env) and build rtsp://{host}:{port}/{alias}.stop_stream(serial, stop_watchdog=False); guard against self-join; add per-camera restart lock/state.EUFY_HLS_MODE (transcode vs copy+Annex-B).Why this matters The architecture now respects dev reloads (no ownership flaps), uses environment-backed token resolution for UniFi, and avoids watchdog-induced churn while we finalize robust health checks. With Eufy profile selection, we can stabilize HLS across mixed vendors without over-encoding or black-frame traps.
Summary
Stabilized dev reloads and stream startup by asserting streams/ ownership on init and excluding a legacy UniFi dir recreated by a sync script. UniFi (G5 Flex) now derives its RTSP alias from env (AWS secrets) when cameras.json uses "PLACEHOLDER". The watchdog was killing legit streams during slow starts; introduced a short grace window around restarts/cleanups and outlined a single-flight restart path to avoid thrash. Added a resilient HLS cleanup routine; documented container-safe permission practices. Eufy streaming can switch between transcode (low-latency with forced keyframes) and copy+Annex-B via an env toggle, to avoid black frames on certain feeds.
What changed
Init & ownership
StreamManager._ensure_streams_directory_ownership() and also call it from app.py immediately after constructing StreamManager to survive Flask debug reloads.Legacy dir & sync script
streams/unifi_g5flex_1 as a legacy path recreated by sync_wsl.sh (and sometimes as root).
→ Excluded it (and HLS artifacts) in the sync script; removed the directory; normalized perms on streams/.UniFi (G5 Flex) token via env
unifi_stream_handler.build_rtsp_url() resolves "PLACEHOLDER" aliases from env (e.g., CAMERA_68d49398005cf203e400043f_TOKEN_ALIAS loaded by nvrdev from AWS secrets).rtsp://<protect_host>:7447/<alias>.libx264/aac), with low-latency input flags where useful.Watchdog improvements
continue or ENABLE_WATCHDOG=0).in_progress flag) and fixed “cannot join current thread” by calling stop_stream(camera_serial, stop_watchdog=False) during watchdog-initiated restarts (never join the current thread).Safer HLS cleanup
Replaced naive shutil.rmtree with _safe_rmtree:
0755.Note: No “sudo” inside the app; enforce correct UID/GID (dev: host chown; prod: container user:/entrypoint-chown).
Eufy profiles
Added EUFY_HLS_MODE env toggle:
transcode (default): libx264 + -sc_threshold 0 -force_key_frames expr:gte(t,n_forced*2) for reliable LL-HLS.copy: -c:v copy -bsf:v h264_mp4toannexb to avoid black frames on feeds that dislike transcode or require Annex-B.Turn watchdog off during tuning; pick per-site mode based on a quick standalone FFmpeg probe.
Dev ergonomics
TabError by converting tabs→spaces and added .editorconfig..env loading: Flask CLI auto-loads; python app.py should call load_dotenv() at top.Known issues
copy+Annex-B often resolves it.Next Session Priority (updated)
EUFY_HLS_MODE=transcode vs copy) using standalone FFmpeg probes.Command snippets used today
sudo rm -rf streams/unifi_g5flex_1 && chown -R "$USER:$USER" streams && chmod -R 755 streamsexport ENABLE_WATCHDOG=0ffmpeg -rtsp_transport tcp -timeout 30000000 -i 'rtsp://<host>:7447/<alias>' -frames:v 1 -y /tmp/unifi_probe.jpgSummary Implemented a comprehensive settings system with collapsible header and auto-fullscreen functionality. Refactored all JavaScript to modern ES6+ syntax, created modular jQuery-based settings architecture, and added localStorage persistence for user preferences. Fixed stream control button interaction issues and optimized viewport space usage.
What changed
transform: translateY()settings-manager.js: Main controller that orchestrates all settings functionalitysettings-ui.js: Handles UI rendering, DOM manipulation, and user interactionsfullscreen-handler.js: Business logic for fullscreen operations and state managementvar declarations with const/let for proper scoping.stream-controls from pointer-events: none to pointer-events: autoTechnical Architecture
Files Created
static/js/settings/settings-manager.js - Main settings controllerstatic/js/settings/settings-ui.js - UI rendering and event handlingstatic/js/settings/fullscreen-handler.js - Fullscreen business logicstatic/css/settings.css - Settings panel stylingFiles Modified
templates/streams.html - Added jQuery CDN, settings button, modal overlay, collapsible header checkboxstatic/css/streams.css - Fixed stream controls pointer-events, added collapsible header styleslocalStorage Schema
{
"autoFullscreenEnabled": boolean,
"autoFullscreenDelay": number (1-60)
}
Known Limitations
User Experience Improvements
Debug Features
FullscreenHandler exposed to window object for manual testingFuture Extension Points
getAllSettings() method provides centralized settings export capabilitystatic/js/app.js containing 7 mixed-responsibility classes into static/js/archive/app_20251002.jsstatic/js/utils/ - Shared utility modules (Logger, LoadingManager)static/js/controllers/ - Feature-specific controllers (PTZController)static/js/streaming/ - Stream management (existing HLS, MJPEG, MultiStream)static/js/archive/ - Deprecated code preservationFiles moved to archive (8 total):
static/js/app.js → archive/app_20251002.js (deprecated PTZ-centric interface)static/js/bridge.js → archive/bridge_20251002.jsstatic/js/camera.js → archive/camera_20251002.jsstatic/js/status.js → archive/status_20251002.jsstatic/js/loading.js → archive/loading_20251002.jsstatic/js/logger.js → archive/logger_20251002.jsstatic/js/ptz.js → archive/ptz_20251002.jstemplates/index.html → templates/archive/index_20251002.html (old PTZ control interface)Utility Modules:
static/js/utils/logger.js - Activity logging with console integration, DOM manipulation, entry trimmingstatic/js/utils/loading-manager.js - Loading overlay management with message updatesController Modules:
static/js/controllers/ptz-controller.js - PTZ camera movement controls with continuous/discrete movement support, bridge readiness validationStreaming Modules (Refactored to ES6 + jQuery):
static/js/streaming/hls-stream.js - HLS stream management with cache busting, HLS.js integration, timeout handlingstatic/js/streaming/mjpeg-stream.js - MJPEG stream management with jQuery event handling, namespaced events for cleanupstatic/js/streaming/stream.js (MultiStreamManager) - Orchestrates HLS/MJPEG managers, handles fullscreen, PTZ integration, grid layout@app.route('/') to redirect to /streams instead of rendering deprecated PTZ control interfacePTZControlForm WTForms class no longer needed after index.html deprecation/streams now serves as the main application entry point with multi-camera streaming focus.trigger('play') incompatibility with video element’s Promise-based .play() method required for autoplay prevention handling.play() for video elements while using jQuery for all other DOM manipulation$container, $element).on() with event delegation for dynamic stream elements.mjpeg and .fullscreen namespaces for clean event handler cleanupdataset.cameraSerial to jQuery’s .data('camera-serial') throughoutsync_wsl.sh background script (runs every 5 minutes) restored archived files by syncing from other machines without --delete flagremove -exact command to permanently delete archived files from all synchronized machinesremove -exact "/home/elfege/0_NVR/static/js/app.js ... /home/elfege/0_NVR/templates/index.html" (8 files)MultiStreamManager.executePTZ()/streams page with HLS/MJPEG multi-camera viewer/ PTZ control page archived but preserved for referenceFocus areas:
Work completed:
Stream Management:
start_new_session=True to ffmpeg subprocess calls to isolate process groups (PID = PGID). This allows safe cleanup with os.killpg.pkill checks.cleanup_stream_files) to avoid breaking HLS rolling buffer logic and hls.js mapping.Load Average Assessment:
UI Health Monitoring:
Tuned health monitor to be less aggressive:
sampleIntervalMs = 6000staleAfterMs = 20000consecutiveBlankNeeded = 10cooldownMs = 60000Exposed these settings as .env variables for easier tuning.
Eufy Bridge Integration:
eufy-security-server via eufy_bridge.sh.Modified script to:
config/eufy_bridge.json with AWS-fetched credentials.Please send required verification code).read -rp prompt for user to manually enter 2FA code from email, automatically POSTing to /api/verify_code.Remaining Issues:
Next steps:
Initial Investigation:
diagnostics/ffmpeg_process_monitor.py to track process lifecycle and accumulation patternsps aux output hiding RTSP URLsps aux | grep ffmpeg revealed actual scope: 40+ processes with varying ages (2min to 42min old)Process Analysis Revealed:
# High CPU UniFi processes (transcoding):
elfege 219095 65.8% ... 27:33 ffmpeg -rtsp_transport tcp -timeout 30000000 -fflags nobuffer
elfege 228849 66.6% ... 26:14 ffmpeg -rtsp_transport tcp -timeout 30000000 -fflags nobuffer
... (10+ instances for 1 camera)
# Normal CPU Eufy processes (copy mode):
elfege 219097 4.7% ... 1:59 ffmpeg -rtsp_transport tcp -timeout 30000000 -analyzeduration
... (30+ instances for 9 cameras)
The Watchdog Restart Storm:
_restart_stream() calls stop_stream(stop_watchdog=False)Process termination logic fails silently:
try:
os.killpg(os.getpgid(process.pid), SIGTERM)
process.wait(timeout=5)
except ProcessLookupError:
pass # ← SILENT FAILURE!
Exception in _watchdog_loop silently caught:
try:
self._restart_stream(camera_serial)
backoff = min(backoff * 2, 60)
except Exception: # ← Swallows all errors!
backoff = min(backoff * 2, 60)
Why No Logs Appeared:
logger.warning(f"[WATCHDOG] restarting {camera_serial}") line exists in code_watchdog_loop prevented error visibilityActive Streams Dictionary Corruption:
# Printed output showing impossible state:
68d49398005cf203e400043f # Camera appears
68d49398005cf203e400043f # DUPLICATE KEY (impossible in Python dict!)
T8416P0023352DA9
Root Cause: Concurrent modification during iteration
self.active_streams1. Process Termination Hardening (stream_manager.py):
# Terminate FFmpeg process
process = stream_info['process']
if process and process.poll() is None:
try:
os.killpg(os.getpgid(process.pid), signal.SIGTERM)
process.wait(timeout=10) # Increased from 5s
except subprocess.TimeoutExpired:
os.killpg(os.getpgid(process.pid), signal.SIGKILL)
process.wait(timeout=2) # Give SIGKILL time to work
except ProcessLookupError:
pass
# Verify process actually dead before removing from tracking
if process and process.poll() is None:
# Process still alive despite kill attempts
logger.error(f"Failed to kill FFmpeg for {camera_name} (PID: {process.pid})")
return False # DON'T remove from dictionary
else:
# Process confirmed dead
self.active_streams.pop(camera_serial, None)
logger.info(f"Stopped stream for {camera_name}")
return True
2. Thread-Safe Dictionary Iteration:
# Snapshot keys before iterating to avoid modification-during-iteration
active_keys = list(self.active_streams.keys())
for stream in active_keys:
print(stream)
3. Improved FFmpeg Cleanup Utility (cleanup_handler.py):
def kill_ffmpeg():
for attempt in range(50):
try:
# Use pgrep -f (not pkill -0) for full command line matching
if subprocess.run(["pgrep", "-f", "ffmpeg.*-rtsp"]).returncode == 0:
subprocess.run(
["pkill", "-f", "ffmpeg.*-rtsp"], # With -f flag for full match
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL
)
time.sleep(0.5)
else:
print("✅ No ffmpeg processes left")
break
except:
print(traceback.print_exc())
raise Exception(f"❌ ffmpeg Cleanup error")
Key Learning: pkill -0 only matches process names (15 char limit), not full command lines. Must use pgrep -f for pattern matching against full command.
Next Session Priorities:
start_new_session=Truepkill if os.killpg() failsENABLE_WATCHDOG=1 # Currently enabled
EUFY_HLS_MODE=copy # Low CPU mode
# FLASK_DEBUG not set (production mode)
pkill vs pgrep semantics differ - understand tool limitationsstreaming/stream_manager.py - Process termination logic hardenedlow_level_handlers/cleanup_handler.py - Fixed kill_ffmpeg() to use pgrep -fdiagnostics/ffmpeg_process_monitor.py - Created (process lifecycle tracking tool)Session completed: October 4, 2025 13:30 Status: Root cause identified, partial fixes implemented, testing in progress Next Session: Monitor process accumulation with fixes, implement remaining hardening
Design Philosophy: Resolution should adapt to display context - grid view needs lower resolution than fullscreen
stream_type='sub'): Low resolution/framerate for thumbnail displaystream_type='main'): Full resolution for detailed viewing1. Flask Route Modification (app.py line ~220)
# Extract stream type from request (defaults to 'sub' for grid view)
data = request.get_json() or {}
stream_type = data.get('type', 'sub') # 'main' or 'sub'
# Start the stream with specified type
stream_url = stream_manager.start_stream(camera_serial, stream_type=stream_type)
2. Stream Manager Enhancement (stream_manager.py)
start_stream() method signature: def start_stream(self, camera_serial: str, stream_type: str = 'sub')_start_ffmpeg() to accept and pass stream_type parameterhandler.get_ffmpeg_output_params(stream_type=stream_type)3. Stream Handler Updates
Eufy Camera Handler (eufy_stream_handler.py):
def get_ffmpeg_output_params(self, stream_type: str = 'sub') -> List[str]:
"""
IMPORTANT: Eufy cameras via RTSP output 1920x1080 (NOT 2.5K from app)
- Copy mode: 11fps @ full resolution (cannot scale)
- Transcode sub: 6fps @ 640x360 (grid view for old iPads)
- Transcode main: 30fps @ native 1920x1080 (fullscreen)
"""
Resolution Choices Rationale:
stream_type)UniFi Camera Handler (unifi_stream_handler.py):
Before (all cameras at 1920x1080@30fps transcode):
After (grid at 640x360@6fps):
RTSP Resolution Limitation:
FFmpeg Copy Mode Constraints:
-c:v copy) cannot apply resolution scaling or framerate changes-vf scale=WIDTHxHEIGHT-r in copy mode only drops frames, doesn’t re-encodeCurrent State: Backend fully implemented and ready
Pending: Frontend hls-stream.js modification to send stream_type parameter
Default Behavior: All streams currently request type: 'sub' (low resolution)
Next Step: Implement fullscreen detection to request type: 'main'
Problem: 6-7 second latency vs 1-2 seconds with UniFi Protect direct streaming
Root Cause Analysis:
Implemented Fix:
# Changed from:
'-hls_time', '2', '-hls_list_size', '10'
# Changed to:
'-hls_time', '1', '-hls_list_size', '3'
Results:
Further Optimization Options Identified (Not Implemented):
maxBufferLength: 2, liveSyncDurationCount: 1app.py - Added stream_type parameter extraction from requeststreaming/stream_manager.py - Enhanced to support stream_type routingstreaming/handlers/eufy_stream_handler.py - Implemented multi-resolution transcodingstreaming/handlers/unifi_stream_handler.py - Implemented multi-resolution transcodingstreaming/stream_handler.py - Updated abstract method signature (not shown in diff)type: 'main'Race Condition in Active Streams Logging:
is_stream_healthy() simultaneously at 10-second intervalsself.active_streams dictionary accessed by multiple threads without synchronizationDictionary Corruption Symptoms:
Missing Master Lock for Shared State:
self._restart_locks) existed but only prevented duplicate restart operationsself.active_streams dictionary itselfstart_stream() - checking/writing active streamsstop_stream() - reading/removing entriesis_stream_healthy() - reading stream metadata_watchdog_loop() - checking stream existence_restart_stream() - writing new stream entriesget_stream_url(), is_stream_alive() - reading stream dataWatchdog Deadlock Discovery:
def _watchdog_loop(self, camera_serial: str, stop_event: threading.Event) -> None:
while not stop_event.is_set():
with self._streams_lock: # ← HOLDING LOCK DURING SLEEP!
time.sleep(max(5, min(backoff, 60))) # ← BLOCKS EVERYTHING FOR 5-60 SECONDS
# ... health checks ...
Impact:
self._streams_lock for 5-60 seconds during sleep1. Master Lock for Shared Dictionary (__init__):
# CRITICAL: Master lock for thread-safe access to shared state
self._streams_lock = threading.RLock() # RLock allows re-entrance from same thread
2. Protected Dictionary Access Methods:
start_stream() - Wrapped dict writes in lockstop_stream() - Protected read/remove operationsget_stream_url() - Added lock for dict accessis_stream_alive() - Added lock for dict accessget_active_streams() - Already had lock (preserved)stop_all_streams() - Already had lock (preserved)_wait_for_playlist() - Added lock for dict access3. Rate-Limiting Lock for Logging:
self.last_log_active_streams = time.time()
self._log_lock = threading.Lock() # Separate lock for log throttling
def printout_active_streams(self, caller="Unknown"):
with self._log_lock:
if time.time() - self.last_log_active_streams >= 10:
self.last_log_active_streams = time.time()
# ... print logic ...
4. Critical Watchdog Fix - Sleep Outside Lock:
def _watchdog_loop(self, camera_serial: str, stop_event: threading.Event) -> None:
while not stop_event.is_set():
# SLEEP FIRST, OUTSIDE THE LOCK
time.sleep(max(5, min(backoff, 60)))
# Then acquire lock only for quick checks
with self._streams_lock:
if stop_event.is_set() or camera_serial not in self.active_streams:
break
# ... rest of health checking logic ...
5. Watchdog Cleanup Logic Correction:
def stop_stream(self, camera_serial: str, stop_watchdog: bool = True) -> bool:
# Stop watchdog flag BEFORE lock
if stop_watchdog and camera_serial in self.stop_flags:
self.stop_flags[camera_serial].set()
with self._streams_lock:
# ... process termination ...
self.active_streams.pop(camera_serial, None)
# Watchdog thread join OUTSIDE lock (after restart case check)
if stop_watchdog and camera_serial in self.watchdogs:
t = self.watchdogs.get(camera_serial)
if t and t.is_alive() and threading.current_thread() is not t:
t.join(timeout=3)
self.watchdogs.pop(camera_serial, None)
self.stop_flags.pop(camera_serial, None)
Critical Rules:
list(self.active_streams.keys())streaming/stream_manager.py - Added master lock, fixed watchdog sleep, protected all dict accessstop_stream()stop_watchdog=False semantics (used during restarts from within watchdog thread)caller parameter to is_stream_healthy() for better debuggingTime: October 4, 2025 - Afternoon (Multi-Resolution) + Evening (Thread Safety) Status: Both critical improvements implemented and stable Achievements:
Next Session:
Multiple Concurrent Problems:
bufferAppendError in HLS.js - Browser rejecting video segments (MediaSource Extensions incompatibility)active_streamsBackend Watchdog: DISABLED ✓ (confirmed via [WATCHDOG] DISABLED in logs)
Frontend Health Monitor: ACTIVE (the actual culprit)
sampleIntervalMs: 2000)staleAfterMs: 20000)warmupMs: 60000)onUnhealthy callback when:
The Cascade Pattern:
playlist.m3u8 before FFmpeg creates it → 404bufferAppendError/api/stream/stop (returns 400 if stream already stopped)/api/stream/start againbufferAppendError1. Added _kill_all_ffmpeg_for_camera() method to StreamManager:
def _kill_all_ffmpeg_for_camera(self, camera_serial: str) -> bool:
"""Kill all FFmpeg processes for a camera using pkill with full path matching"""
try:
check = subprocess.run(['pgrep', '-f', f'streams/{camera_serial}'], ...)
if check.returncode != 0:
return True # No processes found
subprocess.run(['pkill', '-9', '-f', f'streams/{camera_serial}'], ...)
time.sleep(0.5)
verify = subprocess.run(['pgrep', '-f', f'streams/{camera_serial}'], ...)
return verify.returncode != 0 # True if all killed
except Exception as e:
logger.error(f"Error killing FFmpeg: {e}")
return False
2. Simplified stop_stream() to use new kill method:
def stop_stream(self, camera_serial: str, stop_watchdog: bool = True) -> bool:
with self._streams_lock:
if camera_serial not in self.active_streams:
return False
# Kill ALL FFmpeg for this camera (handles orphans)
if not self._kill_all_ffmpeg_for_camera(camera_serial):
logger.error(f"Failed to kill FFmpeg for {camera_name}")
return False
# Remove from tracking (no segment cleanup per October 3 decision)
self.active_streams.pop(camera_serial, None)
logger.info(f"Stopped stream for {camera_name}")
# Join watchdog outside lock
if stop_watchdog and camera_serial in self.watchdogs:
# ... existing watchdog cleanup logic
return True
3. Added _clear_camera_segments() utility method (not called automatically):
self.hls_dir / camera_serial pathSymptoms visible in logs:
Failed to delete segment_044.ts: [Errno 2] No such file or directory - Race condition evidencebufferAppendErrorHigh Priority:
/api/stream/start calls for same camera-profile:v baseline -level 3.1 -pix_fmt yuv420pMedium Priority:
warmupMs: 60000 setting, health checks appear to trigger immediately{success: false} instead of 400 when stream not in active_streamspkill -f with full path (streams/{serial}) correctly matches FFmpeg processesstreaming/stream_manager.py - Added _kill_all_ffmpeg_for_camera(), simplified stop_stream(), added _clear_camera_segments()bufferAppendError in console-profile:v baseline -level 3.1 -pix_fmt yuv420p)ffprobe on segments to confirm codec profile issuesSession Status: Problems diagnosed but not fully resolved - bufferAppendError still occurring despite process cleanup improvements
New Error Pattern Identified:
error: Error: media sequence mismatch 9
details: 'levelParsingError'
This is different from bufferAppendError - HLS.js is rejecting playlists because the segment sequence numbers don’t match what it cached from previous FFmpeg instances.
Your observation is correct - deleting segments during stop_stream() breaks HLS.js internal state:
media sequence mismatch → rejects the segmentThe segment deletion race happens when:
_clear_camera_segments() runs WHILE frontend still has cached playlist from old FFmpegFrontend needs to destroy and recreate HLS.js instance when restarting streams:
In hls-stream.js, the forceRefreshStream() method already exists but isn’t being called by the health monitor:
forceRefreshStream(cameraId, videoElement) {
// Destroy existing HLS instance
const existingHls = this.hlsInstances.get(cameraId);
if (existingHls) {
existingHls.destroy(); // ← This clears internal cache
this.hlsInstances.delete(cameraId);
}
const stream = this.activeStreams.get(cameraId);
if (stream) {
stream.element.src = '';
stream.element.load();
this.activeStreams.delete(cameraId);
}
setTimeout(() => {
this.startStream(cameraId, videoElement);
}, 500);
}
But restartStream() in stream.js doesn’t call this - it just calls stop then start, leaving HLS.js with stale cache.
High Priority:
restartStream() to call forceRefreshStream() instead of stop+start_clear_camera_segments() calls - let FFmpeg handle cleanup via -hls_flags delete_segmentsDiagnostic Needed:
.m3u8?t=1759629892588 appearing multiple times) indicates frontend making duplicate concurrent requestsAdded to end of existing October 4 entry:
Frontend HLS.js Cache Issue Discovery:
media sequence mismatch - HLS.js rejecting segments due to stale cachehls.destroy() before restarting streams to flush cacheforceRefreshStream() method exists but not used by health monitor’s restartStream()-hls_flags delete_segments onlyStatus: Root cause identified, fix requires frontend changes to health monitor restart logic
Problem Identified: .ts segment 404 errors causing stream failures
delete_segments flag creating race conditionSolution Implemented: Buffer-based deletion instead of aggressive cleanup
# Changed from:
-hls_flags delete_segments+split_by_time
# To:
-hls_flags append_list
-hls_delete_threshold 1 # Keep 1 extra segment as safety buffer
Results:
Discovery: Different camera types need different segment lengths for optimal performance
Eufy Cameras (optimized for 1-second segments):
EUFY_HLS_SEGMENT_LENGTH=1
EUFY_HLS_LIST_SIZE=1
EUFY_HLS_DELETE_THRESHOLD=1
Result: ~2-3 second latency
UniFi Protect Cameras (need 2-second segments):
UNIFI_HLS_SEGMENT_LENGTH=2
UNIFI_HLS_LIST_SIZE=1
UNIFI_HLS_DELETE_THRESHOLD=1
Result: ~3-4 second latency
Why the difference: UniFi streams are pre-optimized H.264 from camera hardware; Eufy cameras stream less-optimized RTSP that benefits from faster segment generation.
Problem: Health monitor stuck in perpetual warmup, never monitoring streams
[Health] T8416P0023390DE9: In warmup period (20000ms), skipping health checksRoot Cause in health.js:
// WRONG: Returns empty detach function, never starts timer
if (performance.now() < t.warmupUntil) {
return () => { }; // ← BUG: No monitoring ever happens
}
startTimer(serial, fn); // Never reached during warmup
Fix Applied: Move warmup check inside timer callback
// CORRECT: Timer always runs, but skips checks during warmup
startTimer(serial, () => {
// Check warmup INSIDE timer callback
if (performance.now() < t.warmupUntil) {
console.log(`[Health] ${serial}: In warmup period, skipping checks`);
return; // Skip this check but timer keeps running
}
// ... actual health checks (stale detection, blank frame detection)
});
Applied to both:
attachHls() - HLS video stream monitoringattachMjpeg() - MJPEG image stream monitoringResults:
Discovered: 17 defunct FFmpeg processes from previous sessions
[ffmpeg] <defunct> # Zombie processes consuming CPU
Cleanup:
pkill -9 ffmpeg # Killed all zombies
Prevention: Health monitor now properly restarts streams without creating zombies
Server Load (56-core Dell PowerEdge R730xd):
Chrome Browser:
Stream Quality:
Environment Variables:
# Eufy Settings
EUFY_HLS_SEGMENT_LENGTH=1
EUFY_HLS_LIST_SIZE=1
EUFY_HLS_DELETE_THRESHOLD=1
# UniFi Settings
UNIFI_HLS_SEGMENT_LENGTH=2
UNIFI_HLS_LIST_SIZE=1
UNIFI_HLS_DELETE_THRESHOLD=1
# Health Monitor
UI_HEALTH_WARMUP_MS=10000 # 10 seconds
UI_HEALTH_ENABLED=1
ENABLE_WATCHDOG=0
streaming/stream_manager.py - Updated FFmpeg HLS flags for both Eufy and UniFi handlersstatic/js/streaming/health.js - Fixed warmup timer logic in attachHls() and attachMjpeg().env - Camera-specific segment lengths and health monitor settingsLesson learned: Foundation stability takes precedence over feature additions. The debugging work was necessary - unstable streams would have made all new features unusable.
Problem 1: Browser requesting playlists before FFmpeg creates them
playlist.m3u8 during startup/restartSolution: Added retry logic with exponential backoff
// In hls-stream.js error handler
if (data.details === 'manifestLoadError' && data.response?.code === 404) {
const retries = this.retryAttempts.get(cameraId) || 0;
if (retries < 20) { // High count for dev environment
console.log(`[HLS] Playlist 404 for ${cameraId}, retry ${retries + 1}/20`);
this.retryAttempts.set(cameraId, retries + 1);
setTimeout(() => {
hls.loadSource(playlistUrl);
}, 6000);
return;
}
}
Problem 2: Stream status stuck at ‘failed’ after manual restart
forceRefreshStream() not awaiting completion of startStream()setTimeout() not being awaited, causing premature completionSolution: Made forceRefreshStream() properly async
async forceRefreshStream(cameraId, videoElement) {
// Destroy existing HLS instance
const existingHls = this.hlsInstances.get(cameraId);
if (existingHls) {
existingHls.destroy();
this.hlsInstances.delete(cameraId);
}
// Clear active stream
const stream = this.activeStreams.get(cameraId);
if (stream) {
stream.element.src = '';
stream.element.load();
this.activeStreams.delete(cameraId);
}
// Wait brief delay, then restart
await new Promise(resolve => setTimeout(resolve, 500));
return await this.startStream(cameraId, videoElement);
}
And updated restartStream() to set status after completion:
if (streamType === 'll_hls') {
await this.hlsManager.forceRefreshStream(serial, videoElement);
this.setStreamStatus($streamItem, 'live', 'Live');
}
Results:
static/js/streaming/hls-stream.js - Added retry logic, made forceRefreshStream asyncstatic/js/streaming/stream.js - Added status update after restart completionSession Status: All major issues resolved - streams stable, latency optimized, health monitor working
Issues encountered during this session prevented implementation of planned features. The following items remain on the backlog:
Goal: Auto-stop all streams when backend becomes unavailable
/ or /api/health)Goal: Non-dismissible modal overlay when server unreachable
Goal: Individual stream configuration via right-click context menu
cameras.json or separate configPriority: HIGH - needed to replace Blue Iris on iPads
streaming/handlers/reolink_stream_handler.pycameras.jsonGoal: Replace web interface with native Apple app
Intended work: Reolink integration, UI improvements, per-camera settings Actual work: Debugging segment 404s, fixing health monitor warmup, optimizing latency
Add this section to README:
Converted Settings Modules from IIFE to ES6 + jQuery Pattern
Refactored all three settings modules to match project standards established in ptz-controller.js:
Files Converted:
static/js/settings/fullscreen-handler.js - ES6 class with singleton exportstatic/js/settings/settings-ui.js - ES6 class with singleton exportstatic/js/settings/settings-manager.js - ES6 class with singleton exportKey Changes:
export class with singleton instancesimport { fullscreenHandler } from './fullscreen-handler.js')$(document).ready() initialization (no vanilla addEventListener)window.FullscreenHandler exposure for debuggingthis.initialized flagHTML Module Loading:
Updated streams.html to load settings scripts as ES6 modules:
<script type="module" src="...fullscreen-handler.js"></script>
<script type="module" src="...settings-ui.js"></script>
<script type="module" src="...settings-manager.js"></script>
Bug Fix - Settings Button Click Handler:
Issue: Settings button unresponsive after ES6 conversion
Root cause: Module async loading + missing e.preventDefault() on button clicks
Resolution: Added event preventDefault and improved initialization order
Fullscreen Toggle Icon Button: Added minimalist fullscreen icon in header next to settings gear:
<i id="fullscreen-toggle-btn" class="fas fa-expand header-icon-btn" title="Toggle Fullscreen"></i>
CSS Styling (streams.css):
.header-icon-btn {
font-size: 20px;
color: #ffffff;
opacity: 0.7;
cursor: pointer;
transition: opacity 0.2s, transform 0.2s;
}
fullscreen-handler.js via setupHeaderButton() methodProfessional Button Style:
Created .btn-beserious class for serious, non-cartoonish UI elements:
.btn-beserious {
background: #2d3748; /* Dark slate gray */
border: 1px solid #4a5568;
box-shadow: 0 1px 3px rgba(0, 0, 0, 0.3);
}
New Setting: Grid Style Toggle Added user-configurable grid layout modes with localStorage persistence:
Modes:
Implementation:
fullscreen-handler.js additions:
this.settings = {
autoFullscreenEnabled: false,
autoFullscreenDelay: 3,
gridStyle: 'spaced' // NEW
};
setGridStyle(style) { ... }
applyGridStyle() { ... }
settings-ui.js - HTML dropdown control:
<select id="grid-style-select" class="setting-select">
<option value="spaced">Spaced & Rounded</option>
<option value="attached">Attached (NVR Style)</option>
</select>
streams.css - Attached mode styling:
.streams-container.grid-attached {
gap: 0;
}
.streams-container.grid-attached .stream-item {
border-radius: 0;
box-shadow: none;
border: 1px solid #1a1a1a;
}
Per-Stream Fullscreen Button: Replaced unreliable click zones with dedicated fullscreen buttons on each stream.
Problem: Touch events on .stream-video and .stream-overlay failed on iOS/Android
Solution: Visible button overlay with proper touch target sizing
streams.html template addition:
<button class="stream-fullscreen-btn"
aria-label="Enter fullscreen"
title="Fullscreen">
<i class="fas fa-expand"></i>
</button>
streams.css implementation:
.stream-fullscreen-btn {
position: absolute;
top: 0.5rem;
right: 0.5rem;
width: 44px; /* iOS minimum touch target */
height: 44px;
opacity: 0; /* Hidden on desktop hover */
}
@media (hover: none) {
.stream-fullscreen-btn {
opacity: 0.7; /* Always visible on touch devices */
}
}
Behavior:
iPad Mini Grid Layout Fixes:
Issue: Vertical stacking in landscape mode (1024px width) Resolution: Added specific iPad landscape media query:
@media (min-width: 769px) and (max-width: 1024px) and (orientation: landscape) {
.grid-3, .grid-4, .grid-5 {
grid-template-columns: repeat(3, 1fr) !important;
}
}
Portrait Mode Grid Optimization:
Previous behavior: Forced single column below 600px New behavior: 2-column grid maintained on all phones in portrait
@media (max-width: 600px) {
.grid-2, .grid-3, .grid-4, .grid-5 {
grid-template-columns: repeat(2, 1fr) !important;
}
gap: 0.25rem; /* Reduced for space efficiency */
}
Benefits:
Meta Tags Added to streams.html:
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
<meta name="apple-mobile-web-app-title" content="Camera Streams">
Behavior When Launched from iOS Home Screen:
Limitations Noted:
Button Styles Source Identified:
All .btn-* classes (.btn-success, .btn-danger, .btn-primary, etc.) are custom CSS in streams.css, not Bootstrap or Axios.
Bootstrap naming convention adopted but implemented as lightweight custom styles:
.btn { padding: 0.5rem 1rem; border: none; ... }
.btn-success { background: #28a745; }
.btn-danger { background: #dc3545; }
Benefits over Bootstrap:
document.addEventListener)JavaScript:
static/js/settings/fullscreen-handler.js - Full ES6 rewrite, grid style featurestatic/js/settings/settings-ui.js - Full ES6 rewrite, grid style UIstatic/js/settings/settings-manager.js - Full ES6 rewritestatic/js/streaming/stream.js - Event handler change for fullscreen buttonCSS:
static/css/streams.css - Header icon buttons, grid-attached mode, mobile media queries, fullscreen button overlaystatic/css/settings.css - Select dropdown stylingHTML:
templates/streams.html - ES6 module loading, iOS meta tags, fullscreen button per stream, header icon{
"autoFullscreenEnabled": boolean,
"autoFullscreenDelay": number (1-60),
"gridStyle": string ("spaced" | "attached")
}
Understood. Everything’s reverted and working on modern browsers again.
For the README update, here’s what to document from today’s session:
Issue: iPad Mini landscape (1024px × 768px) displayed streams stacked vertically instead of 3-column grid.
Root Cause: Media query boundary conditions and viewport quirks on older iOS Safari.
Solution: Broadened media query range to catch edge cases:
/* iPad Mini and similar tablets (portrait or landscape) */
@media screen and (min-width: 700px) and (max-width: 1100px) {
.streams-container {
display: grid !important;
gap: 0.5rem;
grid-template-columns: repeat(3, 1fr) !important;
grid-auto-rows: minmax(0, 1fr) !important;
}
.stream-item {
min-height: 0;
height: 100%;
}
}
Result: 3-column grid now renders correctly on iPad Mini in both orientations.
Attempted: Legacy JavaScript support for iPad Mini running iOS 12.5.7 (final supported iOS version for this hardware).
Challenges Encountered:
Outcome: iOS 12.5.7 support deemed not worth the maintenance burden. Modern browsers (iOS 13+, Chrome, Firefox, Edge, Safari 13+) work perfectly with current ES6 + jQuery architecture.
Issue: Fullscreen button unclickable on mobile for cameras with PTZ controls Cause: PTZ controls layer (z-index: 20) blocking fullscreen button (z-index: 15) Fix: Increased fullscreen button z-index to 25, ensuring it renders above all control layers
Design Decision: Implement hidden boolean attribute at camera configuration level rather than filtering logic scattered across codebase
cameras.json configuration fileCameraRepository class1. CameraRepository Filtering Layer (services/camera_repository.py):
def _filter_hidden(self, cameras: Dict[str, Dict], include_hidden: bool = False) -> Dict[str, Dict]:
"""
Filter out hidden cameras unless explicitly requested
Default behavior: exclude hidden cameras from all operations
"""
if include_hidden:
return cameras
return {
serial: config
for serial, config in cameras.items()
if not config.get('hidden', False)
}
2. app.py Filtering Layer (services/camera_repository.py):
app.py:
@app.route('/api/stream/start/<camera_serial>', methods=['POST'])
@csrf.exempt
def api_stream_start(camera_serial):
"""Start HLS stream for camera"""
try:
# Get camera (includes hidden cameras)
camera = camera_repo.get_camera(camera_serial)
Early rejection
if not camera or camera.get('hidden', False):
logger.warning(f"API access denied: Camera {camera_serial} not found or hidden")
return jsonify({
'success': False,
'error': 'Camera not found or not accessible'
}), 404
3. Streaming manager filtering layer (streaming/stream_manager.py):
def start_stream(self, camera_serial: str, stream_type: str = 'sub') -> Optional[str]:
with self._streams_lock:
if camera_serial in self.active_streams and self.is_stream_alive(camera_serial):
print("═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-")
print(f"Stream already active for {camera_serial}")
print("═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-═-")
return self.get_stream_url(camera_serial)
# Get camera configuration
camera = self.camera_repo.get_camera(camera_serial)
if not camera:
logger.error(f"Camera {camera_serial} not found")
return None
camera_name = camera.get('name', camera_serial)
camera_type = camera.get('type', '').lower()
try:
hidden_camera = camera.get('hidden', False)
if hidden_camera:
print(f"{camera_name} is hidden. Skipping.")
return None
except Exception as e:
print(traceback.print_exc())
print(e)
# Check streaming capability
etc.
Here’s the README_project_history.md update for October 5-6:
Successfully integrated 7 Reolink cameras using native dual-stream capability (main/sub channels). Implemented URL encoding for special characters in passwords, added configurable transcode/copy modes, and resolved architecture inconsistencies around credential providers and stream type parameters.
Total: 7 cameras added to system (4 PTZ, 3 fixed)
| Camera | IP | MAC | PTZ | Status |
|---|---|---|---|---|
| MEBO_CAMERA | 192.168.10.121 | 68:39:43:BD:A5:6F | Yes | ✅ Streaming |
| CAT_FEEDER_CAM_2 | 192.168.10.122 | E0:E2:E6:0C:50:F0 | Yes | ✅ Streaming |
| CAT_FEEDERS_CAM_1 | 192.168.10.123 | 44:EF:BF:27:0D:30 | Yes | ✅ Streaming |
| Living_Reolink | 192.168.10.186 | EC:71:DB:AD:0D:70 | Yes | ✅ Streaming |
| REOLINK_formerly_CAM_STAIRS | 192.168.10.187 | b0:41:1d:5c:e8:7a | No | ✅ Streaming |
| CAM_OFFICE | 192.168.10.88 | ec:71:db:3e:93:f5 | No | ✅ Streaming |
| CAM_TERRACE | 192.168.10.89 | ec:71:db:c3:1a:14 | No | ✅ Streaming |
Total system cameras: 17 (1 UniFi + 9 Eufy + 7 Reolink)
Option A vs Option B Analysis:
rtsp://...@IP:554/h264Preview_01_main (1920x1080 @ 30fps)rtsp://...@IP:554/h264Preview_01_sub (640x480 @ 15fps)Selected: Option A with optional transcode mode (best of both worlds)
NOTE:: Transcode mode could be beneficial as it allows to reduce resolution. Some clients (ipads etc.) can benefit from this in grid mode. 17 cameras in the grid @ 640 resolution is too taxing. Best to be able to lower the grid per-stream/window resolution in this case. This can’t be done while using option A.
1. config/reolink.json:
{
"rtsp": {
"port": 554,
"stream_path_main": "/h264Preview_01_main",
"stream_path_sub": "/h264Preview_01_sub"
},
"hls": {
"segment_length": 2,
"list_size": 1,
"delete_threshold": 1
}
}
2. config/cameras.json additions:
All 7 Reolink cameras added with:
"type": "reolink""host": "192.168.10.XXX" (per-camera IP)"capabilities": ["streaming"] or ["streaming", "ptz"]"hidden": false"channel field needed (direct camera access, not NVR)3. Environment variables:
REOLINK_USERNAME=admin
REOLINK_PASSWORD=TarTo56))#FatouiiDRtu
REOLINK_HLS_MODE=copy # or 'transcode'
RESOLUTION_MAIN=1280x720 # optional, transcode mode only
RESOLUTION_SUB=320x180 # optional, transcode mode only
1. streaming/handlers/reolink_stream_handler.py:
Key Features:
StreamHandler base class (inherits self.credential_provider and self.vendor_config)build_rtsp_url() accepts stream_type parameter to choose main vs sub path), #, etc.)Critical Bug Fixed:
# WRONG - handler had custom __init__() that broke inheritance:
def __init__(self):
username = os.getenv('REOLINK_USERNAME')
# This prevented parent class from setting self.credential_provider!
# CORRECT - removed custom __init__, parent handles it:
class ReolinkStreamHandler(StreamHandler):
# No __init__ needed, inherits from parent
URL Encoding Fix:
from urllib.parse import quote
# Build RTSP URL with encoded password
rtsp_url = f"rtsp://{username}:{quote(password, safe='')}@{host}:{port}{stream_path}"
This converts special characters:
) → %29# → %23Preventing FFmpeg from misinterpreting password as URL delimiters.
2. Stream Type Parameter Propagation:
Updated all handlers to accept stream_type parameter in build_rtsp_url():
eufy_stream_handler.py: Added parameter (ignored, single RTSP URL)unifi_stream_handler.py: Added parameter (ignored, single RTSP URL)reolink_stream_handler.py: Uses parameter to choose main/sub URL pathUpdated stream_manager.py:
# Now passes stream_type to all handlers
rtsp_url = handler.build_rtsp_url(camera, stream_type=stream_type)
3. Credential Provider Architecture Clarification:
Each handler receives its OWN credential provider instance:
# In StreamManager.__init__():
eufy_cred = EufyCredentialProvider()
unifi_cred = UniFiCredentialProvider()
reolink_cred = ReolinkCredentialProvider() # ← Separate instance
self.handlers = {
'eufy': EufyStreamHandler(eufy_cred, ...), # Gets Eufy provider
'unifi': UniFiStreamHandler(unifi_cred, ...), # Gets UniFi provider
'reolink': ReolinkStreamHandler(reolink_cred, ...) # Gets Reolink provider
}
ReolinkCredentialProvider.get_credentials():
REOLINK_USERNAME and REOLINK_PASSWORD from environment(username, password) tupleCopy Mode (default - REOLINK_HLS_MODE=copy):
-c:v copy # No re-encoding, ~5% CPU per stream
stream_type chooses URL path (main or sub stream)Transcode Mode (REOLINK_HLS_MODE=transcode):
-c:v libx264 -vf scale=320x180 # Re-encodes, ~15% CPU per stream
RESOLUTION_SUB / RESOLUTION_MAINCRITICAL: Cannot mix -c:v copy with -vf scale=...
1. Parent Class Initialization:
__init__() from parent automatically__init__() without calling super().__init__() breaks inheritanceStreamHandler.__init__() sets self.credential_provider - don’t override!2. URL Encoding in RTSP:
urllib.parse.quote(password, safe='') to encode# as URL fragment delimiter3. Method Signature Compatibility:
build_rtsp_url(self, camera_config: Dict)build_rtsp_url(self, camera_config: Dict, stream_type: str = 'sub')4. Dependency Injection Flow:
StreamManager creates providers → passes to handlers →
handlers store in self.credential_provider →
build_rtsp_url() calls self.credential_provider.get_credentials()
With 17 cameras (all streaming in grid view):
Before (only Eufy + UniFi):
After (adding 7 Reolink in copy mode):
Transcode mode for all would be:
CPU savings from copy mode: ~70% reduction vs transcode
New:
Modified:
streaming/handlers/reolink_stream_handler.py:
__init__() (fixed inheritance)REOLINK_HLS_MODE toggle (copy/transcode)stream_type parameter to build_rtsp_url()streaming/handlers/eufy_stream_handler.py:
stream_type parameter to build_rtsp_url() signature (ignored)streaming/handlers/unifi_stream_handler.py:
stream_type parameter to build_rtsp_url() signature (ignored)streaming/stream_manager.py:
build_rtsp_url() call to pass stream_type parameterconfig/cameras.json:
total_devices from 10 to 17RESOLUTION_SUB=320x180) added to reduce bandwidth on old iPadscameras.jsonSession completed: October 6, 2025 ~2:30 AM Status: Reolink integration complete, copy mode working, transcode mode available as fallback
Adding to the end of the file:
Diagnosed and resolved Reolink camera streaming issues through systematic hardware troubleshooting. Root cause identified as network switch packet corruption rather than camera/software issues. Implemented per-camera HLS configuration override system in cameras.json for granular stream tuning across 17-camera deployment.
Initial Symptoms:
Invalid data found when processing input or infinite Non-monotonous DTS errorsInitial Hypothesis Tree:
Systematic Testing Methodology:
# Test 1: Basic connectivity
ping -c 10 192.168.10.89
# Result: ✅ 0% packet loss, <1ms latency
# Test 2: RTSP stream probe
ffprobe -rtsp_transport tcp -i "rtsp://admin:password@192.168.10.89:554/h264Preview_01_sub"
# Result: ❌ Massive H.264 decoding errors (1136+ DC/AC/MV errors per frame)
# Test 3: 30-second capture test
timeout 35 ffmpeg -rtsp_transport tcp -i "rtsp://..." -t 30 -c copy test.mp4
# Result: ❌ Connection timeout or 0-byte output
# Test 4: After network switch change
timeout 35 ffmpeg -rtsp_transport tcp -i "rtsp://..." -t 30 -c copy test.mp4
# Result: ✅ 871kB file, clean 30-second capture
Network Topology Analysis:
Root Cause: Netgear managed switch corrupting RTSP packets despite:
Resolution: Moved TERRACE camera to unmanaged PoE switch, immediately resolved all streaming issues.
Problem Statement:
Latency Analysis:
# Reolink configuration (18s latency):
REOLINK_HLS_SEGMENT_LENGTH=2 # 2-second segments
REOLINK_HLS_LIST_SIZE=3 # 3 segments in playlist = 6s buffer
REOLINK_HLS_DELETE_THRESHOLD=5 # Keep 5 extra segments = 10s buffer
# Total buffering: 6s + 10s + 2s encoding/network = 18 seconds
# Eufy configuration (2-4s latency):
EUFY_HLS_SEGMENT_LENGTH=1 # 1-second segments
EUFY_HLS_LIST_SIZE=1 # 1 segment in playlist
EUFY_HLS_DELETE_THRESHOLD=1 # Minimal buffering
# Total buffering: 1s + 1s + 2s encoding/network = 4 seconds
Key Discovery: Eufy handlers included -force_key_frames 'expr:gte(t,n_forced*2)' parameter that Reolink lacked. This forces I-frames every 2 seconds, allowing HLS.js to start playback immediately without waiting for natural keyframes (which can be 10+ seconds apart on some cameras).
FFmpeg Parameter Comparison:
| Parameter | Eufy (2-4s) | Reolink (18s) | Impact |
|---|---|---|---|
segment_length |
1 | 2 | Browser must wait for complete segment |
list_size |
1 | 3 | Playlist buffer multiplier |
delete_threshold |
1 | 5 | Extra segment retention |
-force_key_frames |
✅ Present | ❌ Missing | Enables fast playback start |
-bsf:v h264_mp4toannexb |
✅ Present | ❌ Missing | HLS container compatibility |
Motivation: Different cameras/locations have different requirements:
Implementation: Extended cameras.json to support HLS parameter overrides:
{
"REOLINK_TERRACE": {
"name": "CAM_TERRACE",
"type": "reolink",
"host": "192.168.10.89",
"hls_mode": "copy",
"hls_time": "1", // Per-camera override
"hls_list_size": "1", // Per-camera override
"hsl_delete_threshold": "1", // Per-camera override (typo preserved for compatibility)
"preset": "veryfast", // Only used if hls_mode=transcode
"resolution_main": "1280x720", // Fullscreen resolution
"resolution_sub": "320x180" // Grid view resolution
}
}
Configuration Priority Cascade:
def get_ffmpeg_output_params(self, stream_type: str = 'sub', camera_config: Dict = None):
"""
Four-tier configuration priority:
1. camera_config[key] # cameras.json per-camera override
2. self.vendor_config[key] # config/reolink.json vendor default
3. os.getenv(REOLINK_KEY) # .env environment variable
4. hardcoded_default # Fallback value
"""
segment_length = int(
(camera_config or {}).get('hls_time') or
self.vendor_config.get('hls', {}).get('segment_length') or
os.getenv('REOLINK_HLS_SEGMENT_LENGTH', '2')
)
Updated:
streaming/handlers/reolink_stream_handler.py:
camera_config parameter to get_ffmpeg_output_params()-bsf:v h264_mp4toannexb for copy mode-force_key_frames and -sc_threshold for transcode modeConfiguration:
config/cameras.json: Added per-camera HLS tuning parameters for 17 cameras.env: Reolink-specific environment variables now act as fallback defaults1. Network Equipment Can Silently Corrupt Streaming Protocols:
2. Identical Hardware ≠ Identical Network Behavior:
3. FFmpeg Parameter Sensitivity:
-bsf:v h264_mp4toannexb can cause HLS playback failures in some browsers-force_key_frames is critical for low-latency HLS (sub-5 second)hls_delete_threshold creates exponential latency increase (2s segments × 5 threshold = 10s added delay)4. Configuration Hierarchy Enables Flexibility:
.env) for baseline behaviorconfig/reolink.json) for brand-specific tuningcameras.json) for special cases (outdoor, low-bandwidth, etc.)5. Sub-Second Latency Not Achievable with Standard HLS:
17-Camera Deployment:
Server Performance (Dell R730xd):
streaming/ffmpeg_params.py) to eliminate code duplication across handlers while preserving separation of concernsSession completed: October 6, 2025 ~3:30 PM Status: Reolink integration somewhat functional, per-camera tuning operational, latency optimization in progress.
Motivation: All three stream handlers (Eufy, Reolink, UniFi) contained ~100 lines of identical FFmpeg parameter generation logic, violating DRY principle.
Implementation:
Created streaming/ffmpeg_params.py - Pure function module with zero dependencies:
def get_ffmpeg_output_params(
stream_type: str = 'sub',
camera_config: Optional[Dict] = None,
vendor_config: Optional[Dict] = None,
vendor_prefix: str = '',
) -> List[str]:
"""
Generate FFmpeg HLS output parameters with four-tier configuration priority.
Supports both copy mode (direct stream) and transcode mode (re-encode).
"""
Handler Simplification:
Each handler’s get_ffmpeg_output_params() method reduced from ~100 lines to 5 lines:
# In reolink_stream_handler.py, eufy_stream_handler.py
def get_ffmpeg_output_params(self, stream_type: str = 'sub', camera_config: Dict = None):
return get_ffmpeg_output_params(
stream_type=stream_type,
camera_config=camera_config,
vendor_config=self.vendor_config,
vendor_prefix='REOLINK_' # or 'EUFY_'
)
Benefits:
Files Modified:
streaming/ffmpeg_params.py - Created (150 lines)streaming/handlers/reolink_stream_handler.py - Reduced to ~80 linesstreaming/handlers/eufy_stream_handler.py - Reduced to ~80 linesNext: Apply same pattern to UniFi handler in subsequent session.
Summary: Massive architectural refactor of camera streaming pipeline to fully de-vendorize FFmpeg param handling, centralize per-camera configuration, and add new RTMP/FLV low-latency streaming support.
Introduced per-camera cameras.json config:
rtsp_input and rtsp_output sections.-i).ffmpeg_names_map: added "maps": "map".-map flags to rtsp_output block (output-only option).stream_manager.py /_start_ffmpeg() for live FFmpeg error logging.synchronous _start_stream() execution to maintain Flask 500/200 consistency.
(Threaded async launch postponed; will revisit when UI polling ready.)setupLayout() not executing).
Fixed by creating proper ES6 module flv-stream.js and re-enabling imports in streams.html.ffplay rtmp://… (<1 s latency).Added backend route:
@app.route('/api/camera/<camera_serial>/flv')
def serve_camera_flv(...):
ffmpeg -i rtmp://... -c copy -f flv -
→ streams via HTTP as video/x-flv (≈ 500–800 ms latency).
flv-stream.js using flv.js player.
Integrated into stream.js as RTMP mode toggle.streams.html to include flv.js and flv-stream modules.ffmpeg_params.py.Next Steps
Got it. Based on current trajectory — the FFmpeg latency tests, RTMP/FLV attempt, and the discovery that Ubuntu 24.04 (with FFmpeg ≥6.1) is required for LL-HLS/WebRTC experiments — here’s what I’ll append to README_project_history.md under a “Next Steps / Migration Plan” section:
/api/camera/<camera_serial>/flv route in app.py to test RTMP → FLV proxying using FFmpeg (-c copy -f flv -).No Transcoding ≠ Zero Latency
-c copy, FFmpeg introduces buffering and GOP alignment delay.Native Reolink Streams Are Faster
Flask Threading Limitation
while read() loop to a separate thread prevents blocking but doesn’t reduce buffering.Protocol Trade-off
| Target | Rationale |
|---|---|
| Migrate server OS: Debian 12 → Ubuntu 24.04 LTS | FFmpeg ≥ 6.1 required for LL-HLS and improved RTSP reconnection handling. |
| Adopt WebRTC bridge (mediamtx) | Enables 200–500 ms real-time latency for Reolink/UniFi cameras in browser. |
| Maintain HLS path for stability | LL-HLS on FFmpeg 6.1 offers ~0.8–1.5 s latency with wide compatibility. |
| Retire FLV proxy | Kept only as a diagnostic tool; not suitable for production browser playback. |
Server Migration
WebRTC Prototype
mediamtx container.FFmpeg Modernization
-hls_time 0.5 -hls_flags append_list+split_by_time -tune zerolatency-listen 1 + -fflags nobuffer for push-based ingest.Codebase Updates
"stream_mode": "webrtc" in cameras.json./api/camera/<id>/webrtc endpoint calling mediamtx./api/.../flv as fallback.Got it! Here’s a new supplementary section to add after existing October 9-10 entry:
Migration Status: ✅ Complete
Successfully migrated Dell PowerEdge R730xd from Debian 12 to Ubuntu 24.04 LTS Server.
Key Software Versions Now Available:
FFmpeg 6.1 New Capabilities Unlocked:
-hls_start_number_source, improved segment handling)-tune zerolatency optimizationsMigration Notes:
cameras.jsonImmediate Testing Priorities:
HOPING FOR Baseline Performance (Ubuntu 24.04 + FFmpeg 6.1):
Next steps: test ffmpeg params to optimize latency. For now, after several hours, streams remain stuck in “Attempting to start…” queries that seem to lead nowhere.
UI restart logic must be improved. Seems that it gives up at some point. Should never give up. Increasing delays ok, but not stopping alltogether to try and restart a stream.
Stop/restart/start UI button not working when RTMP because for now we don’t have a dedicated module implemented (just a stupid API route): RTMP must be integrated like other types.
Issue: current architecture works based on vendor logic: if eufy, if unifiy, if reolink… not “if rtmp, else if rtsp else if mjpeg etc.”
Ubuntu and ffmpeg 6 migration seem to have made things worse latency-wise. Probably params to be adjusted in cameras.json.
Critical debugging session following Ubuntu 24.04 + FFmpeg 6.1.1 migration that caused widespread stream freezing. Root cause identified as TCP RTSP transport incompatibility with FFmpeg 6’s stricter buffering behavior. Implemented per-camera UI health monitor control via cameras.json configuration.
if eufy/unifi/reolink to if rtmp/rtsp/mjpegSymptoms:
[ffmpeg] <defunct>REOLINK_OFFICE using UDP transport continued workingRoot Cause Analysis:
# FAILING (TCP - all Eufy, most Reolink, UniFi):
ffmpeg -rtsp_transport tcp -fflags nobuffer -flags low_delay ...
# Result: Process hangs after ~3 minutes, stops producing segments
# WORKING (UDP - REOLINK_OFFICE only):
ffmpeg -rtsp_transport udp ...
# Result: Stable streaming, 5-6 second latency
Evidence from logs:
REOLINK_TERRACE (192.168.10.89): Genuine hardware failure - Connection refusedTechnical Explanation:
FFmpeg 6.1.1 introduced stricter buffering behavior that conflicts with the combination of:
-rtsp_transport tcp (requires ACK for every packet)-fflags nobuffer -flags low_delay (disables buffering)-timeout 5000000 (5-second timeout)This creates a deadlock where FFmpeg waits for TCP acknowledgments that never arrive due to disabled buffering, causing the process to hang while remaining “alive” in process table.
UDP bypasses this because it’s connectionless - no ACK required, packet loss = dropped frames (acceptable for surveillance).
Issue: Eufy cameras freezing even faster than Reolink cameras
Root Cause:
"frame_rate_grid_mode": 5, // 5 fps in grid view
"g": 36, // GOP size 36 frames
"keyint_min": 36
Math reveals the problem:
-force_key_frames expr:gte(t,n_forced*2) expects keyframes every 2 secondsFix Applied:
"g": 10, // 5 fps × 2 seconds = 10 frames
"keyint_min": 10 // Match GOP size
Applied to all 9 Eufy cameras:
REOLINK_OFFICE had insane settings:
"hls_time": "0.1", // 100ms segments = 10 segments/second
"preset": "ultrafast",
"frame_rate_grid_mode": 6
Impact:
Corrected to:
"hls_time": "2", // 2-second segments (reasonable)
"preset": "medium", // Better quality/CPU balance
Symptoms:
Root Cause: Health monitor checking for:
But not accounting for:
Architecture Decision: Add granular control at camera level in cameras.json
Implementation:
1. Updated cameras.json structure:
{
"devices": {
"REOLINK_OFFICE": {
"name": "CAM OFFICE",
...
"ui_health_monitor": false // ← NEW: Per-camera control
},
"T8416P0023352DA9": {
"name": "Living Room",
...
"ui_health_monitor": true // ← Enabled (default)
}
},
"ui_health_global_settings": { // ← NEW: Centralized settings
"UI_HEALTH_BLANK_AVG": 2,
"UI_HEALTH_BLANK_STD": 5,
"UI_HEALTH_SAMPLE_INTERVAL_MS": 2000,
"UI_HEALTH_STALE_AFTER_MS": 20000,
"UI_HEALTH_CONSECUTIVE_BLANK_NEEDED": 10,
"UI_HEALTH_COOLDOWN_MS": 30000,
"UI_HEALTH_WARMUP_MS": 300000 // 5 minutes warmup
}
}
2. Modified app.py - Enhanced _ui_health_from_env():
Added support for loading global settings from cameras.json with priority:
cameras.json > .env > defaults
def _ui_health_from_env():
"""
Build UI health settings dict from environment variables AND cameras.json global settings.
Priority: cameras.json > .env
"""
# Start with .env defaults
settings = { ... }
# Override with cameras.json global settings if they exist
try:
global_settings = camera_repo.cameras_data.get('ui_health_global_settings', {})
if global_settings:
# Map uppercase keys to camelCase
...
except Exception as e:
print(f"Warning: Could not load global UI health settings: {e}")
return settings
3. Modified streams.html - Added data attribute:
<div class="stream-item"
data-camera-serial="{{ serial }}"
data-camera-name="{{ info.name }}"
data-camera-type="{{ info.type }}"
data-stream-type="{{ info.stream_type }}"
data-ui-health-monitor="{{ info.get('ui_health_monitor', True)|lower }}"> <!-- NEW -->
4. Modified static/js/streaming/health.js - Early exit for disabled cameras:
function attachHls(serial, $videoOrDom, hlsInstance = null) {
// Check if health monitoring is enabled for this camera
const $streamItem = $(`.stream-item[data-camera-serial="${serial}"]`);
const healthEnabled = $streamItem.data('ui-health-monitor');
if (healthEnabled === false || healthEnabled === 'false') {
console.log(`[Health] Monitoring disabled for ${serial}`);
return () => {}; // Return empty cleanup function - no monitoring
}
// ... rest of existing code
}
function attachMjpeg(serial, $imgOrCanvas) {
// Same check added here
...
}
Benefits:
cameras.jsonModified all 9 Eufy camera configurations in cameras.json:
"rtsp_output": {
"g": 10, // Changed from 36
"keyint_min": 10, // Changed from 36
...
}
Expected Result: Eufy cameras should maintain stable streams without freezing
Fixed REOLINK_OFFICE extreme settings:
"rtsp_output": {
"hls_time": "2", // Changed from "0.1"
"preset": "medium", // Changed from "ultrafast"
...
}
After GOP fix + parameter normalization:
Observed Behavior:
Zombie Processes: Still present from previous sessions - requires system cleanup:
pkill -9 ffmpeg # Clear all zombie processes
Status: Not started
Requirements:
Location: static/js/streaming/stream.js - restartStream() function
Status: Not started
Current State:
/api/camera/<camera_serial>/flv routestart_stream() / stop_stream() / restart_stream() workflowRequired Changes:
streaming/handlers/StreamManager as another stream typeStatus: Not started (architectural change)
Current Problem:
if camera_type == 'eufy':
handler = EufyStreamHandler()
elif camera_type == 'unifi':
handler = UniFiStreamHandler()
elif camera_type == 'reolink':
handler = ReolinkStreamHandler()
Desired Architecture:
protocol = camera_config.get('protocol', 'rtsp') # rtsp, rtmp, mjpeg, etc.
if protocol == 'rtsp':
handler = RTSPStreamHandler()
elif protocol == 'rtmp':
handler = RTMPStreamHandler()
elif protocol == 'mjpeg':
handler = MJPEGStreamHandler()
Benefits:
Status: Partially addressed (per-camera disable), core logic needs improvement
Remaining Issues:
Required Fixes:
Location: static/js/streaming/health.js - markUnhealthy() function
Status: Not achieved (current: 5-6 seconds)
Goal: Sub-second or near sub-second latency
Why Ubuntu/FFmpeg 6 Migration:
Next Steps for Low Latency:
Option A: LL-HLS (FFmpeg 6.1+)
"rtsp_output": {
"hls_time": "0.5", // 500ms segments
"hls_list_size": "3", // Minimal playlist
"hls_flags": "independent_segments+split_by_time",
"hls_segment_type": "fmp4", // Fragmented MP4
"hls_fmp4_init_filename": "init.mp4",
"tune": "zerolatency",
"preset": "ultrafast"
}
Expected latency: 1.5-2 seconds
Option B: WebRTC (via mediamtx)
mediamtx container alongside FlaskOption C: RTMP Direct (Already partially implemented)
/api/camera/<serial>/flv route
Expected latency: 500-800ms (tested, but Flask proxy adds overhead)Recommendation: Test LL-HLS first (easiest integration), then WebRTC if needed.
GOP = FPS × keyframe_interval_secondsConfiguration:
config/cameras.json - Added ui_health_monitor per camera, added ui_health_global_settings section, updated Eufy GOP parameters (g: 10, keyint_min: 10)config/cameras.json - Set REOLINK_TERRACE to "hidden": true (hardware failure)Backend:
app.py - Enhanced _ui_health_from_env() to load global settings from cameras.json with priority systemFrontend:
templates/streams.html - Added data-ui-health-monitor attribute to stream itemsstatic/js/streaming/health.js - Added per-camera health monitor enable/disable check in attachHls() and attachMjpeg()Working Cameras (10/17):
Known Issues:
Performance:
High Priority (Stability):
pkill -9 ffmpeg then proper reaping in codeMedium Priority (Features):
Low Priority (Architecture):
Session completed: October 11, 2025, 18:30
Status: Major stability improvements implemented, per-camera health control working, Eufy GOP fixed
Next Session: Validate Eufy stability, test LL-HLS for sub-second latency goal
Following the successful implementation of:
start_stream() (pre-reservation of active_streams slots),health.js,cameras.json,…new symptoms emerged in the UI layer:
/api/streams/<serial>/playlist.m3u8 endpoints were active.UI Status Logic Trace
restartStream() sets "live" only for HLS and MJPEG, not for RTMP.streamType: "RTMP" falls through and never executes a "live" status update.onUnhealthy callback compounded this: once a stream was marked “failed,” there was no later status reconciliation after a successful restart.Server-side Validation
/tmp/streams/...).is_stream_alive() correctly returned True; bug was purely front-end.File: static/js/streaming/stream.js
// PATCHED restartStream()
async restartStream(serial, $streamItem) {
try {
console.log(`[Restart] ${serial}: Beginning restart sequence`);
this.updateStreamButtons($streamItem, true);
this.setStreamStatus($streamItem, 'loading', 'Restarting...');
const cameraType = $streamItem.data('camera-type');
const streamType = $streamItem.data('stream-type').upper();
const videoElement = $streamItem.find('.stream-video')[0];
if (videoElement && videoElement._healthDetach) {
videoElement._healthDetach();
delete videoElement._healthDetach;
}
if (streamType === 'HLS' || streamType === 'LL_HLS' || streamType === 'NEOLINK' || streamType === 'NEOLINK_LL_HLS') {
await this.hlsManager.forceRefreshStream(serial, videoElement);
this.setStreamStatus($streamItem, 'live', 'Live');
} else if (streamType === 'mjpeg_proxy' || streamType === 'RTMP') { // ✅ unified branch
await this.stopIndividualStream(serial, $streamItem, cameraType, streamType);
await new Promise(r => setTimeout(r, 1500));
await this.startStream(serial, $streamItem, cameraType, streamType);
this.setStreamStatus($streamItem, 'live', 'Live'); // ✅ ensure UI sync
}
console.log(`[Restart] ${serial}: Restart complete`);
} catch (e) {
console.error(`[Restart] ${serial}: Failed`, e);
this.setStreamStatus($streamItem, 'error', 'Restart failed');
}
}
stopIndividualStream() and forceRefreshStream() signatures to reduce redundancy.Here’s the next block to append to README_project_history.md (same tone/structure as recent entries). I’ve included precise references to where the bugs/behaviors showed up in the code so we can trace later.
active_streams with process=None. Subsequent checks called is_stream_alive() and crashed on process.poll() because process wasn’t set yet. This manifested while hitting the public start route which calls start_stream() and immediately checks actives【turn5file6】.restartStream() has HLS + MJPEG branches only【turn5file4】, while the health module exports attachHls/attachMjpeg (no RTMP hook)【turn5file11】.onUnhealthy with exponential retry)【turn5file2】【turn5file2】, but without RTMP attach the monitor can’t validate recovery on FLV tiles.Start-while-starting guard
In start_stream():
status=="starting", return the playlist URL immediately (don’t call is_stream_alive() yet).is_stream_alive() for fully initialized entries.
This prevents process=None from ever reaching .poll() during warm-up【turn5file6】.is_stream_alive() resilience
Safely handle:
status=="starting"process is None
And wrap .poll() in a small try so a weird process object can’t crash the call.Result: the “AttributeError: ‘NoneType’ object has no attribute ‘poll’” is eliminated during startup storms.
Add RTMP health attach
Implemented attachRTMP(serial, videoEl, flvInstance) in health.js and kept the existing detach(serial) API. Export now includes RTMP as well:
return { attachHls, attachMjpeg, attachRTMP, detach }.
Prior state only exported HLS/MJPEG【turn5file11】.
Wire RTMP health after successful start
In startStream(), after success:
attachHls(...) (existing)flvManager and call attachRTMP(...)attachMjpeg(...) (existing)
(Your HLS/MJPEG wiring already existed here【turn5file9】; we added the symmetric RTMP branch.)RTMP restart path uses full teardown + explicit status reconciliation
In restartStream():
flvManager.stopStream(serial) → brief delay → startStream(...) again; then force-check the <video> element and set “Live” if it’s actually playing so we don’t keep a stale “Failed” badge lingering.
Previously, only HLS called forceRefreshStream() and set “Live”; MJPEG did stop+start; RTMP had no explicit branch in one of the code paths【turn5file4】.Stop/StopAll now include RTMP consistently
stopIndividualStream() and stopAllStreams() already have RTMP in the current version (flv manager) — confirmed and kept【turn5file10】【turn5file10】.
onUnhealthy exponential backoff wiring (attempts/timers) in the constructor【turn5file2】【turn5file2】.forceRefreshStream, MJPEG just restarts) — hence the symmetric but protocol-specific branches in restartStream().AttributeError during “starting” windows.stream_type not defined) in _watchdog_restart_stream by deriving a local stream_type (from per-camera state) before passing it, or omit the kwarg to use handler defaults (call-site issue, not callee defaults).<defunct> FFmpeg PIDs; when reaping dead processes outside stop_stream(), call communicate() before dropping the ref (this was already captured in earlier “zombie” hygiene notes【turn5file14】).RTMP streams still show failed no matter what… dear or alive, show failed, despite updates. Actually shows “live” for a second or 2 then switches to failed. So something down the road does that. Backend Watchdog needs updating due to many refactorings: varibles not passed where they should etc. WATCHDOG disabled for now.
I’ll update the README with the current state and the critical issues we’re facing.I need to read the README first to append to it properly. Let me check the file structure:Let me check what files are available:Based on the issues we’ve described, here’s what needs to be documented:
Goal: Integrate RTMP streams into StreamManager for unified process tracking and lifecycle management.
Changes Made:
reolink_stream_handler.py:
build_rtsp_url() to check camera.get('stream_type')_build_rtmp_url() method for RTMP URL constructionget_ffmpeg_input_params() to return minimal params for RTMP (no -rtsp_transport)stream_manager.py._start_stream():
camera.get('stream_type', 'HLS').upper()ffmpeg -i rtmp://... -c copy -f flv - → outputs to stdoutactive_streams with 'protocol': 'rtmp' flag/api/camera/<serial>/flv URL for RTMP streamsapp.py route /api/camera/<serial>/flv:
StreamManager.active_streamswith stream_manager._streams_lock: to safely read processResult:
active_streams (unified tracking)stop_stream() works for RTMP (kills process, removes from dict)Critical Bug Fixed:
# WRONG (was causing "Input/output error"):
rtmp_url = f"rtmp://{host}:1935/...&password={quote(password, safe='')}"
# Result: password=xxxxxxxxxxxxxxxxxxxxxxx
# CORRECT:
rtmp_url = f"rtmp://{host}:1935/...&password={password}"
# Result: password=TarTo56))#FatouiiDRtu
RTMP doesn’t use URL encoding like RTSP does. Special characters work as-is in RTMP query parameters.
Status: 🔴 BLOCKING - System Unusable
Symptoms:
Zombie FFmpeg Processes:
elfege 2383980 0.0 0.0 0 0 ? Zs 01:57 0:01 [ffmpeg]
elfege 2383993 0.0 0.0 0 0 ? Zs 01:57 0:01 [ffmpeg]
elfege 2384077 0.0 0.0 0 0 ? Zs 01:57 0:01 [ffmpeg]
# ... 9 zombie processes total
Z) and never get reapedwait() on terminated children_kill_all_ffmpeg_for_camera() not catching all processesRoot Causes (Suspected):
Threading Race Conditions:
# In start_stream():
with self._streams_lock:
# Reserve slot
self.active_streams[camera_serial] = {'status': 'starting'}
# Start thread WITHOUT lock
threading.Thread(target=self._start_stream, ...).start()
# Thread may not acquire lock before another request comes in
Zombie Process Creation:
# In _start_ffmpeg():
process = subprocess.Popen(cmd, start_new_session=True)
# start_new_session=True detaches from parent
# When process dies, becomes zombie until parent calls wait()
# But we never explicitly wait() on terminated processes
_watchdog_restart_stream() which does stop_stream() + _start_ffmpeg()Attempted Fixes (All Failed):
_streams_lock for thread safety (still races)'status': 'starting' (still duplicates)start_new_session=True for process isolation (creates zombies)_kill_all_ffmpeg_for_camera() with pkill -9 (misses some)1. Fix Zombie Process Reaping (CRITICAL)
Add process reaper thread or signal handler:
import signal
def reap_zombies(signum, frame):
"""Reap all zombie child processes"""
while True:
try:
pid, status = os.waitpid(-1, os.WNOHANG)
if pid == 0:
break
logger.debug(f"Reaped zombie process {pid}")
except ChildProcessError:
break
# Register signal handler
signal.signal(signal.SIGCHLD, reap_zombies)
2. Fix Stream Restart Logic
Current issue: stop_stream() doesn’t wait for process termination:
def stop_stream(self, camera_serial: str):
# Kill process
self._kill_all_ffmpeg_for_camera(camera_serial)
# Remove from dict IMMEDIATELY (wrong!)
self.active_streams.pop(camera_serial, None)
# Process might still be dying when restart happens
Should be:
def stop_stream(self, camera_serial: str):
process = self.active_streams[camera_serial]['process']
# Terminate gracefully
process.terminate()
# WAIT for it to die (timeout 5s)
try:
process.wait(timeout=5)
except subprocess.TimeoutExpired:
process.kill()
process.wait()
# NOW remove from dict
self.active_streams.pop(camera_serial, None)
3. Fix Frontend HLS.js Cache
When restarting streams, frontend MUST destroy and recreate HLS.js instance:
// In hls-stream.js forceRefreshStream():
const existingHls = this.hlsInstances.get(cameraId);
if (existingHls) {
existingHls.destroy(); // Clears internal cache
this.hlsInstances.delete(cameraId);
}
// Clear video element
videoElement.src = '';
videoElement.load();
// Wait before restart
await new Promise(resolve => setTimeout(resolve, 1000));
// NOW restart
this.startStream(cameraId, videoElement);
4. Disable Watchdog Entirely (Temporary)
Until restart logic is fixed:
export ENABLE_WATCHDOG=false
5. Add Process Cleanup on Startup
# In StreamManager.__init__():
self._cleanup_orphaned_ffmpeg()
def _cleanup_orphaned_ffmpeg(self):
"""Kill all FFmpeg processes on startup"""
subprocess.run(['pkill', '-9', 'ffmpeg'], stderr=subprocess.DEVNULL)
time.sleep(2)
Current State: System is fundamentally broken. Threading model and process lifecycle management need complete redesign.
Session ended: October 11, 2025 02:34 AM
Status: 🔴 RTMP partially integrated but system-wide critical failures block all progress
Systematic diagnosis of FFmpeg streams freezing after 15-20 minutes on both Dell R730xd (RAID SAS) and Ryzen 7 5700X3D (NVMe) servers. Root cause isolated to conflicting FFmpeg parameters when using -c:v copy mode with transcoding filters. All cameras (Eufy TCP, Reolink UDP, UniFi TCP) exhibited identical freeze pattern at ~109 segments regardless of hardware.
Pattern Identified:
Initial Hypothesis (Incorrect): Disk I/O Bottleneck on Dell Server
Tested Hypotheses (All Ruled Out):
-use_wallclock_as_timestamps duplication (input + output params)-hls_flags append_list without delete_segmentsRoot Cause Identified: FFmpeg Parameter Conflict
# The Problem Command
ffmpeg -rtsp_transport tcp -i rtsp://... \
-c:v copy \ # ← Copy mode (no re-encoding)
-vf scale=320:180 \ # ← CONFLICT: Can't filter copied stream
-r 5 \ # ← CONFLICT: Can't change framerate in copy mode
-profile:v baseline \ # ← CONFLICT: Encoder param with no encoder
-tune zerolatency \ # ← CONFLICT: Encoder param with no encoder
-g 10 -keyint_min 10 \ # ← CONFLICT: GOP settings with no encoder
...
FFmpeg Error:
[vost#0:0/copy @ 0x62fb8df8fc80] Filtergraph 'scale=320:180' was specified,
but codec copy was selected. Filtering and streamcopy cannot be used together.
Error opening output file: Function not implemented
Problematic Config:
"rtsp_output": {
"c:v": "copy", // Copy mode enabled
"profile:v": "baseline", // Invalid with copy
"pix_fmt": "yuv420p", // Invalid with copy
"resolution_sub": "320x180", // Triggers -vf scale (invalid with copy)
"frame_rate_grid_mode": 5, // Triggers -r (invalid with copy)
"tune": "zerolatency", // Invalid with copy
"g": 10, // Invalid with copy
...
}
Fix Applied:
"rtsp_output": {
"c:v": "copy",
"profile:v": "N/A", // Builder skips "N/A" values
"pix_fmt": "N/A",
"resolution_sub": "N/A",
"frame_rate_grid_mode": "N/A",
"tune": "N/A",
"g": "N/A",
"keyint_min": "N/A",
"preset": "N/A",
"f": "hls",
"hls_time": "2",
"hls_list_size": "3",
"hls_flags": "delete_segments",
"hls_delete_threshold": "1"
}
File: 0_MAINTENANCE_SCRIPTS/diagnose_ffmpeg.sh
Comprehensive diagnostic suite with 9 test categories:
Usage:
chmod +x 0_MAINTENANCE_SCRIPTS/diagnose_ffmpeg.sh
./0_MAINTENANCE_SCRIPTS/diagnose_ffmpeg.sh
# Generates timestamped log: diagnostic_YYYYMMDD_HHMMSS.log
FFmpeg Copy Mode Requirements:
-c:v copy means no re-encoding - stream passes through untouched-vf), encoder setting (-preset, -tune, -profile), or frame manipulation (-r, -g)-f hls, -hls_time, etc.)TCP Recv-Q Analysis:
Hardware Migration Results:
config/cameras.json - Set all transcoding parameters to “N/A” for copy mode cameras0_MAINTENANCE_SCRIPTS/diagnose_ffmpeg.sh - Created comprehensive diagnostic toolSession Status: Root cause identified and fixed, awaiting validation testing
Next Session: Confirm stream stability, optimize latency if copy mode works, consider transcode mode for resolution control
Symptoms:
1. Parameter Positioning Issues:
-fflags +genpts from rtsp_output to rtsp_input (correct fix, but not root cause)-i → output params2. Frame Rate Mismatch:
-r 8 while JSON had "r": 303. Loglevel Addition:
-loglevel repeat+level+verbose to match bashThe Bug:
# stream_manager.py _start_ffmpeg()
process = subprocess.Popen(
cmd,
stdout=subprocess.PIPE, # ← CAPTURING without reading!
stderr=subprocess.PIPE, # ← CAPTURING without reading!
)
What Happens:
-loglevel verbose)Why Bash Worked:
# Bash script - no capture
ffmpeg ... > /dev/null 2>&1 # Or no redirection at all
# Output goes to terminal/null, never fills buffer
Option 1: Discard Output (Recommended)
process = subprocess.Popen(
cmd,
stdout=subprocess.DEVNULL, # Don't capture
stderr=subprocess.DEVNULL, # Don't capture
)
Option 2: Redirect to File (For Debugging)
log_file = open(f'/tmp/ffmpeg_{camera_serial}.log', 'w')
process = subprocess.Popen(
cmd,
stdout=log_file,
stderr=log_file
)
# Remember to close log_file later or use context manager
Option 3: Read in Background Thread (Complex)
# Only if we NEED to process FFmpeg output in real-time
# Requires threading.Thread reading from process.stdout/stderr
After applying subprocess.DEVNULL:
-loglevel verbose (previously broke immediately)Evidence:
# Python FFmpeg (with DEVNULL fix)
elfege 3152041 4.2 0.3 2141660 99364 pts/7 SLl+ 01:54 0:09 ffmpeg ...
# Playlist continuously updating
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:1
#EXT-X-MEDIA-SEQUENCE:178
#EXTINF:1.250000,
segment_178.ts
Critical Python Subprocess Gotcha:
subprocess.PIPE creates a fixed-size buffer (typically 64KB on Linux)Why It’s Subtle:
Best Practices:
subprocess.DEVNULL if we don’t need outputPIPE without reading from itstreaming/stream_manager.py:
stdout=subprocess.PIPE → stdout=subprocess.DEVNULLstderr=subprocess.PIPE → stderr=subprocess.DEVNULLBefore:
After:
Session completed: October 13, 2025 ~2:00 AM
Status: Critical deadlock resolved, streaming stable, root cause documented
Key Takeaway: subprocess.PIPE + no reading = inevitable deadlock
Every 1.0s: cat streams/REOLINK_OFFICE/playlist.m3u8 server: Mon Oct 13 09:15:37 2025
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:1
#EXT-X-MEDIA-SEQUENCE:19875
#EXTINF:1.250000,
segment_19875.ts
Every 1.0s: cat streams/REOLINK_OFFICE/playlist.m3u8 server: Mon Oct 13 09:23:58 2025
- elfege 3249544 17.5 0.6 2304616 199404 pts/2 SLl+ 02:20 74:25 ffmpeg -rtsp_transport tcp -timeout 5000000
- elfege 3249576 21.0 0.6 2304564 201964 pts/2 SLl+ 02:20 89:14 ffmpeg -rtsp_transport tcp -timeout 5000000
- elfege 3276746 4.6 0.3 2141716 104488 pts/2 SLl+ 02:29 19:28 ffmpeg -rtsp_transport udp -timeout 5000000
timelapse
Every 0.1s: ps aux | grep ffmpeg server: Mon Oct 13 09:32:28 2025
- elfege 3249544 17.5 0.6 2304616 199404 pts/2 SLl+ 02:20 75:31 ffmpeg -rtsp_transport tcp -timeout 5000000
- elfege 3249576 21.0 0.6 2304564 202220 pts/2 SLl+ 02:20 90:32 ffmpeg -rtsp_transport tcp -timeout 5000000
- elfege 3276746 4.6 0.3 2141716 104488 pts/2 SLl+ 02:29 19:46 ffmpeg -rtsp_transport udp -timeout 5000000 -
3 streams have been stable all night: REOLINK_OFFICE, T8441P122428038A (EUFY Hot Tub) & T8416P0023352DA9 (Living Room)
Are frozen: All others except Kids room (disconnected) & Laundry Room (disconnected)
Hit restart button in U.I.:
Note: UI health probably far too complex anyway. Simple timeout => restart api call (should do stop & start) with every 600s would be a better band aid.
Successful Long-Run Validation (7+ hours):
Frozen Cameras:
Issue Discovered: Missing fflags Parameter
Only REOLINK_OFFICE had "fflags": "+genpts" in rtsp_input section. Based on October 12 findings that fflags must be in input params (not output) to prevent segmentation freezing, this was identified as root cause for frozen streams.
Fix Applied:
"fflags": "+genpts" to rtsp_input section of all 17 camerasUnintended Configuration Change:
During bulk fflags addition, accidentally changed all Eufy cameras from "rtsp_transport": "tcp" to "rtsp_transport": "udp".
Why This Broke Everything:
Eufy cameras require TCP for RTSP authentication:
Immediate Impact on Restart:
❌ Failed to start stream for Living Room: Failed to start FFmpeg: 'NoneType' object has no attribute 'decode'
❌ Failed to start stream for Kids Room: Failed to start FFmpeg: 'NoneType' object has no attribute 'decode'
❌ Failed to start stream for Kitchen: Failed to start FFmpeg: 'NoneType' object has no attribute 'decode'
[... all Eufy cameras failed ...]
Correct Transport Protocol Matrix:
| Camera Type | Protocol | Reason |
|---|---|---|
| Eufy (T8416, T8419, T8441*) | TCP | Authentication required |
| UniFi (68d49398…) | TCP | Protect proxy requires TCP |
| Reolink (REOLINK_*) | UDP | Better packet loss handling outdoors |
Problem:
Yesterday’s fix (changing subprocess.PIPE → subprocess.DEVNULL to prevent deadlock) broke error capture logic:
# stream_manager.py _start_ffmpeg()
process = subprocess.Popen(
cmd,
stdout=subprocess.DEVNULL, # ← No longer capturing
stderr=subprocess.DEVNULL,
)
# Error handling assumed stderr capture exists
if process.poll() is not None:
stdout, stderr = process.communicate() # ← stderr is None!
print(stderr.decode('utf-8')) # ← AttributeError: 'NoneType' object has no attribute 'decode'
Impact:
Fix Applied:
if process.poll() is not None:
print("════════ FFmpeg died immediately ════════")
print(f"FFmpeg exit code: {process.returncode}")
print("Command was:")
print(' '.join(cmd))
print("════════════════════════════════")
raise Exception(f"FFmpeg died with code {process.returncode}")
1. Case Sensitivity Issue - REOLINK_LAUNDRY:
"REOLINK_LAUNDRY": {
"stream_type": "hls", // ← Lowercase (all others uppercase "HLS")
Impact: If Python code uses case-sensitive checks (== 'HLS'), LAUNDRY ROOM buttons (PLAY/STOP/RESTART) would fail silently.
2. Typo - REOLINK_TERRACE:
"REOLINK_TERRACE": {
"stream_type": "HSL", // ← Typo (should be "HLS")
Impact: Stream type validation failures, incorrect protocol routing.
config/cameras.json:
"fflags": "+genpts" to all cameras’ rtsp_input"hls" → "HLS""HSL" → "HLS"streaming/stream_manager.py:
Critical Configuration Management Issues:
The Cascade Effect:
Missing fflags → Streams freeze after minutes
↓
Add fflags to all cameras (good fix!)
↓
Accidentally change TCP → UDP (bulk edit mistake)
↓
All Eufy cameras fail authentication
↓
subprocess.DEVNULL prevents diagnosis
↓
Error handler crashes trying to decode None
↓
Cannot determine real FFmpeg error
Working:
Broken:
Required Actions:
"rtsp_transport": "tcp""stream_type": "HLS" for LAUNDRY (not “hls”)"stream_type": "HLS" for TERRACE (not “HSL”)The overnight stability test proved the October 12 fix works:
The bulk configuration update introduced new bugs but validated the core fix. With TCP/UDP corrected and case sensitivity fixed, all cameras should achieve the same stability as REOLINK_OFFICE.
Session completed: October 13, 2025 ~11:30 AM
Streams stable several hours later.
I’ll add today’s session to the README:
Complete overhaul of frontend health monitoring system after discovering critical bugs and overcomplicated architecture. Health monitor was non-functional due to configuration key mismatch, then after fixes revealed browser-based monitoring limitations. Simplified from 3 protocol-specific methods to single unified approach.
Issue Discovered: Health monitor showing “DISABLED” despite configuration set to enabled
Root Cause: Key mismatch between backend and frontend
# app.py - returning wrong key
settings = {
'enabled': _get_bool("UI_HEALTH_ENABLED", True), # ← lowercase
...
}
# stream.js - checking different key
if (H.uiHealthEnabled) { // ← camelCase
Fix Applied: Changed backend to return 'uiHealthEnabled' matching frontend expectations
1. Early Return Bug in attachMjpeg()
2. Overly Complex Stale Detection
// Broken logic - never triggered restarts
if (staleDuration > threshold) {
if (hasError || networkState === 3 || (isPaused && staleDuration > threshold * 2)) {
markUnhealthy(); // ← Only if ALSO has explicit error
} else {
console.log("appears OK - waiting..."); // ← Waited forever
}
}
Streams frozen for 20+ seconds but no explicit error → health monitor never restarted them
3. The “All Cameras Stale” Pattern
Critical realization from user observation:
T8416P0023352DA9: staleDuration=19.5s
T8416P0023370398: staleDuration=17.3s
68d49398005cf203e400043f: staleDuration=18.3s
T8416P00233717CB: staleDuration=17.3s
// ALL cameras 17-19s simultaneously
User’s insight: “If ALL cameras are stale at once, that’s not 10 stream failures - that’s the monitor breaking.”
Reality check: User could visually see REOLINK_OFFICE was actively streaming (pointing at them). Health monitor was broken, not the streams.
Historical context: Streams were stable for HOURS with health monitor disabled. FFmpeg freezing issues were already fixed in October 12 session.
Original Design (health.js had become):
attachHls(), attachRTMP(), attachMjpeg()User’s assessment: “I let we build this without supervision and we overcomplicated it.”
Questions posed:
Design Principles:
attach() method for all stream types<video> or <img> elementstaleAfterMs → restartconsecutiveBlankNeeded samples → restartImplementation:
export class HealthMonitor {
attach(serial, element) {
// Works for video/img, HLS/RTMP/MJPEG
startTimer(serial, () => {
if (warmup) return;
const sig = frameSignature(element);
if (sig !== lastSig) {
lastSig = sig;
lastProgressAt = now();
}
if (now() - lastProgressAt > staleAfterMs) {
markUnhealthy(serial, 'stale');
}
});
}
}
API Compatibility: Kept attachHls(), attachRTMP(), attachMjpeg() as aliases to attach() for backwards compatibility with stream.js
Problem: Still overdetecting stale streams despite simplification
Suspected Causes:
requestAnimationFrame and timers when tab backgroundedperformance.now() keeps incrementingstaleDuration increases while video actually playingdrawImage() may fail silentlynull → no progress detectedsetInterval() not guaranteed to fire exactly on scheduleCurrent Configuration Issues:
"UI_HEALTH_SAMPLE_INTERVAL_MS": 30000 // ← 30 seconds between checks!
30-second intervals mean a frozen stream goes undetected for 30+ seconds, then takes another 30s to confirm stale.
1. Browser-Based Monitoring Has Inherent Limitations
2. Progressive Enhancement Trap
3. Configuration Matters More Than Code
4. User Observation Trumps Metrics
5. “Just Make It Work” vs “Make It Perfect”
Completely Rewritten:
static/js/streaming/health.js - Reduced from 3 methods to 1 unified approach, ES6 class + jQueryBug Fixes:
app.py - Fixed _ui_health_from_env() to return 'uiHealthEnabled' instead of 'enabled''UI_HEALTH_ENABLED' in cameras.json global settings handlerConfiguration:
config/cameras.json - Added ui_health_global_settings.UI_HEALTH_ENABLED: trueHealth Monitor:
Recommendations for Next Session:
Option A: Further tune frontend approach
Option B: Move to backend health monitoring (probably better)
/api/health/{serial} endpointImmediate Action:
"UI_HEALTH_SAMPLE_INTERVAL_MS": 3000, // 3 seconds, not 30
"UI_HEALTH_STALE_AFTER_MS": 15000 // 15 seconds = 5 failed samples
Session completed: October 13, 2025 11:30 PM
Status: Health monitor functional but needs backend implementation for reliability
Key Insight: Browser-based video monitoring fundamentally limited by tab focus, canvas security, timer precision
Scope: Reduce glass‑to‑glass latency for Reolink substream while staying within HLS (no parts).
Experiments & findings
FRAG_CHANGED + programDateTime (tiles + fullscreen).rtsp_transport=tcp, kept tiny probe windows, removed audio (map: ["0:v:0"]), and used -muxpreload 0 -muxdelay 0.#EXT-X-MAP).Working TS output proposal (kept here for reference) Use when we want minimum latency within “short‑segment HLS” (still not Apple LL‑HLS because no parts).
"rtsp_output": {
"map": ["0:v:0"],
"c:v": "libx264",
"profile:v": "baseline",
"pix_fmt": "yuv420p",
"r": 15,
"vf": "scale=640:480",
"tune": "zerolatency",
"g": 7,
"keyint_min": 7,
"preset": "ultrafast",
"vsync": 0,
"sc_threshold": 0,
"force_key_frames": "expr:gte(t,n_forced*0.5)",
"f": "HLS",
"hls_time": "0.5",
"hls_list_size": "1",
"hls_flags": "program_date_time+delete_segments+split_by_time",
"hls_delete_threshold": "1"
}
Notes
hls_segment_type, hls_fmp4_init_filename, movflags) to avoid MP4 fragment overhead.hls_time and enforce IDRs via force_key_frames for consistent cuts.lowLatencyMode: trueliveSyncDurationCount: 1, liveMaxLatencyDurationCount: 2maxLiveSyncPlaybackRate: 1.5backBufferLength: 10Current decision
Next possible steps (single‑hypothesis approach)
-vsync 0 and -sc_threshold 0 with fMP4 to see if we recover some of the TS gain without leaving fMP4.#EXT-X-PART) when feasible.Goal: set the stage for true LL-HLS (partials) while keeping existing HLS working.
TLS cert helper
0_MAINTENANCE_SCRIPTS/make_self_signed_tls.sh${HOME}/0_NVR (not "~") so certs land at certs/dev/{fullchain.pem,privkey.pem}.NGINX edge (HTTP/2)
docker-compose.yml
nvr-edge on ports 80 and 443, network nvr-net.nvr ports now bound to loopback: 127.0.0.1:5000:5000 (forces clients through edge).nginx/nginx.conf
server { listen 80; … return 301 https://… }.Added server { listen 443 ssl http2; … } with:
/etc/nginx/certs.http://nvr:5000.Low-latency passthrough blocks:
location ^~ /streams/ { … proxy_buffering off … } (legacy HLS from our app).location ^~ /hls/ { … } (proxy to packager; LL-HLS).Compose cleanup
docker-compose.yml; removed the override.depends_on: [nvr] for nvr-edge so edge waits for the app../config:/app/config.FFmpeg reality check
hls_part_size/part_inf in our env → decided to not rely on FFmpeg for #EXT-X-PART.LL-HLS sidecar (MediaMTX)
packager (bluenviron/mediamtx) on :8888, in nvr-net.New config file: packager/mediamtx.yml
hls: yes, hlsVariant: lowLatencyhlsSegmentCount: 7, hlsSegmentDuration: 1s, hlsPartDuration: 200msPath REOLINK_OFFICE:
source: rtsp://admin:xxxxxxxxxxxxxxxxxxxxxxx@192.168.10.88:554/h264Preview_01_subrtspTransport (aka sourceProtocol) set to TCP (UDP caused decode errors/packet loss).sourceOnDemand: no to keep it constantly up for debugging./hls/… → nvr-packager:8888 (HTTP/2 at the edge, self-signed cert).hlsSegmentCount to 7.Player tuning (interim)
To avoid spinner with classic HLS (no PARTs), relaxed hls.js edge:
liveSyncDurationCount: 2 (was 1)liveMaxLatencyDurationCount: 3 (was 2)https://<server>/streams/... = legacy HLS from unified-nvr (works as before).https://<server>/hls/REOLINK_OFFICE/index.m3u8 = MediaMTX LL-HLS path via edge.
MediaMTX is running, RTSP pull is stable over TCP, and the HLS muxer is created. Use this URL in the UI to test LL-HLS; manifest should include #EXT-X-PART (when the mux has filled enough segments).http://<ip>:5000 bypassed the edge → added loopback bind for app port."~" path literal → switched to ${HOME}.MediaMTX 404s were due to:
/hls/..., not /streams/...), andhlsSegmentCount: 7./hls/<CAM>/index.m3u8 for LL-HLS./llhls/ alias if we want a clean split for testing.hlsLowLatencyMaxAge / cache headers fine-tuning.nginx proxies /hls/ to nvr-packager:8888/ (note the trailing slash). Gzip disabled for /hls/ and Accept-Encoding cleared.EXT-X-PART, SERVER-CONTROL, PRELOAD-HINT./hls/.lowLatencyMode: true works when using the same origin (e.g., https://localhost/hls/<CAM>/index.m3u8).cameras.json (guinea pig) REOLINK_OFFICE
"stream_type": "LL_HLS""packager_path": "REOLINK_OFFICE""ll_hls": { ... } block added:
publisher: protocol: "rtsp", host: "nvr-packager", port: 8554, path: "REOLINK_OFFICE".video: ffmpeg-style keys (c:v, r, g, keyint_min, x264-params, vf, etc.).audio: "enabled": false for tight LL (can enable later)."__notes" block added (purely informational).location ^~ /hls/ { proxy_pass http://nvr-packager:8888/; gzip off; proxy_set_header Accept-Encoding ""; proxy_buffering off; proxy_request_buffering off; … }REOLINK_OFFICE path (no camera source:) so the NVR publishes a 1s GOP stream to the packager (RTSP or RTMP — chosen by JSON).ffmpeg_params.py
FFmpegHLSParamBuilder.build_ll_hls_publish_output(ll_hls_cfg) — emits output args for publishing (RTSP or RTMP), fully driven by cameras.json (no hardcoding).Added helpers:
build_ll_hls_input_publish_params(camera_config) → mirrors build_rtsp_input_params (input flags).build_ll_hls_output_publish_params(camera_config, vendor_prefix) → calls the new builder method (output flags).Vendor handlers (Reolink/Unifi/Eufy)
New private method in each:
_build_ll_hls_publish(self, camera_config, rtsp_url) -> (argv, play_url)
["ffmpeg", <input>, "-i", rtsp_url, <output>]play_url = "/hls/<packager_path|serial|id>/index.m3u8"stream_manager.py
start_stream() and _start_stream() now recognize "LL_HLS":
_build_ll_hls_publish(...), spawn publisher, store protocol: "ll_hls" and stream_url./hls/<path>/index.m3u8 for LL-HLS cams.get_stream_url() returns stored stream_url when protocol == "ll_hls".stop_stream() kills the publisher process and skips filesystem cleanup for LL-HLS.app.py
stream_url from start_stream().If camera.stream_type === "LL_HLS":
stream_url as the player src.lowLatencyMode: true and the tight live-edge settings we verified (or auto-tune from SERVER-CONTROL/PART-INF as we did).TARGETDURATION and ~3–5s latency. Publishing a 1s GOP stream (with the chosen encoder settings) lets MediaMTX produce short segments/parts and stay ~1–2s.Sweet—picking up from UI implementation only. Here’s the tight plan (no code yet):
Use the URL the API returns
/api/stream/start/<id>, use res.stream_url as-is for the player source. Don’t reconstruct /streams/... for LL_HLS cams.Detect LL_HLS and init the player accordingly
If camera.stream_type === 'LL_HLS':
lowLatencyMode: true + your tuned settings (or auto-tune from SERVER-CONTROL + PART-INF).Else (classic HLS): keep your existing path.
Keep native fallback
video.canPlayType('application/vnd.apple.mpegurl') is true, set video.src = stream_url (especially on iOS/Safari). Otherwise use hls.js.Hide/adjust controls for LL_HLS
Health badge via playlist probe
#EXT-X-PART count or MEDIA-SEQUENCE increases → show “Live”. If fetch fails or stalls for N intervals → show “Stalled”.Latency readout (tiny overlay)
#EXT-X-PROGRAM-DATE-TIME and show now - PDT as an approximate latency badge (only for LL_HLS). Useful for regressions.Per-camera toggle (optional)
cameras.json (that’s ops-owned). Persist per-user in localStorage if helpful.Edge quirks guardrails
https://<current-origin>/hls/... (no hardcoded hostnames).Accept-Encoding headers from the client (edge already strips them)./hls/ requests.Achieved first successful LL-HLS stream through complete integration of camera → FFmpeg publisher → MediaMTX packager → Browser pipeline. Resolved critical FFmpeg static build segfault bug by reverting to Ubuntu’s native FFmpeg 6.1.1 package. Stream now delivers ~1-2 second latency as designed.
Initial State:
/hls/ → MediaMTXProblem 1: Frontend Not Calling Backend
stream.js only recognized "HLS", "RTMP", "mjpeg_proxy" - not "LL_HLS"|| streamType === 'LL_HLS' condition to use HLS manager for LL_HLS streamsProblem 2: FFmpeg Commands Generated But Streams Failed
/hls/REOLINK_OFFICE/index.m3u8ps aux later (died after check window)Problem 3: RTSP Transport Protocol Mismatch
ffmpeg_params.pyFix: Modified build_ll_hls_publish_output() to read rtsp_transport from ll_hls.publisher config
rtsp_transport = pub.get("rtsp_transport", "tcp")
out += ["-f", "rtsp", "-rtsp_transport", rtsp_transport, sink]
-muxpreload 0 -muxdelay 0 (unnecessary for RTSP output)Problem 4: Python Bytecode Caching
ffmpeg_params.py not taking effect after container restart.pyc files cached, some in read-only /app/config/__pycache__./deploy.sh required for code changes (volume mounts not configured for hot reload)Problem 5: FFmpeg Static Build Segmentation Fault
ffmpeg ... -f rtsp -rtsp_transport udp rtsp://nvr-packager:8554/REOLINK_OFFICE → Segmentation fault (core dumped)Fix: Reverted Dockerfile to use Ubuntu’s native FFmpeg:
RUN apt-get update && apt-get install -y \
curl \
ffmpeg \ # ← Re-enabled native package
nodejs \
npm \
procps \
&& rm -rf /var/lib/apt/lists/*
# Removed: Static FFmpeg download and installation
cameras.json (REOLINK_OFFICE):
{
"stream_type": "LL_HLS",
"ll_hls": {
"publisher": {
"protocol": "rtsp",
"host": "nvr-packager",
"port": 8554,
"path": "REOLINK_OFFICE",
"rtsp_transport": "udp" // ← Critical for low latency
},
"video": {
"c:v": "libx264",
"preset": "veryfast",
"tune": "zerolatency",
"profile:v": "baseline",
"pix_fmt": "yuv420p",
"r": 30,
"g": 15,
"keyint_min": 15,
"b:v": "800k",
"maxrate": "800k",
"bufsize": "1600k",
"x264-params": "scenecut=0:min-keyint=15:open_gop=0",
"force_key_frames": "expr:gte(t,n_forced*1)",
"vf": "scale=640:480"
},
"audio": {
"enabled": false
}
},
"rtsp_input": {
"rtsp_transport": "udp", // ← UDP avoids RTP packet corruption
"timeout": 5000000,
"analyzeduration": 1000000,
"probesize": 1000000,
"use_wallclock_as_timestamps": 1,
"fflags": "nobuffer"
}
}
Working FFmpeg Command:
ffmpeg -rtsp_transport udp -timeout 5000000 -analyzeduration 1000000 \
-probesize 1000000 -use_wallclock_as_timestamps 1 -fflags nobuffer \
-i rtsp://admin:PASSWORD@192.168.10.88:554/h264Preview_01_sub \
-an -c:v libx264 -preset veryfast -tune zerolatency \
-profile:v baseline -pix_fmt yuv420p -r 30 -g 15 -keyint_min 15 \
-b:v 800k -maxrate 800k -bufsize 1600k \
-x264-params scenecut=0:min-keyint=15:open_gop=0 \
-force_key_frames expr:gte(t,n_forced*1) -vf scale=640:480 \
-f rtsp -rtsp_transport udp rtsp://nvr-packager:8554/REOLINK_OFFICE
Stream Flow:
/hls/* to MediaMTX:8888lowLatencyMode: trueManual Verification:
Final Choice: RTSP+UDP for best latency
Browser Playback:
const video = document.createElement('video');
video.controls = true;
video.style.cssText = 'position:fixed;top:10px;right:10px;width:400px;z-index:9999;border:2px solid red';
document.body.appendChild(video);
if (Hls.isSupported()) {
const hls = new Hls({lowLatencyMode: true});
hls.loadSource('/hls/REOLINK_OFFICE/index.m3u8');
hls.attachMedia(video);
hls.on(Hls.Events.MANIFEST_PARSED, () => video.play());
}
Result: ✅ Stream plays with ~1-2 second latency
.pyc files can mask code changes; full rebuild ensures clean stateModified Files:
static/js/streaming/stream.js: Added LL_HLS to stream type routerstreaming/ffmpeg_params.py: Made RTSP transport configurable, removed muxdelay/muxpreloadDockerfile: Reverted to Ubuntu FFmpeg 6.1.1 native packageconfig/cameras.json: Added LL_HLS configuration for REOLINK_OFFICEStream Types Operational:
Performance:
Session completed: October 19, 2025, 06:15 AM
Status: LL-HLS operational with target latency achieved, ready for Amcrest integration
Known Issues:
Initial page load sometimes fails to initialize hls.js properly for LL-HLS streams (readyState: 0, no HLS manager instance). Page reload resolves the issue. Likely race condition between stream start and hls.js initialization or module loading order. Requires investigation of JavaScript initialization sequence in stream.js and hls-stream.js.
After some time UI stream freezes despite logs telling a different story:
nvr-edge | 192.168.10.110 - - [19/Oct/2025:06:25:12 +0000] "POST /api/stream/start/T8441P122428038A HTTP/2.0" 200 191 "https://192.168.10.15/streams" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36" "-"
nvr-edge | 192.168.10.110 - - [19/Oct/2025:06:25:12 +0000] "POST /api/stream/start/REOLINK_OFFICE HTTP/2.0" 200 186 "https://192.168.10.15/streams" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36" "-"
nvr-edge | 192.168.10.110 - - [19/Oct/2025:06:25:12 +0000] "POST /api/stream/start/REOLINK_TERRACE HTTP/2.0" 200 201 "https://192.168.10.15/streams" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36" "-"
nvr-edge | 192.168.10.110 - - [19/Oct/2025:06:25:12 +0000] "POST /api/stream/start/REOLINK_LAUNDRY HTTP/2.0" 200 199 "https://192.168.10.15/streams" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36" "-"
nvr-edge | 192.168.10.110 - - [19/Oct/2025:06:25:12 +0000] "GET /hls/REOLINK_OFFICE/index.m3u8 HTTP/2.0" 404 18 "https://192.168.10.15/streams" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36" "-"
Critical debugging and optimization session that successfully reduced LL-HLS latency from 4-5 seconds felt (2.9s measured) down to 1.0-1.8 seconds through systematic diagnosis and tuning. Resolved paradoxical situation where regular HLS had lower latency than LL-HLS. Implemented comprehensive per-camera player configuration system with hot-reload support. Fixed fullscreen mode failures and UI initialization issues.
Starting problems (early afternoon):
down && up, not just restart)Critical observation:
“Currently, regular HLS mode has lower latency than LL-HLS…”
This indicated fundamental misconfiguration - LL-HLS should ALWAYS be faster than regular HLS.
Initial configuration:
# FFmpeg publisher command
ffmpeg -rtsp_transport udp -r 30 -g 15 -keyint_min 15 \
-f rtsp -rtsp_transport tcp rtsp://nvr-packager:8554/REOLINK_OFFICE
# MediaMTX
hlsSegmentDuration: 1s
hlsSegmentCount: 7 # 7s buffer!
# Player settings
liveSyncDurationCount: 2 # 2s behind live
Root causes identified:
/api/cameras/<id> endpointwindow.multiStreamManager not exposed (couldn’t debug player config)$(document).ready() blocks causing initialization issuesTesting sequence (documenting for future reference):
docker compose restart → ❌ Config not reloadeddocker compose down && up → ✅ Config reloaded successfully./config:/app/config:rwKey finding: Hot-reload works with down && up but NOT with restart alone.
UDP vs TCP publisher testing:
Browser console investigation revealed:
window.multiStreamManager?.fullscreenHls?.config
// Result: undefined - manager not exposed!
Actions taken:
$(document).ready() blockswindow.multiStreamManager globallyAdded backend endpoint:
@app.route('/api/cameras/<camera_id>')
def api_camera_detail(camera_id):
camera = camera_repo.get_camera(camera_id)
return jsonify(camera)
./templates:/app/templates for template hot-reloadResult after fixes:
console.log('Manager exists:', !!window.multiStreamManager); // true
console.log('HLS config:', hls.config.liveSyncDurationCount); // 2
Manager now accessible, but settings still not optimal.
Analysis:
MediaMTX: 7 segments × 1s = 7s theoretical buffer
Measured: 2.9s (player playing ahead of buffer)
Problem: 7s buffer is ridiculous for "low latency"
Changes to packager/mediamtx.yml:
hlsSegmentDuration: 500ms # Changed from 1s
hlsPartDuration: 200ms # Kept (half of segment)
hlsSegmentCount: 7 # Minimum required by MediaMTX
# New buffer: 7 × 500ms = 3.5s
FFmpeg GOP alignment (cameras.json):
"r": 30,
"g": 7, // Changed from 15 (7 frames @ 30fps = 233ms)
"keyint_min": 7, // Match g for fixed GOP
Results:
Key insight: GOP (233ms) now fits cleanly in segment (500ms), allowing MediaMTX to cut segments properly.
Problem: No way to configure hls.js per-camera from cameras.json.
Architecture implemented:
"player_settings": {
"hls_js": {
"enableWorker": true,
"lowLatencyMode": true,
"liveSyncDurationCount": 1,
"liveMaxLatencyDurationCount": 2,
"maxLiveSyncPlaybackRate": 1.5,
"backBufferLength": 5
}
}
/api/cameras/<camera_id> returns full camera configcamera_repo.get_camera(camera_id) methodasync getCameraConfig(cameraId) {
const response = await fetch(`/api/cameras/${cameraId}`);
return await response.json();
}
buildHlsConfig(cameraConfig, isLLHLS) {
const defaults = isLLHLS ? {
liveSyncDurationCount: 1, // Aggressive
liveMaxLatencyDurationCount: 2
} : {
liveSyncDurationCount: 3, // Conservative
liveMaxLatencyDurationCount: 5
};
return { ...defaults, ...cameraConfig?.player_settings?.hls_js };
}
constructor() {
this.hlsManager = new HLSStreamManager();
// Reuse HLS manager methods for fullscreen
this.getCameraConfig = (id) => this.hlsManager.getCameraConfig(id);
this.buildHlsConfig = (cfg, isLL) => this.hlsManager.buildHlsConfig(cfg, isLL);
}
Player settings applied:
"liveSyncDurationCount": 1, // From 2
"liveMaxLatencyDurationCount": 2, // From 3
"backBufferLength": 5 // From 10
Verification in console:
const hls = window.multiStreamManager?.fullscreenHls;
console.log('liveSyncDurationCount:', hls.config.liveSyncDurationCount); // 1
console.log('liveMaxLatencyDurationCount:', hls.config.liveMaxLatencyDurationCount); // 2
Results:
Goal: Push to MediaMTX architectural limits.
Observation: Latency at 1.4-1.8s with 500ms segments, but could we go lower?
Final MediaMTX configuration:
hlsSegmentDuration: 200ms # Minimum supported by MediaMTX
hlsPartDuration: 100ms # Always half of segment
hlsSegmentCount: 7 # Cannot go below 7
hlsAlwaysRemux: yes # Stable timing
# New buffer: 7 × 200ms = 1.4s minimum
Final FFmpeg configuration:
"r": 15, // Reduced from 30fps
"g": 3, // 3 frames @ 15fps = 200ms (matches segment!)
"keyint_min": 3,
"x264-params": "scenecut=0:min-keyint=3:open_gop=0"
Rationale for 15fps:
Final player configuration:
"player_settings": {
"hls_js": {
"liveSyncDurationCount": 0.5, // 0.5 × 200ms = 100ms behind
"liveMaxLatencyDurationCount": 1.5, // Max 300ms drift
"maxLiveSyncPlaybackRate": 2.0, // Faster catchup
"backBufferLength": 3 // Minimal buffer
}
}
Interesting observation:
“Previous settings: 1.0-2.0s, now: 1.5-2.3s after first change”
Settings initially made latency WORSE! This indicated player wasn’t keeping up with 200ms segments using old settings.
After ultra-aggressive player settings:
“Final result: 1.0-1.8s”
Success! Player now properly synchronized with rapid 200ms segments.
Problem: REOLINK_OFFICE fullscreen immediately closed with error.
Root cause analysis:
// Error in console
ReferenceError: startInfo is not defined
Issue: startInfo referenced before definition due to scope error.
Fix applied:
async openFullscreen(serial, name, cameraType, streamType) {
if (streamType === 'HLS' || streamType === 'LL_HLS' || streamType === 'NEOLINK' || streamType === 'NEOLINK_LL_HLS') {
const response = await fetch(`/api/stream/start/${serial}`, {...});
// Fetch stream metadata from backend after starting.
// Returns: { protocol: 'll_hls'|'hls'|'rtmp', stream_url: '/hls/...' or '/api/streams/...', camera_name: '...' }
// This tells us what the backend ACTUALLY started (vs what's configured in cameras.json)
// Used to determine the correct playlist URL and verify the stream type matches expectations.
const startInfo = await response.json().catch(() => ({}));
// Choose correct URL based on what backend started
let playlistUrl;
if (startInfo?.stream_url?.startsWith('/hls/')) {
playlistUrl = startInfo.stream_url; // LL-HLS from MediaMTX
} else {
playlistUrl = `/api/streams/${serial}/playlist.m3u8?t=${Date.now()}`;
}
// Get camera config and build player settings
const cameraConfig = await this.getCameraConfig(serial);
const isLLHLS = cameraConfig?.stream_type === 'LL_HLS';
const hlsConfig = this.buildHlsConfig(cameraConfig, isLLHLS);
this.fullscreenHls = new Hls(hlsConfig);
// ...
}
}
Additional fixes:
else if (streamType === 'RTMP') {
this.fullscreenFlv = flvjs.createPlayer({
type: 'flv',
url: `/api/camera/${serial}/flv?t=${Date.now()}`,
isLive: true
});
}
destroyFullscreenFlv() for RTMP streamscloseFullscreen() to handle all typesResult: Fullscreen working for all stream types (HLS, LL-HLS, RTMP, MJPEG).
Problem: CSS element visible but no values displayed.
Root cause: Latency meter code working, but initialization timing issue.
Fix: Already included in _attachLatencyMeter() and _attachFullscreenLatencyMeter() methods in HLSStreamManager.
Verification:
__notes SystemAdded comprehensive inline documentation to cameras.json:
Example documentation style:
"g": {
"value": 3,
"description": "GOP (Group of Pictures) size in frames",
"calculation": "3 frames ÷ 15 fps = 200ms GOP",
"critical": "Must be ≤ segment duration for clean cuts",
"must_match_keyint_min": "Set g = keyint_min for fixed GOP"
}
Neutral architecture documentation:
"architecture": {
"flow": {
"LL_HLS": "Camera RTSP → FFmpeg Publisher → MediaMTX → Edge → Browser",
"HLS": "Camera RTSP → FFmpeg Transcoder → Edge → Browser",
"RTMP": "Camera RTSP → FFmpeg Transcoder → Edge → Browser (flv.js)"
}
}
Complete working configuration:
packager/mediamtx.yml:
hls: yes
hlsAddress: :8888
hlsVariant: lowLatency
hlsSegmentCount: 7 # Minimum required (cannot reduce)
hlsSegmentDuration: 200ms # Minimum supported
hlsPartDuration: 100ms # Half of segment
hlsAllowOrigin: "*"
hlsAlwaysRemux: yes
cameras.json (REOLINK_OFFICE):
{
"stream_type": "LL_HLS",
"packager_path": "REOLINK_OFFICE",
"player_settings": {
"hls_js": {
"enableWorker": true,
"lowLatencyMode": true,
"liveSyncDurationCount": 0.5,
"liveMaxLatencyDurationCount": 1.5,
"maxLiveSyncPlaybackRate": 2.0,
"backBufferLength": 3
}
},
"ll_hls": {
"publisher": {
"protocol": "rtsp",
"host": "nvr-packager",
"port": 8554,
"path": "REOLINK_OFFICE",
"rtsp_transport": "tcp"
},
"video": {
"c:v": "libx264",
"preset": "veryfast",
"tune": "zerolatency",
"profile:v": "baseline",
"pix_fmt": "yuv420p",
"r": 15,
"g": 3,
"keyint_min": 3,
"b:v": "800k",
"maxrate": "800k",
"bufsize": "1600k",
"x264-params": "scenecut=0:min-keyint=3:open_gop=0",
"force_key_frames": "expr:gte(t,n_forced*1)",
"vf": "scale=640:480"
},
"audio": {
"enabled": false
}
}
}
Measured results:
Latency breakdown:
MediaMTX buffer: 1.4s (7 × 200ms segments)
Player offset: 0.1s (0.5 × 200ms)
Network/processing: 0-0.4s (variance)
──────────────────────────
Total measured: 1.0-1.8s
Critical blockers:
hlsSegmentCount >= 7docker compose restart does NOT reload configdocker compose down && upThe paradox explained:
Regular HLS pipeline:
Camera → FFmpeg → Disk → NGINX → Browser
Latency: 0.5-1s segments, no intermediate transcoding
Initial LL-HLS pipeline:
Camera → FFmpeg → MediaMTX (7×1s buffer) → NGINX → Browser
Latency: Extra transcoding hop + 7s buffer = HIGHER than regular!
The fix:
Camera → FFmpeg → MediaMTX (7×200ms buffer) → NGINX → Browser
Latency: Extra hop offset by aggressive segmentation = LOWER than regular
Key insights:
#EXT-X-PART support (FFmpeg can’t do this)Why FFmpeg can’t do LL-HLS directly:
ffmpeg -hls_partial_duration 0.2 ...
# Error: Unrecognized option 'hls_partial_duration'
#EXT-X-PART support in Debian/Ubuntu buildsGOP alignment mathematics:
15fps stream:
- GOP of 3 frames = 3 ÷ 15 = 0.200s = 200ms ✓
- Matches segment duration exactly
- Clean cuts at segment boundaries
30fps stream (previous):
- GOP of 7 frames = 7 ÷ 30 = 0.233s = 233ms
- Fits in 500ms segments but not 200ms
- Would need GOP of 3 frames (100ms) for 200ms segments at 30fps
Player aggressiveness trade-offs:
Conservative (Regular HLS):
"liveSyncDurationCount": 3 // 3 segments behind = safe
"liveMaxLatencyDurationCount": 5 // Allow 5 segments drift
Aggressive (LL-HLS):
"liveSyncDurationCount": 0.5 // 0.5 segments = risky
"liveMaxLatencyDurationCount": 1.5 // Tight tolerance
Trade-off: Lower latency vs rebuffering risk
Why 15fps is optimal:
Per LL-HLS stream (final config):
Comparison: 30fps → 15fps:
| Metric | 30fps | 15fps | Savings |
|---|---|---|---|
| Bandwidth | 800 kbps | 400 kbps | 50% |
| CPU | 6-8% | 4-5% | ~35% |
| Latency | Same (GOP aligned) | Same | 0% |
| Quality | Imperceptible difference for surveillance | - | - |
Skills practiced:
Mistakes made and fixed:
startInfo variablerestart (learned: needs down && up)Best debugging moment:
“Previous settings: 1.0-2.0s, now 1.5-2.3s… wait, that’s worse!”
Realized more aggressive segments need more aggressive player settings. Adjusted and got 1.0-1.8s. Measuring and iterating works!
This is NOT production-ready (and that’s okay):
But we learned a TON, and that’s the whole point! 🎓
Immediate:
player_settings to all camerasShort-term:
Medium-term:
Long-term (if actually wanted production):
feat: LL-HLS optimization pipeline (4.5s → 1.0-1.8s latency)
Critical fixes:
- Resolve paradox: regular HLS faster than LL-HLS
- Reduce MediaMTX segments: 1s → 200ms (minimum)
- Optimize FFmpeg GOP: 15fps @ 3 frames = 200ms alignment
- Implement per-camera player settings system
- Fix fullscreen mode for all stream types
- Add /api/cameras/<id> endpoint for config retrieval
- Restore latency counter display
- Document complete configuration in __notes
Architecture:
- Smart defaults by stream_type (LL_HLS vs HLS)
- Camera-specific overrides via player_settings.hls_js
- Hot-reload support (docker compose down && up)
- Code reuse between tile/fullscreen via arrow functions
Results:
- Measured latency: 1.0-1.8s (avg 1.4s)
- Bandwidth: 50% reduction (15fps vs 30fps)
- CPU: 30-40% reduction per stream
- Stable over testing period
Known issues:
- UDP publisher still freezes (TCP workaround adds ~1s)
- Initial load race condition (reload fixes)
- Latency degradation over time (monitoring needed)
This is a personal training project, not production-ready.
See README_project_history.md for complete session notes.
Session Duration: ~6 hours (early afternoon through evening)
Coffee consumed: Probably too much ☕
Power wasted: Definitely too much 🔌
Knowledge gained: Priceless! 🧠
Diagnosed and resolved streaming issues with Reolink TERRACE camera (192.168.10.89) through systematic hardware troubleshooting. Root cause identified as corroded RJ45 contacts from outdoor exposure. Discovered Reolink’s proprietary Baichuan protocol (port 9000) and open-source Neolink bridge for ultra-low-latency streaming.
Initial Symptoms:
Invalid data found when processing inputCould not find codec parameters for stream 0 (Video: h264, none): unspecified sizeInitial Hypotheses Tested:
Diagnostic Evidence:
# Before cleaning - corrupted stream metadata
Stream #0:0: Video: h264, none, 90k tbr, 90k tbn
[rtsp @ 0x...] Could not find codec parameters
# After cleaning with 90% isopropyl alcohol
Stream #0:0: Video: h264 (High), yuv420p(progressive), 640x480, 90k tbr, 90k tbn
# Stream working perfectly!
Network Topology:
Resolution:
Packet Capture Analysis:
Used Wireshark on Windows native Reolink app to discover actual protocol:
# Captured from 192.168.10.110 (Windows PC) → 192.168.10.89 (camera)
sudo tcpdump -r capture.pcap -nn | grep -oP '192\.168\.10\.89\.\K[0-9]+'
Results:
- Port 9000: ✅ Primary traffic (proprietary "Baichuan" protocol)
- Port 554 (RTSP): ❌ Not used by native app
- Port 1935 (FLV): ❌ Not used
- Port 80 (HTTP): ❌ Not used
Native App Latency: ~100-300ms (near real-time)
Our RTSP Latency: ~1-2 seconds (acceptable but not ideal)
Protocol Details:
Discovery: Open-source project already exists to bridge Baichuan → RTSP
Project: Neolink (actively maintained fork)
Architecture:
[Reolink Camera:9000] ←Baichuan→ [Neolink:8554] ←RTSP→ [NVR/FFmpeg] ←HLS→ [Browser]
Proprietary Bridge/Proxy Your existing stack
Expected latency: ~600ms-1.5s (vs current 1-2s)
What Neolink Does:
Phase 1: Neolink Installation & Testing
~/neolink/cargo build --release (5-15 min compile time)~/0_NVR/config/neolink.tomlPhase 2: Integration Strategy
reolink_stream_handler.py to use Neolink RTSP endpointDockerfile to include Rust/Neolink binaryPhase 3: Production Deployment
Guinea Pig Selection: Camera .88 (REOLINK_OFFICE @ 192.168.10.88)
Files to modify for Neolink integration:
Dockerfile - Add Rust build stage and Neolink binarydocker-compose.yml - Expose port 8554 for Neolink RTSP~/0_NVR/config/neolink.toml - New config file for Neolinkstreaming/handlers/reolink_stream_handler.py - Update to use localhost:8554Session completed: October 22, 2025, 11:45 PM EDT
Status: Camera .89 fixed (hardware), Neolink integration ready to begin
Continuation: Next chat will cover Neolink build, Docker integration, and latency testing
Key Achievement: Reduced troubleshooting time from days to hours through systematic hypothesis testing and creative thinking about “shitty outdoor wiring since 2022” 🎯
---
## Transition Note for Next Chat
**Resume with:**
Continuing Neolink integration for Reolink cameras. Last session: fixed camera .89 via RJ45 cleaning, discovered Baichuan protocol (port 9000), cloned Neolink repo, installed Rust.
Next steps:
Current status: Ready to build, taking it one step at a time.
See also: Neolink Integration Plan (DOCS/README_neolink_integration_plan.md)
Planned integration of Neolink bridge for Reolink cameras to reduce latency from ~1-2s to ~600ms-1.5s using proprietary Baichuan protocol (port 9000). Created comprehensive integration scripts and documentation. Build failed due to missing system dependencies.
Current Flow:
Camera:554 (RTSP) -> FFmpeg -> HLS -> Browser (~1-2s latency)
Target Flow:
Camera:9000 (Baichuan) -> Neolink:8554 (RTSP) -> FFmpeg -> HLS -> Browser (~600ms-1.5s)
update_neolink_configuration.sh (~/0_NVR/)
config/neolink.toml from cameras.jsonstream_type: "NEOLINK"$REOLINK_USERNAME, $REOLINK_PASSWORD)jq for JSON parsingNEOlink_integration.sh (~/0_NVR/0_MAINTENANCE_SCRIPTS/)
Issue 1: Missing C Compiler
error: linker `cc` not found
Solution: Install build-essential
sudo apt-get install -y build-essential pkg-config libssl-dev
Issue 2: Missing GStreamer RTSP Server (BLOCKING)
The system library `gstreamer-rtsp-server-1.0` required by crate `gstreamer-rtsp-server-sys` was not found.
Solution Required:
sudo apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-rtsp
Backend Changes:
reolink_stream_handler.py: Check stream_type, route to localhost:8554/{serial}/{stream} for NEOLINKstream_manager.py: Add NEOLINK to valid stream typescameras.json: Add "neolink" section with baichuan_port, rtsp_path, enabledFrontend Changes:
stream.js: Add NEOLINK to HLS routing (lines ~299, 321, 240)Docker Integration:
Dockerfile: Copy neolink binary + configdocker-compose.yml: Expose port 8554 internally~/0_NVR/update_neolink_configuration.sh (NEW)~/0_NVR/0_MAINTENANCE_SCRIPTS/NEOlink_integration.sh (NEW)~/0_NVR/neolink/ (cloned from GitHub)~/0_NVR/neolink_integration_updates.md (design doc)Blocked: Neolink build failing due to missing GStreamer RTSP server library
Ready: Scripts and architecture designed
Pending: System dependency installation, then continue with Step 1
Session ended: October 23, 2025 Continuation: Install GStreamer deps, complete build, test standalone
Successfully completed Steps 1-3 of Neolink integration. Built Neolink binary from source, generated configuration for two Reolink cameras, and validated standalone RTSP bridge functionality. Ready for backend Python integration (Step 4).
Reduce Reolink camera streaming latency from ~1-2 seconds (direct RTSP) to ~600ms-1.5s using Neolink bridge with Baichuan protocol (Reolink’s proprietary protocol on port 9000).
Challenge: Rust cargo build failed due to missing GStreamer system dependencies
Errors Encountered:
error: failed to run custom build command for `gstreamer-sys v0.23.0`
The system library `gstreamer-rtsp-server-1.0` required by crate `gstreamer-rtsp-server-sys` was not found
Resolution Process:
NEOlink_integration.sh
gstreamer-rtsp-server-1.0
pkg-config --list-all | grep gstreamerlibgstrtspserver-1.0-dev
Build success:
Finished `release` profile [optimized] target(s) in 1m 01s
Binary: /home/elfege/0_NVR/neolink/target/release/neolink (17MB)
Version: Neolink v0.6.3.rc.2-28-g6e05e78 release
Script Improvements:
check_gstreamer_dependencies() function in NEOlink_integration.shFiles Modified:
NEOlink_integration.sh: Added GStreamer dependency check function (lines 93-166)Challenge: Script had multiple issues preventing config generation
Issues Fixed:
pkill -9 "${BASH_SOURCE[1]}" at line 92cd "$SCRIPT_DIR/.." going to /home/elfege/exit 1cd "$SCRIPT_DIR" to stay in /home/elfege/0_NVR/.devices | to_entries[]cameras.json has cameras at root level, not in .devices wrapper.devices from query.devices wrapper (line 7).devices | to jq query + added safe navigation with ? operatorUI_HEALTH_* settings) at end of JSON caused jq to failto_entries[]select(.value | type == "object" and has("stream_type")...)Working jq Query:
jq -r '.devices | to_entries[] |
select(.value.stream_type? == "NEOLINK" and .value.type? == "reolink") |
@json' cameras.json
Configuration Generated:
~/0_NVR/config/neolink.tomlget_cameras_credentials) in password${REOLINK_PASSWORD} env var expansionFiles Modified:
update_neolink_configuration.sh:
Challenge: RTSP server failed to bind to port 8554
Initial Symptoms:
[INFO] Starting RTSP Server at 0.0.0.0:8554:8554 # Note: double port!
# But: netstat -tlnp | grep 8554 → (empty, not listening)
Root Cause: Neolink config parser bug
bind = "0.0.0.0:8554"0.0.0.0:8554:8554 (malformed)Solution: Changed bind format in neolink.toml
# Before (failed):
bind = "0.0.0.0:8554"
# After (working):
bind = "0.0.0.0"
bind_port = 8554
Validation Tests:
Port listening confirmed:
$ sudo lsof -i :8554
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
neolink 3603740 elfege 10u IPv4 711264660 0t0 TCP *:8554 (LISTEN)
Baichuan connection successful:
[INFO] REOLINK_OFFICE: TCP Discovery success at 192.168.10.88:9000
[INFO] REOLINK_OFFICE: Connected and logged in
[INFO] REOLINK_OFFICE: Model RLC-410-5MP
[INFO] REOLINK_OFFICE: Firmware Version v3.0.0.2356_23062000
[INFO] REOLINK_OFFICE: Available at /REOLINK_OFFICE/main, /REOLINK_OFFICE/mainStream...
RTSP stream validation:
$ ffmpeg -rtsp_transport tcp -i rtsp://localhost:8554/REOLINK_OFFICE/main -t 5 -f null -
Input #0, rtsp, from 'rtsp://localhost:8554/REOLINK_OFFICE/main':
Stream #0:0: Video: h264 (High), yuv420p(progressive), 2560x1920, 30 fps
Stream #0:1: Audio: pcm_s16be, 16000 Hz, stereo, 512 kb/s
frame=120 fps=22 q=-0.0 Lsize=N/A time=00:00:04.99 bitrate=N/A speed=0.913x
Stream Specifications Confirmed:
Files Modified:
update_neolink_configuration.sh: Updated bind format generation (line ~139)neolink.toml: Manual fix applied (to be regenerated by script)Camera:9000 (Baichuan) → Neolink:8554 (RTSP) → [Ready for FFmpeg integration]
↓ ↓
192.168.10.88 localhost:8554
TCP Discovery Available paths:
Logged in ✅ - /REOLINK_OFFICE/main
H.264 5MP 30fps - /REOLINK_OFFICE/mainStream
- /REOLINK_TERRACE/main
- /REOLINK_TERRACE/mainStream
stream_type: "LL_HLS" (direct RTSP)stream_type: "NEOLINK" (Baichuan protocol)stream_type: "LL_HLS" (direct RTSP)stream_type: "NEOLINK" (Baichuan protocol)~/0_NVR/neolink/target/release/neolink (17MB binary)~/0_NVR/config/neolink.toml (auto-generated configuration)~/0_NVR/0_MAINTENANCE_SCRIPTS/NEOlink_integration.sh
check_gstreamer_dependencies() function~/0_NVR/update_neolink_configuration.sh
.devices wrapperbind = "0.0.0.0" + bind_port = 8554~/0_NVR/config/cameras.json
stream_type from “LL_HLS” to “NEOLINK”stream_type from “LL_HLS” to “NEOLINK”Hardware:
Software:
Pending: Update Python stream handlers to route NEOLINK cameras to Neolink bridge
Files to modify:
reolink_stream_handler.py
stream_type in build_rtsp_url()rtsp://localhost:8554/{serial}/mainStreamstream_manager.py
ffmpeg_params.py
Pending: Update browser stream routing
Files to modify:
stream.js
Pending: Package Neolink into unified-nvr container
Tasks:
Dockerfile
/usr/local/bin/neolink/app/config/neolink.tomldocker-compose.yml
Pending: End-to-end integration testing
Test plan:
Pending: Rollout to production
Deployment order:
Issue: Configuration file contains plaintext passwords with special characters Impact: Medium - file is in ~/0_NVR/config/ (not in Docker image, not in git) Options to investigate:
password = "${REOLINK_PASSWORD}"Notice: System has pending kernel upgrade (6.8.0-85 → 6.8.0-86) Impact: None on current work Action: Reboot when convenient (after Docker integration complete)
Notice: needrestart flagged Docker for restart
Impact: None - will restart on reboot
Action: No immediate action needed
reolink_stream_handler.pyDocumentation:
~/0_NVR/README_neolink_integration_plan.md~/0_NVR/0_MAINTENANCE_SCRIPTS/NEOlink_integration.sh~/0_NVR/update_neolink_configuration.shKey Commits/Changes:
Session End: October 23, 2025 @ 19:37 (ready to resume at Step 4)
Integrate Neolink bridge for Reolink cameras to reduce latency from ~2-3s to near real-time (~300ms-1s) using Baichuan protocol (port 9000).
Dockerfile.neolink with GStreamer runtime dependenciesneolink service to docker-compose.ymlnvr-net network./neolink/target/release/neolink./config/neolink.tomlreolink_stream_handler.py:
_build_NEOlink_url() methodrtsp://neolink:8554/{serial}/mainStreamstream_manager.py:
stream.js (6 locations):
|| streamType === 'NEOLINK' to treat as HLS streamscameras.json for REOLINK_OFFICE and REOLINK_TERRACE:
"stream_type": "NEOLINK""neolink": {"port": 8554}rtsp://neolink:8554/REOLINK_OFFICE/mainStreamBuffer full on vidsrc pausing streamdocker-compose.yml - Added neolink serviceDockerfile.neolink - New file with GStreamer depsreolink_stream_handler.py - Added _build_NEOlink_url()stream_manager.py - Added NEOLINK to stream type checksstream.js - Added NEOLINK to 6 conditional checkscameras.json - Changed stream_type to NEOLINK for 2 cameras"MJPEG" stream type with generic stream proxy (not snapshot-based like current mjpeg_proxy)Camera:9000 → Neolink:8554 (MJPEG) → Browser (~300ms expected)neolink for DNS resolutionstream_type back to "LL_HLS" in cameras.json for REOLINK camerasBottom Line: Neolink integration is 90% complete but hitting buffer/performance issues. The MJPEG direct proxy approach may be the breakthrough solution.
Project: Unified NVR System (Python Flask backend + JavaScript frontend)
Hardware: Dell PowerEdge R730xd running Proxmox, 12+ cameras (UniFi, Eufy, Reolink)
Primary Goal: Reduce Reolink camera latency from 2-4s to sub-1s using Neolink bridge
Camera:554 (RTSP) → FFmpeg → MediaMTX → HLS → Browser
app.py - Flask backendstream_manager.py - Core streaming logicreolink_stream_handler.py - Reolink-specific handlerstream.js - Frontend stream management (jQuery-based)cameras.json - Camera configurationdocker-compose.yml - Container orchestrationneolink service to docker-compose.yml (lines 137-147)Dockerfile.neolink with GStreamer dependencies./neolink/target/release/neolink./config/neolink.tomlnvr-net networkreolink_stream_handler.py:
_build_NEOlink_url() method (lines 105-126)rtsp://neolink:8554/{serial}/mainstream_type configurationstream_manager.py:
NEOLINK to stream type handling (line 456)NEOLINK_LL_HLS for LL-HLS publisher path (line 391)stream.js - Added checks in 6 locations:
cameras.json:
"stream_type": "NEOLINK" for both cameras"neolink": {"port": 8554} sectionconfig/neolink.toml (auto-generated):
bind = "0.0.0.0", bind_port = 8554buffer_size = 100 (default)stream = "mainStream"generate_neolink_config.sh:
neolink.toml from cameras.jsonstream_type="NEOLINK"start.sh during deployment[2025-10-24T07:09:44Z INFO neolink::rtsp::factory] Buffer full on vidsrc pausing stream until client consumes frames
[2025-10-24T07:11:33Z INFO neolink::rtsp::factory] Failed to send to source: App source is closed
Root Cause: Chain too slow
Camera:9000 → Neolink:8554 → FFmpeg → MediaMTX (LL-HLS) → Browser
buffer_size parameter in configAdded to cameras.json rtsp_input:
"buffer_size": 20000000, // 20MB
"rtsp_transport": "tcp", // Force TCP
"max_delay": 5000000
Result: No improvement
Changed buffer_size = 20 in neolink.toml
Result: Failed faster (as expected)
Research findings:
| Method | Latency | Status | Notes |
|---|---|---|---|
| Direct RTSP → HLS | 2-4s | ✅ Works | Baseline, acceptable |
| Direct RTSP → LL-HLS | 1.8s | ✅ Works | Best achieved |
| Neolink → HLS | 2.8s | ✅ Works | No improvement over direct |
| Neolink → LL-HLS | FAILS | ❌ Crashes | Buffer overflow |
Added NEOLINK_LL_HLS as dedicated stream type for testing:
if protocol == 'LL_HLS' or protocol == 'NEOLINK_LL_HLS':
# LL-HLS publisher path
Allows switching between modes in cameras.json:
"LL_HLS" - Direct RTSP with LL-HLS (1.8s) ✅"NEOLINK" - Neolink with regular HLS (2.8s)"NEOLINK_LL_HLS" - Neolink with LL-HLS (fails)"HLS" - Direct RTSP with regular HLS (2-4s)Browser-based HLS streaming unavoidable delays:
Total minimum: ~1.5-2.0s
Optimal configuration:
"stream_type": "LL_HLS",
"rtsp_input": {
"rtsp_transport": "tcp",
"timeout": 5000000,
"analyzeduration": 1000000,
"probesize": 1000000,
"use_wallclock_as_timestamps": 1,
"fflags": "nobuffer"
}
Achieves:
“Now the JS client could use mjpeg urls directly. I think REOLINK has an mjpeg api.”
Reolink cameras (RLC-410-5MP) may support native MJPEG via HTTP API:
mjpeg_proxy infrastructure (mjpeg-stream.js)http://camera/cgi-bin/api.cgi?...unifi_mjpeg_capture_service.py - Current snapshot-based systemmjpeg-stream.js - Frontend MJPEG playerArchitecture comparison
Option A (Current): Camera → FFmpeg → HLS → Browser (1.8s)
Option B (MJPEG): Camera → HTTP MJPEG → Browser (~500ms-1s?)
/mnt/project/unifi_mjpeg_capture_service.py - Current MJPEG implementation/mnt/project/mjpeg-stream.js - Frontend MJPEG playerdocker-compose.yml - Neolink service (lines 137-147)reolink_stream_handler.py - _build_NEOlink_url() methodstream_manager.py - NEOLINK/NEOLINK_LL_HLS handlingstream.js - 6 locations with NEOLINK checksgenerate_neolink_config.sh - Configuration generatorAll Neolink changes can be safely removed since:
REOLINK_USERNAME=admin
REOLINK_PASSWORD=<from get_cameras_credentials>
/mnt/project/tree.txtREADME_project_history.mdCurrent stable state: Direct RTSP + LL-HLS @ 1.8s latency
Next exploration: Native MJPEG from Reolink cameras
Goal: Achieve <1s latency by bypassing HLS segmentation entirely
Successfully implemented direct MJPEG streaming from Reolink cameras to browser, bypassing FFmpeg entirely. Achieved sub-second latency (~200-400ms) by polling camera’s Snap API and serving multipart/x-mixed-replace stream. Implementation complete but requires optimization for multi-client support.
Reduce Reolink camera streaming latency below the 1.8s achieved with LL-HLS by eliminating FFmpeg transcoding and HLS segmentation overhead.
Stream Flow:
Camera Snap API (HTTP) → Python Generator (Flask) → Browser <img> tag
Latency: ~200-400ms (vs 1.8s with LL-HLS)
Key Design Decisions:
/cgi-bin/api.cgi?cmd=Snap endpoint at configurable FPS (default 10)multipart/x-mixed-replace MJPEG stream directly to browser<img> element instead of <video> element1. app.py - New Flask Route
@app.route('/api/reolink/<camera_id>/stream/mjpeg')
def api_reolink_stream_mjpeg(camera_id):
requests.Session() for connection reuseREOLINK_API_USER / REOLINK_API_PASSWORDDependencies Added:
requirements.txt: Added requests library2. stream_manager.py - MJPEG Skip Logic (line ~347)
if protocol == 'MJPEG':
logger.info(f"Camera {camera_name} uses MJPEG snap proxy - skipping FFmpeg stream startup")
return None
stream_type == "MJPEG"3. mjpeg-stream.js - Camera Type Routing (line 14-23)
async startStream(cameraId, streamElement, cameraType) {
if (cameraType === 'reolink') {
mjpegUrl = `/api/reolink/${cameraId}/stream/mjpeg?t=${Date.now()}`;
} else if (cameraType === 'unifi') {
mjpegUrl = `/api/unifi/${cameraId}/stream/mjpeg?t=${Date.now()}`;
}
cameraType parameter (required)/api/reolink/ or /api/unifi/ based on camera type4. stream.js - 5 Locations Updated
cameraType parameter to mjpegManager.startStream() call'MJPEG' to condition alongside 'mjpeg_proxy''MJPEG' to health monitor attachment'MJPEG' to stopIndividualStream()'MJPEG' to restartStream()5. streams.html - Template Update (line 76)
{% if info.stream_type == 'MJPEG' or info.stream_type == 'mjpeg_proxy' %}
<img class="stream-video" style="object-fit: cover; width: 100%; height: 100%;" alt="MJPEG Stream">
{% else %}
<video class="stream-video" muted playsinline></video>
{% endif %}
'MJPEG' stream type uses <img> element (critical for multipart streams)6. cameras.json - New Configuration Section
"stream_type": "MJPEG",
"mjpeg_snap": {
"enabled": true,
"width": 640,
"height": 480,
"fps": 10,
"timeout_ms": 5000,
"snap_type": "sub"
}
Parameters:
enabled: Toggle MJPEG modewidth/height: JPEG resolution (min 640x480 per Reolink API)fps: Polling rate (10 = 100ms interval)timeout_ms: HTTP request timeoutsnap_type: “sub” (substream) or “main” (mainstream)7. AWS Secrets Manager - New Credentials
push_secret_to_aws REOLINK_CAMERAS '{"REOLINK_USERNAME":"admin","REOLINK_PASSWORD":"xxx","REOLINK_API_USER":"api-user","REOLINK_API_PASSWORD":"RataMinHa5564"}'
TarTo56))#FatouiiDRtu) caused URL encoding issues with Reolink APIREOLINK_API_* first, falls back to REOLINK_*Problem: Main Reolink password contains special characters ))# that broke API authentication when URL-encoded
Error: "invalid user", rspCode: -27
URL: ...&password=TarTo56%29%29%23FatouiiDRtu...
Solution: Created dedicated API user with simple password (api-user / RataMinHa5564)
cameraType ParameterError: Unsupported camera type for MJPEG: undefined
Root Cause: stream.js wasn’t passing cameraType to mjpegManager.startStream()
Fix: Added third parameter to call (line 298)
Error: MJPEG stream failed to load (using <video> instead of <img>)
Root Cause: streams.html only checked for 'mjpeg_proxy', not 'MJPEG'
Fix: Updated Jinja2 condition to include both stream types
Symptom: Backend fetching 141-byte responses instead of 45KB JPEGs Cause: Invalid credentials causing JSON error response Resolution: Fixed credentials, confirmed 45KB JPEGs at 10 FPS
Latency Comparison:
| Method | Latency | Status | Notes |
|---|---|---|---|
| Direct RTSP → LL-HLS | 1.8s | ✅ | Previous best |
| MJPEG Snap Polling | ~200-400ms | ✅ | New implementation |
Bandwidth (640x480 @ 10 FPS):
Backend Performance:
[MJPEG] Frame fetch: HTTP 200, size=45397 bytes (frame 1)
[MJPEG] Frame fetch: HTTP 200, size=45322 bytes (frame 2)
[MJPEG] Frame fetch: HTTP 200, size=45251 bytes (frame 3)
Current Behavior: Each browser client creates a separate generator thread Issue: N clients = N camera connections = resource multiplication Impact:
Required Fix: Implement single-capture, multi-client architecture like unifi_mjpeg_capture_service.py
# Pattern from UniFi MJPEG implementation:
class UNIFIMJPEGCaptureService:
- Single capture thread per camera
- Shared frame buffer
- Client count tracking
- Automatic cleanup when last client disconnects
Implementation Plan:
reolink_unifi_mjpeg_capture_service.py (similar to UniFi version)Frontend Stream Type Detection:
if (streamType === 'MJPEG' || streamType === 'mjpeg_proxy') {
// Use MJPEG manager
}
Backend Stream Type Skip:
if protocol == 'MJPEG':
return None # Skip FFmpeg
Camera Type Routing:
if (cameraType === 'reolink') {
url = `/api/reolink/${id}/stream/mjpeg`;
} else if (cameraType === 'unifi') {
url = `/api/unifi/${id}/stream/mjpeg`;
}
requests library (Python) - HTTP client for camera API pollingStatus: ✅ Working with sub-second latency
Next Priority: Implement single-capture multi-client service to prevent resource multiplication
Performance: Excellent latency, needs optimization for scalability
Implemented single-capture, multi-client architecture for Reolink MJPEG streaming to prevent resource multiplication. Successfully deployed separate sub/main stream configurations for grid vs fullscreen modes. Discovered Reolink Snap API has ~1-2 FPS hardware limitation regardless of requested FPS.
Prevent N browser clients from creating N camera connections when viewing Reolink MJPEG streams. Implement quality switching between grid mode (low-res sub stream) and fullscreen mode (higher-res main stream).
Service Pattern:
Single Capture Thread → Shared Frame Buffer → Multiple Client Generators
- One camera connection regardless of viewer count
- Automatic cleanup when last client disconnects
- Thread-safe frame buffer with locking
Stream Quality Switching:
/api/reolink/<id>/stream/mjpeg → sub stream (640x480 @ 7 FPS)/api/reolink/<id>/stream/mjpeg/main → main stream (1280x720 @ 10 FPS requested)reolink_mjpeg_capture_service.py (renamed from /services/)
ReolinkMJPEGCaptureService classunifi_mjpeg_capture_service.py1. app.py - Two New Routes
Sub stream route (line ~788):
@app.route('/api/reolink/<camera_id>/stream/mjpeg')
def api_reolink_stream_mjpeg(camera_id):
mjpeg_snap['sub'] configsnap_type: 'sub'Main stream route (line ~830):
@app.route('/api/reolink/<camera_id>/stream/mjpeg/main')
def api_reolink_stream_mjpeg_main(camera_id):
mjpeg_snap['main'] config{camera_id}_main for separate capture process2. Service Integration
from services.reolink_mjpeg_capture_service import reolink_mjpeg_capture_servicecleanup_handler() (line 1032)3. stream.js - Fullscreen Route Update (line ~578)
if (cameraType === 'reolink') {
mjpegUrl = `/api/reolink/${serial}/stream/mjpeg/main?t=${Date.now()}`;
}
/main endpoint/mjpeg endpoint (sub stream)4. cameras.json - Nested Sub/Main Config
"mjpeg_snap": {
"sub": {
"enabled": true,
"width": 640,
"height": 480,
"fps": 7,
"timeout_ms": 5000
},
"main": {
"enabled": true,
"width": 1280,
"height": 720,
"fps": 10,
"timeout_ms": 8000
}
}
Key Changes:
sub/main objectsProblem: Service expects flat mjpeg_snap config but cameras.json has nested structure
Solution: Routes flatten config before passing to service:
# In app.py routes:
mjpeg_snap = camera.get('mjpeg_snap', {})
sub_config = mjpeg_snap.get('sub', mjpeg_snap) # Fallback for old format
camera_with_sub = camera.copy()
camera_with_sub['mjpeg_snap'] = sub_config
camera_with_sub['mjpeg_snap']['snap_type'] = 'sub'
reolink_mjpeg_capture_service.add_client(camera_id, camera_with_sub, camera_repo)
Width/Height Conditional:
# In reolink_mjpeg_capture_service.py _capture_loop:
snap_params = {
'cmd': 'Snap',
'channel': 0,
'user': capture_info['username'],
'password': capture_info['password']
}
# Only add width/height if specified (sub stream)
if capture_info['width'] and capture_info['height']:
snap_params['width'] = capture_info['width']
snap_params['height'] = capture_info['height']
Why: Initially tried omitting width/height for “native resolution” main stream, but Reolink API requires token-based auth without dimensions. Workaround: Always specify dimensions.
Symptom:
[REOLINK_OFFICE_main] Response too small (146 bytes)
Error: "please login first", rspCode: -6
Root Cause: Reolink Snap API authentication behavior differs based on parameters:
Solution: Always specify width/height dimensions even for main stream instead of implementing token auth.
Problem: Service expected camera['mjpeg_snap'] to be flat dict with width, height, fps, but cameras.json had nested sub/main structure.
Solution: Routes extract and flatten the appropriate config before passing to service. Service remains agnostic to nesting.
camera_with_sub Not DefinedError: NameError: name 'camera_with_sub' is not defined
Cause: Extracted sub_config but forgot to create modified camera dict before calling add_client()
Fix: Added camera copy and config assignment:
camera_with_sub = camera.copy()
camera_with_sub['mjpeg_snap'] = sub_config
camera_with_sub['mjpeg_snap']['snap_type'] = 'sub'
Config: 640x480 @ 7 FPS requested Actual: ~7 FPS achieved Frame Size: ~45 KB per frame Bandwidth: ~315 KB/s (~2.5 Mbps) Latency: ~200-400ms Status: ✅ Works well for grid thumbnails
Config: 1280x720 @ 10 FPS requested Actual: ~1-2 FPS achieved (hardware limitation) Frame Size: ~120-150 KB per frame Bandwidth: ~240 KB/s (~2 Mbps) Latency: ~200-400ms Status: ⚠️ Limited by Reolink Snap API hardware/firmware
Critical Finding: The Reolink Snap API has a hard limit of ~1-2 snapshots per second regardless of requested FPS. This is a hardware/firmware limitation of the snapshot encoding pipeline, separate from the RTSP streaming pipeline.
Testing Attempted:
Conclusion: Snap API not suitable for smooth video playback. Best use cases:
For users requiring smooth fullscreen video, a hybrid approach could be implemented:
Grid mode: MJPEG Snap (sub) - 640x480 @ 1-2 FPS Fullscreen: LL-HLS (main) - 1920x1080 @ 15-30 FPS
This would require modifying stream.js fullscreen logic to detect Reolink cameras and route to HLS instead of MJPEG:
if (streamType === 'MJPEG' && cameraType === 'reolink') {
// Use HLS for Reolink fullscreen (Snap API too slow)
const response = await fetch(`/api/stream/start/${serial}`, {
method: 'POST',
body: JSON.stringify({ type: 'main' })
});
// ... HLS setup
}
Decision: User opted to keep MJPEG for fullscreen at 1-2 FPS, suitable for security monitoring where smooth motion isn’t required.
# Add client (starts capture if first client)
reolink_mjpeg_capture_service.add_client(camera_id, camera_config, camera_repo)
# Remove client (stops capture if last client)
reolink_mjpeg_capture_service.remove_client(camera_id)
# Get latest frame from shared buffer
frame_data = reolink_mjpeg_capture_service.get_latest_frame(camera_id)
def generate():
try:
last_frame_number = -1
while True:
frame_data = service.get_latest_frame(camera_id)
if frame_data and frame_data['frame_number'] != last_frame_number:
yield mjpeg_frame(frame_data['data'])
last_frame_number = frame_data['frame_number']
time.sleep(0.033) # Check rate faster than capture rate
except GeneratorExit:
service.remove_client(camera_id)
# Support both new nested and old flat config structures
mjpeg_snap = camera.get('mjpeg_snap', {})
sub_config = mjpeg_snap.get('sub', mjpeg_snap) # Falls back to flat if no 'sub' key
Implementation: ✅ Complete and working Multi-client prevention: ✅ Verified working Quality switching: ✅ Sub for grid, main for fullscreen Performance: ⚠️ Limited by Snap API hardware (~1-2 FPS max) Stability: ✅ Stable, proper cleanup, no resource leaks
Recommendation: Current MJPEG implementation suitable for security monitoring use case where 1-2 FPS in fullscreen is acceptable. For users requiring smooth fullscreen video, implement hybrid HLS/MJPEG approach.
New:
reolink_mjpeg_capture_service.py (377 lines)Modified:
app.py - Added 2 routes (~75 lines total)stream.js - Updated fullscreen URL (1 line)cameras.json - Migrated to nested sub/main structureTesting Cameras:
Implemented Amcrest camera support with MJPEG streaming:
Backend Components Added:
services/credentials/amcrest_credential_provider.py - Per-camera credentials with generic fallbackservices/amcrest_mjpeg_capture_service.py - Continuous MJPEG stream parser using multipart/x-mixed-replacestreaming/handlers/amcrest_stream_handler.py - RTSP URL builder for Amcrest camerascamera_repository.py - Added get_amcrest_config() methodapp.py - Added /api/amcrest/<camera_id>/stream/mjpeg routes (sub and main)Frontend Updates:
mjpeg-stream.js - Added Amcrest camera type support with correct URL routingstream.js - Updated fullscreen handler to use substream for both grid and fullscreen (camera doesn’t support MJPEG on main stream)Key Implementation Details:
{CAMERA_ID}_USERNAME/PASSWORD with fallback to AMCREST_USERNAME/PASSWORDDiscovered Limitations:
Status: Fully functional. Grid view and fullscreen both working with substream quality.
Implemented comprehensive CSS modularization for better maintainability:
Original Monolithic Files Split:
streams.css (987 lines) → 9 modular componentssettings.css (323 lines) → 2 modular componentsheader_buttons.css (16 lines) → merged into buttons.cssNew Modular Structure Created:
static/css/
├── main.css (49 lines) - Orchestrator with correct cascade order
├── base/
│ └── reset.css (39 lines) - Global reset & body styles
└── components/
├── buttons.css (132 lines) - All button variants + header icon buttons
├── fullscreen.css (74 lines) - Fullscreen modal overlay
├── grid-container.css (54 lines) - Main streams container
├── grid-modes.css (73 lines) - Grid layouts (1-5) & attached mode
├── header.css (161 lines) - Fixed header & collapsible mechanism
├── ptz-controls.css (76 lines) - PTZ directional controls
├── responsive.css (34 lines) - Mobile & tablet media queries
├── settings-controls.css (166 lines) - Setting toggles, inputs, selects
├── settings-overlay.css (239 lines) - Settings modal structure
├── stream-controls.css (70 lines) - Stream control buttons
├── stream-item.css (117 lines) - Individual stream container + video
└── stream-overlay.css (127 lines) - Title, status indicators, loading
Total: 1,411 lines across 14 files (vs 1,326 original lines)
Separation of Concerns:
Key Benefits:
Import Order (Critical for Cascade):
Z-Index Hierarchy Documented:
No Breaking Changes:
<link rel="stylesheet" href="css/main.css">Documentation Created:
CSS_MODULARIZATION_README.md - Complete technical documentationFILE_TREE.txt - Visual structure with line countsProblem: MJPEG streams (Amcrest) didn’t fill the screen in fullscreen mode - constrained to 95% viewport with padding.
Root Cause:
fullscreen.css applied max-width/max-height constraints suitable for HLS video<img> tag (not <video>) due to multipart/x-mixed-replace formatstream.js was setting maxWidth: '95%', maxHeight: '95%', objectFit: 'contain'Solution:
/fullscreen-mjpeg.css with true fullscreen styling:
width: 100vw; height: 100vhobject-fit: cover (fills screen, crops to maintain aspect ratio).mjpeg-active classstream.js:
.mjpeg-active class toggle to overlaycloseFullscreen()main.cssTechnical Notes:
<video> only supports containerized formats (MP4, WebM, HLS)<img> tag for both continuous streams (Amcrest) and snapshot-based streams (Reolink)object-fit: cover chosen over contain to eliminate black barsObjective: Restore PTZ functionality for Amcrest cameras using CGI API.
Architecture:
Created new services/ptz/ directory with brand-specific handlers:
services/ptz/
├── __init__.py
├── amcrest_ptz_handler.py
└── ptz_validator.py (moved from services/)
API Discovery Process: Initial attempt used numeric direction codes (0, 2, 4, 5) - all returned 400 Bad Request.
Key Finding: Amcrest uses STRING-based codes, not numeric:
DIRECTION_CODES = {
'up': 'Up',
'down': 'Down',
'left': 'Left',
'right': 'Right'
}
Working Amcrest PTZ CGI Format:
http://{host}/cgi-bin/ptz.cgi?action=start&channel=0&code=Right&arg1=0&arg2=5&arg3=0
Parameters:
action: start or stopchannel: 0 (default)code: String direction or ‘Right’ (arbitrary for stop)arg1: Vertical speed/steps (0 = default)arg2: Horizontal speed (1-8, 5 = medium) CRITICAL: Must be >0 or camera won’t move!arg3: Reserved/unused (always 0)Authentication: HTTP Digest Auth via requests.HTTPDigestAuth
Backend Integration:
app.py PTZ route to dispatch by camera type:if camera_type == 'amcrest':
success = amcrest_ptz_handler.move_camera(camera_serial, direction, camera_repo)
elif camera_type == 'eufy':
success = eufy_bridge.move_camera(camera_serial, direction, camera_repo)
ptz_validator.py valid_directions listFrontend Integration Challenges:
Issue 1: PTZController not loading
stream.js - ptz-controller.js is in controllers/ subdirectoryimport { PTZController } from '../controllers/ptz-controller.js'Issue 2: Event listeners not firing
setupEventListeners() and debug logging into constructorIssue 3: Stop command not working
this.currentCamera was null - stop returns immediately.closest('.stream-item')Final PTZ Event Flow:
/api/ptz/{serial}/{direction}Testing:
# All return "OK" and camera moves
curl --digest -u "admin:password" "http://192.168.10.34/cgi-bin/ptz.cgi?action=start&channel=0&code=Right&arg1=0&arg2=5&arg3=0"
curl --digest -u "admin:password" "http://192.168.10.34/cgi-bin/ptz.cgi?action=stop&channel=0&code=Right&arg1=0&arg2=0&arg3=0"
Critical Issues:
Next Steps - ONVIF Integration:
Objective: Implement preset support and unified PTZ control via ONVIF protocol
Why ONVIF:
Proposed Architecture:
services/onvif/
├── __init__.py
├── onvif_client.py # Core connection/auth wrapper
├── onvif_discovery.py # Network discovery service
├── onvif_ptz_manager.py # PTZ ops (presets, move, zoom)
└── onvif_capability_detector.py # Feature detection per camera
Library: onvif-zeep (Python 3 compatible ONVIF client)
ONVIF PTZ Operations:
GetPresets(ProfileToken) → List available presetsGotoPreset(ProfileToken, PresetToken, Speed) → Move to presetSetPreset(ProfileToken, PresetName) → Create/update presetContinuousMove(), AbsoluteMove(), RelativeMove() - Movement APIsImplementation Plan:
GET /api/ptz/{camera}/presetsPOST /api/ptz/{camera}/preset/{id}Camera Compatibility Research:
Frontend Enhancements Needed:
Docker Hot-Reload Issues:
./:/app should work but had persistent caching issuesdocker-compose down -v && docker system prune -fPython Output Buffering:
logger.info() doesn’t show immediately in Docker logsprint() with flush=True or PYTHONUNBUFFERED=1 env varPYTHONUNBUFFERED=1 to docker-compose.yml environmentjQuery Event Delegation:
$('.ptz-btn').on() can fail if elements re-rendered$(document).on('event', '.selector', handler)Amcrest API Quirks:
File Organization:
services/{feature}/{brand}_handler.py patternObjective: Add PTZ controls to fullscreen mode so users can control camera movement while viewing fullscreen.
Architecture:
streams.html/static/css/components/fullscreen-ptz.css for overlay stylingstream.js openFullscreen() to show/hide PTZ based on camera capabilitiesKey Files Modified:
streams.html: Added #fullscreen-ptz div with PTZ button gridstatic/css/components/fullscreen-ptz.css: Positioned bottom-right, semi-transparent backgroundstatic/css/main.css: Added import for fullscreen-ptz.cssstatic/js/streaming/stream.js: PTZ visibility logic in openFullscreen()static/js/controllers/ptz-controller.js: Camera detection logic updatedIssue 1: PTZ controls not appearing in fullscreen
Root Cause: getCameraConfig() returns a Promise but wasn’t awaited, so config?.capabilities was undefined.
Solution:
// In stream.js openFullscreen()
const config = await this.getCameraConfig(cameraId); // Added await
const hasPTZ = config?.capabilities?.includes('ptz');
Issue 2: “Camera undefined not found” errors
Root Cause: PTZ event handlers tried to detect camera from .closest('.stream-item'), which doesn’t exist in fullscreen overlay.
Solution: Modified ptz-controller.js setupEventListeners() to only auto-detect camera if this.currentCamera is not already set:
if (!this.currentCamera) {
const $streamItem = $(event.currentTarget).closest('.stream-item');
// ... detect camera from stream-item
}
In fullscreen, camera is set by openFullscreen() before showing controls.
Issue 3: Slow stop response - camera continues moving after button release
Root Cause: mouseup event not firing because button gets disabled during movement.
In updateButtonStates():
const enabled = this.bridgeReady && this.currentCamera && !this.isExecuting;
$('.ptz-btn').prop('disabled', !enabled); // Disables button while isExecuting=true
When user presses button → isExecuting=true → button disabled → mouseup never fires.
Solution: Removed !this.isExecuting check from button disable logic:
updateButtonStates() {
const enabled = this.bridgeReady && this.currentCamera; // Removed !this.isExecuting
$('.ptz-btn').prop('disabled', !enabled);
}
Side benefit: mouseleave event now provides instant stop when user drags mouse away while holding button, improving UX.
PTZ overlay positioned bottom-right with:
rgba(0, 0, 0, 0.7)Working:
Tested Cameras:
Event Handling Pattern:
.stream-item on each button pressopenFullscreen(), reused for all button pressesZ-Index Stack:
CSS Organization: All fullscreen-related CSS in dedicated files:
fullscreen.css - Base overlay and videofullscreen-mjpeg.css - MJPEG-specific stylingfullscreen-ptz.css - PTZ controls overlayThe application had a basic fullscreen overlay system using a separate #fullscreen-overlay div with its own video element. When users entered fullscreen, the video stream would be cloned to this overlay. However, there was no persistence mechanism - fullscreen state was lost on page reload (critical for the 1-hour auto-reload timer).
Approach: Attempted to use browser’s native Fullscreen API (element.requestFullscreen()) with localStorage persistence.
Implementation Steps:
openFullscreen() to use native API instead of overlayrestoreFullscreenFromLocalStorage() to auto-restore after reloadfullscreen-handler.jsBlocker Encountered: Browser security restrictions prevent calling requestFullscreen() without a direct user gesture. Attempted workarounds:
.click() on fullscreen buttonResult: None of the workarounds succeeded. The user gesture context is lost after async operations, and programmatic clicks don’t count as real user gestures. Native fullscreen API fundamentally incompatible with auto-restore requirement.
After multiple failed attempts, user proposed: “We could implement our own fullscreen: have a fullscreen container ready to replace the entire page content”
This insight led to abandoning native browser fullscreen in favor of CSS-based approach.
Architecture:
.css-fullscreen class to target .stream-item
position: fixed; top: 0; left: 0; width: 100vw; height: 100vh; z-index: 9999object-fit: contain to maintain aspect ratio:has() selectorImplementation Phases:
Phase 1: CSS & Core Methods
static/css/components/fullscreen.css for .css-fullscreen stylingopenFullscreen() to add CSS class instead of calling browser APIcloseFullscreen() to remove CSS class and restart stopped streamsPhase 2: Auto-Restore Logic
restoreFullscreenFromLocalStorage() - no user gesture needed!init() after DOM readystartAllStreams() promise completion (problematic)Phase 3: Bug Fixes & Optimization
Bug #1: Multiple Event Handlers (3x Button Clicks)
stream.js bottom also had $(document).ready() creating instance$._data($('#streams-container')[0], 'events') to confirm single handlerBug #2: Exit Then Immediate Re-Entry
.click.fullscreen) and .off() to remove existing handlers before attaching.off() after fixing root cause, added documentation commentsBug #3: Auto-Restore Not Working
init() waited for startAllStreams() to complete via .then() chainstartAllStreams() fires in background (no await in init)restoreFullscreenFromLocalStorage() called after 1-second setTimeout (just needs DOM ready)Phase 4: Cleanup
#fullscreen-overlay HTML structure (138-163 lines in streams.html)stream.jsfullscreen-handler.js camera localStorage logic (separation of concerns)fullscreen-handler.js = page-level fullscreen, stream.js = camera fullscreenPhase 5: Pause/Resume Optimization
Problem Discovered: After implementing CSS fullscreen with stream stop/restart logic, streams failed to restart properly when exiting fullscreen:
HLS fatal error: manifestLoadError with 404 responsesSolution: Pause Instead of Stop Leveraged HLS.js built-in pause/resume API instead of destroy/recreate cycle:
For HLS Streams:
hls.stopLoad() + video.pause() - stops network and decoder, keeps instance alivehls.startLoad() + video.play() - resumes from where it left offFor RTMP Streams:
video.pause() - stops decodervideo.play() - instant resumeFor MJPEG Streams:
img.src, then clear it to stop fetchingimg._pausedSrc to resumeBenefits of Pause/Resume Approach:
Implementation Details:
this.streamsBeforeFullscreen array with this.pausedStreams arraythis.hlsManager.hlsInstances.get(id) mapstartStream() or stopIndividualStream() methodsTesting Results:
Key Insight:
HLS.js already had the perfect API for this use case - stopLoad()/startLoad() - which pauses network activity while keeping the player instance and state intact. The initial stop/restart approach was over-engineered and created unnecessary complexity.
The CSS fullscreen system is now complete and production-ready:
Performance Metrics:
Module Self-Instantiation (Correct Pattern):
// stream.js (bottom of file)
$(document).ready(() => {
new MultiStreamManager();
});
HTML Just Imports (Correct Pattern):
<script type="module" src="{{ url_for('static', filename='js/streaming/stream.js') }}"></script>
Anti-Pattern (DO NOT DO):
<!-- BAD - Creates duplicate instance -->
<script type="module">
import { MultiStreamManager } from '/static/js/streaming/stream.js';
new MultiStreamManager();
</script>
static/js/streaming/stream.js: Complete rewrite of fullscreen methods, init() refactorstatic/css/components/fullscreen.css: Complete rewrite for CSS approachtemplates/streams.html: Removed old overlay HTML, removed duplicate instantiationstatic/js/settings/fullscreen-handler.js: Reverted camera-specific changes$._data(element, 'events') is invaluable for finding duplicate listenersBefore: Fullscreen state lost on reload, requiring manual re-selection every hour. Multiple clicks sometimes needed to exit fullscreen.
After: Seamless fullscreen persistence across reloads. Single-click enter/exit. Significant performance improvement when viewing single camera. Professional app-like experience.
Critical architectural improvement to enable proper multi-user streaming support. Previously, stopping streams involved backend /api/stream/stop/ calls which created fundamental problems:
The correct multi-user architecture:
Additional benefits:
fetch('/api/stream/stop/${cameraId}')hls.stopLoad() + videoEl.pause()fetch('/api/streams/stop-all')stopLoad() + pause()/api/stream/stop/)startStream() (which handles backend start + reattach)// NOT CURRENTLY IN USE comment)Stop Operation Pattern (client-side only):
// HLS streams
hls.stopLoad(); // Stop fetching segments
videoEl.pause(); // Stop video decoder
hls.destroy(); // Cleanup HLS instance
// MJPEG streams
imgEl.src = ''; // Clear source stops fetching
// FLV streams
flvPlayer.destroy(); // Destroys player instance
mjpeg-stream.js - Always been client-side onlyflv-stream.js - Always been client-side onlystream.js - Uses hls.stopLoad() + pause() pattern in fullscreen logic/api/stream/start/static/js/streaming/hls-stream.js (path: /home/elfege/0_NVR/static/js/streaming/hls-stream.js)stream.js fullscreen handler - now applied consistently across all managersCompleted ONVIF protocol integration for PTZ camera control and preset management. Previously relied on vendor-specific CGI APIs (Amcrest) which limited flexibility. ONVIF provides standardized control across camera vendors with full preset support.
1. Camera Selection Bug (Frontend)
if (!this.currentCamera) guard prevented camera updates2. Credential Provider Integration (Backend)
camera_config['username'] directly_get_credentials() with AmcrestCredentialProvider/ReolinkCredentialProvidermove_camera(), get_presets(), goto_preset(), set_preset(), remove_preset()3. WSDL Path Configuration
/etc/onvif/wsdl/ (incorrect)/usr/local/lib/python3.11/site-packages/wsdl/ONVIFClient.WSDL_DIR constant to correct pathno_cache=True parameter to prevent permission errors on /home/appuser writes4. ONVIF Port Configuration
onvif_port field to camera configs with fallback to DEFAULT_PORT = 805. SOAP Type Creation Issues
ptz_service.create_type('PTZSpeed') which failedPTZSpeed, Vector2D, Vector1D are schema types, not service types{'PanTilt': {'x': speed, 'y': speed}}PTZ Request Flow:
Frontend (ptz-controller.js)
↓
Flask API (/api/ptz/<serial>/<direction>)
↓
ONVIF Handler (priority) → Credential Provider → ONVIF Client → Camera
↓ (fallback for Amcrest)
CGI Handler → Credential Provider → HTTP Request → Camera
Vendor-Specific Behavior:
Backend:
services/onvif/onvif_ptz_handler.py - All 5 methods updated for credential providers + dictionary velocityservices/onvif/onvif_client.py - Fixed WSDL_DIR path, added no_cache, reordered parametersapp.py - ONVIF-first routing with CGI fallback for AmcrestFrontend:
static/js/controllers/ptz-controller.js - Fixed camera detection logic in mousedown/mouseup handlersConfig:
config/cameras.json - Added "onvif_port": 8000 for Reolink camerasONVIF vs CGI:
Decision: Keep ONVIF-first for consistency, CGI fallback provides speed when needed
Why Dictionary Approach for SOAP Types:
# ❌ FAILS - Can't create schema types via service
request.Velocity = ptz_service.create_type('PTZSpeed')
# ✅ WORKS - Zeep auto-converts dicts to SOAP types
request.Velocity = {'PanTilt': {'x': 0.5, 'y': 0.5}}
WSDL Location Discovery:
# Find onvif package location
python3 -c "import onvif; print(onvif.__file__)"
# /usr/local/lib/python3.11/site-packages/onvif/__init__.py
# Check default wsdl_dir parameter
python3 -c "from onvif import ONVIFCamera; help(ONVIFCamera.__init__)"
# wsdl_dir='/usr/local/lib/python3.11/site-packages/wsdl'
PTZ controls disappeared in fullscreen mode after CSS fullscreen refactoring. Controls worked in grid view but were hidden when entering fullscreen.
In fullscreen.css, the rule .stream-item.css-fullscreen .stream-controls { display: none !important; } was hiding the entire .stream-controls container, which includes:
.control-row (play/stop/refresh buttons).ptz-controls (PTZ directional buttons and presets)The CSS had proper PTZ positioning rules (lines 82-100) but the parent container was hidden.
Commented out the blanket hide rule in fullscreen.css line 103-105. All controls now remain visible in fullscreen mode:
PTZ controls now work in fullscreen for both HLS and MJPEG streams. Camera control maintained across grid ↔ fullscreen transitions without losing selected camera context.
Comprehensive investigation and fix of UI health monitoring system failures. Health monitor was failing to detect and recover from stale/frozen streams due to multiple critical bugs in the restart and attachment lifecycle. Cameras would get stuck in “Restart failed” state with no automatic recovery, requiring manual user intervention.
1. Inconsistent Naming: serial vs cameraId
Root Cause:
During initial health monitor fixes, parameter name was changed from cameraId to serial in multiple locations, but this was inconsistent with the rest of the codebase which universally uses cameraId as the camera identifier.
// health.js passes 'serial'
await this.opts.onUnhealthy({ serial, reason, metrics });
// stream.js expects 'cameraId'
onUnhealthy: async ({ cameraId, reason, metrics }) => { ... }
// openFullscreen() uses undefined 'serial'
const $streamItem = $(`.stream-item[data-camera-serial="${serial}"]`); // ReferenceError!
Impact:
2. Parameter Name Mismatch in Health Callback (Original Bug)
2. Parameter Name Mismatch in Health Callback (Original Bug)
Root Cause:
// health.js:108 - initially passed just 'serial'
await this.opts.onUnhealthy({ serial, reason, metrics });
// stream.js:47 - expected 'cameraId' but got undefined
onUnhealthy: async ({ cameraId, reason, metrics }) => {
// cameraId was undefined because health.js passed 'serial'
}
serial but callback destructured cameraIdcameraId = undefined in all callback code$(.stream-item[data-camera-serial=”undefined”]) found nothingInitial incorrect fix attempted: Changed callback to use serial everywhere, but this broke other code
Correct fix: Changed health.js to pass { cameraId: serial, ... } so callback receives correct parameter name
3. MJPEG Health Attachment Missing Null Check
// Line ~404 - HLS and RTMP check this.health
} else if (streamType === 'RTMP' && this.health) { ... }
// MJPEG branch missing check
} else if (streamType === 'MJPEG' || streamType === 'mjpeg_proxy') {
el._healthDetach = this.health.attachMjpeg(cameraId, el); // Fails if this.health is null
}
.attachMjpeg() on null when health monitoring disabled4. Health Monitor Never Reattached After Failed Restart
Flow:
Health detects stale → schedules restart
↓
restartStream() called → DETACHES health monitor
↓
forceRefreshStream() throws network error
↓
Catch block sets status to 'Restart failed'
↓
Health monitor NEVER REATTACHES ❌
↓
Camera stuck forever - no more retries possible
startStream() success pathforceRefreshStream() failed in restartStream(), error caught before reattachment5. Health Monitor Never Attached After Initial Startup Failure
startStream() catch block set status to ‘error’ but didn’t attach health6. Health Monitor Not Reattached After Successful Restart
// restartStream() for HLS - line ~503
await this.hlsManager.forceRefreshStream(cameraId, videoElement);
this.setStreamStatus($streamItem, 'live', 'Live');
// Missing: health reattachment!
7. Health Monitors Not Detached During Fullscreen
Root Cause: When entering fullscreen mode, streams are paused (client-side only) but health monitors remain attached:
// openFullscreen() - pauses streams
hls.stopLoad(); // Stop fetching
videoEl.pause(); // Stop decoder
// BUT: Health monitor still sampling frames every 6 seconds!
What happens:
Enter fullscreen → Pause 11 background cameras
↓
6 seconds later: Health detects all 11 as STALE (no new frames)
↓
Health schedules restart for all 11 cameras
↓
Unwanted restart attempts on intentionally paused streams!
↓
Fullscreen camera working fine but system trying to "fix" paused cameras
Impact:
8. Code Duplication for Health Attachment
Health attachment logic repeated in 3 locations (~12 lines each):
startStream() success blockrestartStream() success blockrestartStream() catch block (after fixes)Violated DRY principle, increased maintenance burden.
1. Naming Consistency: cameraId Throughout
health.js fix:
// Changed from passing 'serial' to passing 'cameraId'
await this.opts.onUnhealthy({ cameraId: serial, reason, metrics });
stream.js openFullscreen() fix:
// Changed from undefined 'serial' to 'cameraId'
const $streamItem = $(`.stream-item[data-camera-serial="${cameraId}"]`);
stream.js attachHealthMonitor() fix:
// Changed parameter from 'serial' to 'cameraId'
attachHealthMonitor(cameraId, $streamItem, streamType) {
console.log(`[Health] Monitoring disabled for ${cameraId}`);
// ... all references use 'cameraId'
}
2. Parameter Name Consistency in Health Callback
2. Parameter Name Consistency in Health Callback
Ensured all references in onUnhealthy callback use cameraId consistently (13 total references):
onUnhealthy: async ({ cameraId, reason, metrics }) => {
console.warn(`[Health] Stream unhealthy: ${cameraId}, reason: ${reason}`, metrics);
const $streamItem = $(`.stream-item[data-camera-serial="${cameraId}"]`);
const attempts = this.restartAttempts.get(cameraId) || 0;
// ... all 13 references use 'cameraId'
this.restartAttempts.set(cameraId, attempts + 1);
await this.restartStream(cameraId, $streamItem);
}
Note: The naming convention is cameraId throughout stream.js, while health.js internally uses serial but passes it as cameraId: serial to maintain consistency with the rest of the codebase.
3. MJPEG Null Check Added
} else if ((streamType === 'MJPEG' || streamType === 'mjpeg_proxy') && this.health) {
el._healthDetach = this.health.attachMjpeg(cameraId, el);
}
4. Extracted Reusable attachHealthMonitor() Method
New centralized method for health attachment:
/**
* Attach health monitor to a stream element
* Centralizes health attachment logic to avoid repetition
*/
attachHealthMonitor(serial, $streamItem, streamType) {
if (!this.health) {
console.log(`[Health] Monitoring disabled for ${serial}`);
return;
}
const el = $streamItem.find('.stream-video')[0];
if (!el) {
console.warn(`[Health] No video element found for ${serial}`);
return;
}
console.log(`[Health] Attaching monitor for ${serial} (${streamType})`);
if (streamType === 'HLS' || streamType === 'LL_HLS' || streamType === 'NEOLINK' || streamType === 'NEOLINK_LL_HLS') {
const hls = this.hlsManager?.hlsInstances?.get?.(serial) || null;
el._healthDetach = this.health.attachHls(serial, el, hls);
} else if (streamType === 'RTMP') {
const flv = this.flvManager?.flvInstances?.get?.(serial) || null;
el._healthDetach = this.health.attachRTMP(serial, el, flv);
} else if (streamType === 'MJPEG' || streamType === 'mjpeg_proxy') {
el._healthDetach = this.health.attachMjpeg(serial, el);
}
}
5. Health Reattachment in All Restart Paths
startStream() catch block:
} catch (error) {
$loadingIndicator.hide();
this.setStreamStatus($streamItem, 'error', 'Failed');
this.updateStreamButtons($streamItem, false);
console.error(`Stream start failed for ${cameraId}:`, error);
// Attach health even on initial failure
this.attachHealthMonitor(cameraId, $streamItem, streamType);
}
restartStream() catch block:
} catch (e) {
console.error(`[Restart] ${serial}: Failed`, e);
this.setStreamStatus($streamItem, 'error', 'Restart failed');
// Reattach health even on failure so it can retry
this.attachHealthMonitor(serial, $streamItem, streamType);
}
restartStream() success paths:
// After HLS restart
await this.hlsManager.forceRefreshStream(cameraId, videoElement);
this.setStreamStatus($streamItem, 'live', 'Live');
this.attachHealthMonitor(cameraId, $streamItem, streamType); // NEW
// After RTMP restart
if (ok && el && el.readyState >= 2 && !el.paused) {
this.setStreamStatus($streamItem, 'live', 'Live');
this.attachHealthMonitor(cameraId, $streamItem, streamType); // NEW
}
// MJPEG restart calls startStream() which already attaches health
6. Health Monitor Detach/Reattach During Fullscreen
openFullscreen() - detach health for paused streams:
// After pausing each stream type
if (hls && videoEl) {
hls.stopLoad();
videoEl.pause();
// Detach health monitor for paused stream
if (videoEl._healthDetach) {
videoEl._healthDetach();
delete videoEl._healthDetach;
}
this.pausedStreams.push({ id, type: 'HLS' });
}
// Same pattern for RTMP and MJPEG
closeFullscreen() - reattach health for resumed streams:
// After resuming each stream type
if (hls && videoEl) {
hls.startLoad();
videoEl.play().catch(e => console.log(`Play blocked: ${e}`));
// Reattach health monitor
this.attachHealthMonitor(stream.id, $item, streamType);
}
// Same pattern for RTMP and MJPEG
Benefits:
7. Stream-Specific Restart Methods Extracted
Created dedicated methods for cleaner separation:
async restartHLSStream(cameraId, videoElement)
async restartMJPEGStream(cameraId, $streamItem, cameraType, streamType)
async restartRTMPStream(cameraId, $streamItem, cameraType, streamType)
8. Enhanced Documentation
Added comprehensive JSDoc to restartStream():
/**
* Restart a stream that has become unhealthy or frozen
*
* This method is typically called by the health monitor when a stream is detected
* as stale (no new frames) or displaying a black screen. It handles the complete
* restart lifecycle:
*
* 1. Detaches health monitor to prevent duplicate monitoring during restart
* 2. Dispatches to stream-type-specific restart method (HLS/MJPEG/RTMP)
* 3. Updates UI status to 'live' on success
* 4. Reattaches health monitor (whether success or failure)
*
* The health monitor is ALWAYS reattached after restart (success or failure) to
* ensure continuous monitoring and automatic retry attempts.
*/
9. Configurable Max Restart Attempts
Added: UI_HEALTH_MAX_ATTEMPTS configuration option in cameras.json:
"ui_health_global_settings": {
"UI_HEALTH_MAX_ATTEMPTS": 10 // 0 = infinite (not recommended)
}
Implementation:
const maxAttempts = H.maxAttempts ?? 10; // Default to 10
// Check if max attempts reached (skip check if maxAttempts is 0)
if (maxAttempts > 0 && attempts >= maxAttempts) {
console.error(`[Health] ${cameraId}: Max restart attempts (${maxAttempts}) reached`);
this.setStreamStatus($streamItem, 'failed', `Failed after ${maxAttempts} attempts`);
return;
}
Behavior:
UI_HEALTH_MAX_ATTEMPTS: 10 → Stops after 10 restart attempts (recommended)UI_HEALTH_MAX_ATTEMPTS: 0 → Infinite attempts with ~120s intervals after attempt 5 (60s cooldown + 60s exponential backoff cap)Rationale: Allows operators to choose between eventual failure acknowledgment (safer) vs persistent retry (for cameras with intermittent connectivity). The 0 (infinite) option useful for cameras that experience long outages but eventually recover (e.g., power cycling, network maintenance).
Correct Flow:
Stream starts → Health attaches
↓
Health detects issue → Schedules restart
↓
restartStream() begins → Detaches health (prevent duplicates)
↓
Attempt restart (may succeed or fail)
↓
ALWAYS reattach health (success or failure)
↓
If failed: Health detects again → Next retry with exponential backoff
↓
Continues up to 10 attempts
Key Principle: Health monitor must ALWAYS reattach after restart, regardless of outcome. This ensures continuous monitoring and automatic recovery attempts.
Backend: None (all fixes frontend)
Frontend:
static/js/streaming/health.js - Changed callback parameter from serial to cameraId: serial for consistencystatic/js/streaming/stream.js - All health attachment, restart, and fullscreen logic
serial → cameraId in 3 locations)attachHealthMonitor() methodConfig:
config/cameras.json - Added UI_HEALTH_MAX_ATTEMPTS to ui_health_global_settings (default: 10, 0 = infinite)✅ All streams get health monitoring on startup:
[Health] Attaching monitor for REOLINK_LAUNDRY (LL_HLS)
[Health] Attached monitor for T8416P0023370398
[Health] Attaching monitor for AMCREST_LOBBY (MJPEG)
✅ Health detection working across all stream types:
[Health] T8416P0023370398: STALE - No new frames for 6.0s
[Health] Stream unhealthy: T8416P0023370398, reason: stale
✅ Automatic restart with proper exponential backoff:
[Health] T8416P0023370398: Scheduling restart 1/10 in 5s
[Health] T8416P0023370398: Executing restart attempt 1
[Health] T8416P0023370398: Scheduling restart 2/10 in 10s
[Health] T8416P0023370398: Scheduling restart 3/10 in 20s
✅ Health reattaches after restart (success or failure):
[Restart] T8416P0023370398: Beginning restart sequence
[Health] Detached monitor for T8416P0023370398
[Health] Attaching monitor for T8416P0023370398 (LL_HLS)
[Health] Attached monitor for T8416P0023370398
[Restart] T8416P0023370398: Restart complete
✅ Multiple cameras can restart independently:
[Health] T8441P12242302AC: STALE - No new frames for 6.0s
[Health] Stream unhealthy: T8441P12242302AC, reason: stale
[Health] T8441P12242302AC: Scheduling restart 1/10 in 5s
[Health] T8441P12242302AC: Executing restart attempt 1
[Restart] T8441P12242302AC: Restart complete
✅ Cameras no longer stuck in permanent failure states ✅ MJPEG cameras properly monitored ✅ Initial startup failures get automatic retry ✅ Status updates correctly to “Live” after successful restart ✅ Fullscreen functionality restored (naming consistency fix) ✅ Health monitors properly detach during fullscreen ✅ No false STALE warnings for paused background streams ✅ Health monitors reattach when exiting fullscreen
Reliability Improvements:
Code Quality:
cameraId universally used)User Experience:
Naming Convention: The codebase universally uses cameraId as the camera identifier throughout all modules. This corresponds to the camera’s serial number in most cases, but is consistently referred to as cameraId in code for clarity. The term “serial” should only appear in data attributes (data-camera-serial) and when interfacing with the health.js internal implementation.
Debugging Process: Initial fix attempt incorrectly changed callback parameters to use serial instead of cameraId, which caused ReferenceError: serial is not defined throughout the callback body. The correct solution was to have health.js pass { cameraId: serial, reason, metrics } while keeping all references in stream.js as cameraId. This maintains naming consistency across the codebase.
Hardware Issue Identified: Camera T8416P0023370398 (Kids Room) frequently drops connection despite being 2m from UAP. Suspected hardware defect rather than software issue, as identical models work fine. Camera locked to single UAP in UniFi to prevent roaming issues, but still experiences periodic disconnects requiring power cycle. During testing, this camera required 3 automatic restart attempts before successfully reconnecting, demonstrating the exponential backoff system working correctly (5s, 10s, 20s delays).
No Backend Stop API Calls: Verified UI never makes /api/stream/stop/ calls. All “stop” operations are client-side only (HLS.js stopLoad()/destroy(), MJPEG img.src = '', FLV destroy()). This prevents multiple UI clients from interfering with each other’s streams.
Fullscreen Performance: During fullscreen viewing, only the active camera maintains health monitoring. Background streams are paused and their health monitors detached to conserve resources and prevent false alerts. Health monitors automatically reattach when exiting fullscreen.
Retry Timing Mechanics: Health monitoring uses two separate timing mechanisms: (1) Exponential backoff for scheduled restart delays (5s, 10s, 20s, 40s, 60s max), and (2) 60-second cooldown period after each onUnhealthy trigger. For persistently failed cameras, the combined effect results in ~120-second intervals between restart attempts once exponential backoff reaches the cap (attempt 5+). This prevents overwhelming both the client and backend while still providing reasonable recovery attempts.
Issue #1: Infinite Retry Configuration Not Working
Despite setting UI_HEALTH_MAX_ATTEMPTS: 0 in cameras.json (line 2122) to enable infinite retry attempts, cameras were still showing “Failed after 10 attempts” status. Investigation revealed a configuration mapping gap preventing the setting from reaching the frontend.
Issue #2: Health Monitor Restart Failures vs Manual Success
Health monitor automatic restarts were consistently failing for certain cameras, yet manual refresh (clicking the refresh button) would immediately fix the same streams. This indicated a fundamental difference between the automatic and manual recovery paths that went beyond simple timing issues.
Configuration Issue:
The _ui_health_from_env() function in app.py (lines 1427-1469) was mapping all UI health settings from cameras.json to the frontend EXCEPT UI_HEALTH_MAX_ATTEMPTS:
key_mapping = {
'UI_HEALTH_ENABLED': 'uiHealthEnabled',
'UI_HEALTH_SAMPLE_INTERVAL_MS': 'sampleIntervalMs',
# ... 6 other mappings ...
# ❌ MISSING: 'UI_HEALTH_MAX_ATTEMPTS': 'maxAttempts'
}
Result: Frontend stream.js line 55 always defaulted to 10:
const maxAttempts = H.maxAttempts ?? 10; // Always 10, never 0
Recovery Failure Root Cause:
Through systematic debugging using browser console diagnostics, logs revealed the actual failure sequence:
forceRefreshStream() → calls backend /api/stream/start/T8416P0023370398"Stream already active for T8416P0023370398" (doesn’t verify FFmpeg health)/api/streams/T8416P0023370398/playlist.m3u8manifestLoadError - 404 Not FoundWhy Manual Refresh Works:
Manual refresh clicked later (after multiple failures) works because:
Key Insight: The health monitor was performing identical operations to manual refresh, but the backend’s “already active” check was preventing actual FFmpeg restart. The solution required forcing a client-side “stop” to clear the stale backend state before attempting restart.
Implemented a two-tier recovery system that starts gentle (fast refresh) and escalates to aggressive (nuclear stop+start) based on recent failure history.
Architecture:
Tier 1: Standard Refresh (Attempts 1-3)
forceRefreshStream() pathTier 2: Nuclear Recovery (Attempts 4+)
stopIndividualStream()startStream() forces backend to create new FFmpeg processFailure Tracking Logic:
// Track failures in 60-second sliding window
this.recentFailures = new Map(); // { cameraId: { timestamps: [], lastMethod: null } }
// On each unhealthy detection:
const history = this.recentFailures.get(cameraId) || { timestamps: [], lastMethod: null };
history.timestamps = history.timestamps.filter(t => now - t < 60000); // Clean old
history.timestamps.push(now);
// Escalation decision:
const recentFailureCount = history.timestamps.length;
const method = (recentFailureCount <= 3) ? 'refresh' : 'nuclear';
Recovery Method Selection:
| Failure Count (60s window) | Method | Action | Use Case |
|---|---|---|---|
| 1-3 | refresh |
forceRefreshStream() |
Transient issues |
| 4+ | nuclear |
UI stop → 3s wait → UI start | Stuck backend state |
Success Detection:
Backend Configuration Fix (app.py):
Added UI_HEALTH_MAX_ATTEMPTS to three locations:
settings = {
# ... existing settings ...
'maxAttempts': _get_int("UI_HEALTH_MAX_ATTEMPTS", 10), # NEW
}
key_mapping = {
# ... existing mappings ...
'UI_HEALTH_MAX_ATTEMPTS': 'maxAttempts' # NEW
}
blankAvg and blankStd from cameras.json into frontend-compatible format.Frontend Escalating Recovery (stream.js):
this.recentFailures = new Map(); // Track failure history for escalating recovery
onUnhealthy callback (lines 47-86) with escalation logic:
Nuclear Recovery Sequence:
if (method === 'nuclear') {
console.log(`[Health] ${cameraId}: Nuclear recovery - forcing UI stop+start cycle`);
// Step 1: UI stop (client-side cleanup)
await this.stopIndividualStream(cameraId, $streamItem, cameraType, streamType);
// Step 2: Wait for backend to notice stream is gone
await new Promise(r => setTimeout(r, 3000));
// Step 3: UI start (forces backend to create new FFmpeg)
const success = await this.startStream(cameraId, $streamItem, cameraType, streamType);
if (success) {
// Clear failure history on success
this.recentFailures.delete(cameraId);
this.restartAttempts.delete(cameraId);
}
}
Before:
[Health] T8416P0023370398: Scheduling restart 1/10 in 5s
After:
[Health] T8416P0023370398: Scheduling Refresh restart 1/∞ in 5s (failures in 60s: 1)
[Health] T8416P0023370398: Executing Refresh attempt 1
[Health] T8416P0023370398: Scheduling Nuclear Stop+Start restart 4/∞ in 20s (failures in 60s: 4)
[Health] T8416P0023370398: Nuclear recovery - forcing UI stop+start cycle
[Health] T8416P0023370398: Nuclear restart succeeded
New logging provides:
maxAttempts = 0Backend:
app.py - _ui_health_from_env() function
maxAttempts to default settings dictUI_HEALTH_MAX_ATTEMPTS to key_mappingFrontend:
stream.js - MultiStreamManager constructor
this.recentFailures Map for failure trackingonUnhealthy callback with escalating recovery logicConfig:
cameras.json - Already had UI_HEALTH_MAX_ATTEMPTS: 0 in ui_health_global_settings (line 2122)Test Environment: Camera T8416P0023370398 (Kids Room) - known to have intermittent connection issues
Scenario 1: Configuration Fix Verification
// Browser console
console.log('UI_HEALTH config:', window.UI_HEALTH);
// Result: { maxAttempts: 0, ... } ✅ (previously undefined)
Scenario 2: Standard Refresh Success (Transient Issue)
[Health] T8416P0023370398: STALE - No new frames for 6.0s
[Health] Stream unhealthy: T8416P0023370398, reason: stale
[Health] T8416P0023370398: Scheduling Refresh restart 1/∞ in 5s (failures in 60s: 1)
[Health] T8416P0023370398: Executing Refresh attempt 1
[Restart] T8416P0023370398: Beginning restart sequence
[Restart] T8416P0023370398: Restart complete
✅ Stream recovered via standard refresh
Scenario 3: Nuclear Recovery Activation (Backend Stuck State)
[Health] T8416P0023370398: STALE - No new frames for 6.0s
[Health] T8416P0023370398: Scheduling Refresh restart 1/∞ in 5s (failures in 60s: 1)
[Health] T8416P0023370398: Executing Refresh attempt 1
HLS fatal error: manifestLoadError (404)
[Restart] T8416P0023370398: Failed
[Health] T8416P0023370398: STALE - No new frames for 6.0s
[Health] T8416P0023370398: Scheduling Refresh restart 2/∞ in 10s (failures in 60s: 2)
[Restart] T8416P0023370398: Failed
[Health] T8416P0023370398: Scheduling Refresh restart 3/∞ in 20s (failures in 60s: 3)
[Restart] T8416P0023370398: Failed
[Health] T8416P0023370398: Scheduling Nuclear Stop+Start restart 4/∞ in 40s (failures in 60s: 4)
[Health] T8416P0023370398: Executing Nuclear Stop+Start attempt 4
[Health] T8416P0023370398: Nuclear recovery - forcing UI stop+start cycle
unified-nvr | Nuclear cleanup for T8416P0023370398 - killing all FFmpeg processes
nvr-packager | [HLS] [muxer T8416P0023370398] created automatically
[Health] T8416P0023370398: Nuclear restart succeeded
✅ Stream recovered via nuclear recovery after 3 refresh failures
Scenario 4: Manual Refresh Comparison
forceRefreshStream())Video Element State Diagnostics:
Frozen stream showing “Stopped” status revealed:
paused: false
readyState: 2 (HAVE_ENOUGH_DATA)
networkState: 2 (LOADING)
currentTime: 90.971284 (advancing)
This disconnect between video element state (“I’m playing!”) and actual frozen frame confirmed the issue was backend FFmpeg death, not frontend player state.
Reliability Improvements:
maxAttempts: 0 honored)Diagnostic Improvements:
User Experience:
Current Limitations:
Backend “Already Active” Check: Backend /api/stream/start/ still doesn’t verify FFmpeg health before returning “already active”. Relies on nuclear recovery to force restart.
Escalation Timer: 60-second window for failure tracking is hardcoded. Could be configurable.
Nuclear Recovery Delay: 3-second wait between stop and start is arbitrary. Could be optimized based on backend cleanup time.
No FFmpeg Health Endpoint: Frontend has no way to query if backend FFmpeg is actually running/healthy. Relies on HLS 404 errors as proxy.
Potential Future Enhancements:
/api/stream/start/Configurable Escalation:
"ui_health_global_settings": {
"UI_HEALTH_ESCALATION_THRESHOLD": 3, // Attempts before nuclear
"UI_HEALTH_FAILURE_WINDOW_MS": 60000, // Sliding window
"UI_HEALTH_NUCLEAR_DELAY_MS": 3000 // Stop→Start gap
}
GET /api/stream/health/{camera_id} returns FFmpeg statusInvestigation Process:
Initial hypothesis: Manual refresh provides autoplay permission (user gesture) → REJECTED (both paths identical)
Second hypothesis: Double-restart (Stop+Play+Refresh) gives backend time → REJECTED (timing already handled)
Third hypothesis: Video element in bad state after failed restart → REJECTED (element reported healthy state)
Fourth hypothesis: Manual refresh resets element state differently → REJECTED (same forceRefreshStream() code)
Final hypothesis (CORRECT): Backend returns “already active” for dead FFmpeg → Health restart gets 404 → Manual Play forces new FFmpeg
Key Insight: The problem was not frontend code differences but backend state management. Health monitor couldn’t force backend to recognize FFmpeg was dead. Solution required client-side “stop” to clear backend tracking before attempting restart.
Hypothetico-Deductive Method Applied:
Camera T8416P0023370398 Ongoing Issues:
This camera (Kids Room) continues to exhibit hardware/network instability:
The escalating recovery strategy successfully handles this camera’s intermittent failures, proving the system works for real-world problematic hardware.
Why UI Can’t Call Backend Stop:
As documented earlier (line 11186), UI deliberately avoids /api/stream/stop/ calls. This is critical for multi-client architecture - multiple browsers viewing the same camera must not interfere with each other.
The nuclear recovery’s “stop” is client-side only (destroys HLS.js, clears video src), then the subsequent “start” forces backend to create new FFmpeg because the client no longer appears to be consuming the stream.
Backend Watchdog Interaction:
Backend has a watchdog process that monitors FFmpeg health, but timing is inconsistent. Sometimes it catches dead processes before health monitor triggers, sometimes after. The nuclear recovery complements (not replaces) backend watchdog by providing frontend-initiated forced restart capability.
Stream State Synchronization:
Frontend State: Backend State: MediaMTX State:
video.playing --> FFmpeg running --> HLS segments
| | |
v v v
Health detects "already active" No new segments
frozen frame (stale tracking) (FFmpeg dead)
| | |
v v v
Refresh fails <-- Returns success <-- 404 on playlist
|
v
Nuclear stop clears frontend state
|
v
Nuclear start forces backend cleanup
|
v
Backend kills dead FFmpeg, starts fresh
|
v
Success
The disconnect between “already active” backend state and actual FFmpeg death required the nuclear recovery’s explicit state clearing to force backend to recognize the problem.
Objective: Implement camera recording settings modal and manual recording controls.
Status: Partially Complete
1. Recording Settings Modal (COMPLETE)
Files created:
static/css/components/recording-modal.css - Professional modal stylingstatic/js/controllers/recording-controller.js - API clientstatic/js/forms/recording-settings-form.js - Form generation with validationstatic/js/modals/camera-settings-modal.js - Modal orchestrationFunctionality:
config/recording_settings.json/api/cameras/<id> to fetch camera capabilities dynamically2. Manual Recording Controls (WORKS FOR RTSP)
Files created:
static/js/controllers/recording-controls.js - Recording button logicFunctionality:
Backend method added:
RecordingService.start_manual_recording() - Separate from motion recording3. Flask API Routes (COMPLETE)
Added to app.py:
GET/POST /api/recording/settings/<camera_id> - Get/update settingsPOST /api/recording/start/<camera_id> - Start manual recordingPOST /api/recording/stop/<recording_id> - Stop recordingGET /api/recording/active - List active recordings4. Configuration Methods Added
Added to config/recording_config_loader.py:
get_camera_settings() - Returns UI-friendly merged settingsupdate_camera_settings() - Saves camera-specific overridesCritical Issues:
recording_source: mjpeg_service fail to recordrecording_config_loader.py._resolve_auto_source():
start_manual_recording() uses ‘motion’ as temporary workaroundRecordingService.start_continuous_recording() method created but not integratedRecording Type Hierarchy:
manual - User-initiated via UI button (no settings check)motion - Event-triggered by ONVIF/FFmpeg (checks motion_recording.enabled)continuous - 24/7 recording (checks continuous_recording.enabled)snapshot - Periodic JPEG capture (checks snapshots.enabled)Recording Source Resolution:
Settings Storage:
recording_settings.jsonProblem: Multiple implementation errors requiring fixes:
start_manual_recording() methodwindow.CAMERAS_DATA variableRoot Cause: Code written without reading existing implementations first
RULE VIOLATION: Failed to follow RULE 7 (read files before modifying)
Lesson: Always use view tool to read actual method signatures, class init parameters, and supported values before writing integration code.
Working:
Not Working:
Evidence:
# Settings saved but no recordings created
ls -l /mnt/sdc/NVR_Recent/continuous # Empty
ls -l /mnt/sdc/NVR_Recent/snapshots # Empty
# Manual recordings work (when source is RTSP/MediaMTX)
ls -l /mnt/sdc/NVR_Recent/motion
# Shows files: AMCREST_LOBBY_20251115_065939.mp4 etc
High Priority (Required for MVP):
/mnt/sdc/NVR_Recent/manual directorygenerate_recording_path() methodactive_recordings before starting new recording_start_mjpeg_recording() implementationstart_continuous_recording() for enabled camerasMedium Priority:
onvif_client.py for event subscriptionstart_motion_recording() on eventsCode Files (7):
recording-modal.cssrecording-controller.jsrecording-controls.jsrecording-settings-form.jscamera-settings-modal.jsonvif_event_listener.py (skeleton)ffmpeg_motion_detector.py (skeleton)Documentation (4):
Manual Edits Required (3 files):
templates/streams.html - Buttons, modal HTML, script importsapp.py - Imports, initialization, API routesconfig/recording_config_loader.py - Two new methodsMethods Added to RecordingService:
start_manual_recording() - User-initiated recordingstart_continuous_recording() - 24/7 recording (needs auto-start integration)Phew…
Objective: Enable simultaneous sub-stream (grid) and main-stream (fullscreen) support per camera.
Status: Partially Complete - Fullscreen works with proper resolution, but some cameras fail to load streams.
Original Issue:
Root Cause:
StreamManager.active_streams used camera_serial as key, allowing only one stream per camera:
self.active_streams[camera_serial] = {...} # "T8416P6024350412" → one stream only
New Composite Key System:
Implemented centralized key management in StreamManager using composite keys:
# Key format: "camera_serial:stream_type"
# Examples: "T8416P6024350412:sub", "T8416P6024350412:main"
def _make_key(self, camera_serial: str, stream_type: str = 'sub') -> str:
return f"{camera_serial}:{stream_type}"
def _get_stream(self, camera_serial: str, stream_type: str = 'sub') -> Optional[dict]:
key = self._make_key(camera_serial, stream_type)
return self.active_streams.get(key)
def _set_stream(self, camera_serial: str, stream_type: str, info: dict) -> None:
key = self._make_key(camera_serial, stream_type)
self.active_streams[key] = info
def _remove_stream(self, camera_serial: str, stream_type: str = 'sub') -> Optional[dict]:
key = self._make_key(camera_serial, stream_type)
return self.active_streams.pop(key, None)
def _get_camera_streams(self, camera_serial: str) -> List[Tuple[str, dict]]:
"""Get all streams (both sub and main) for a camera"""
# Returns list of (stream_type, info) tuples
Benefits:
_make_key() only)stream_type parameter1. streaming/stream_manager.py (COMPLETE REFACTOR)
Key changes:
start_stream() to accept stream_type parameterstop_stream() to accept stream_type parameter_start_stream() to use composite keys throughoutis_stream_healthy() to accept stream_type parameteris_stream_alive() to accept stream_type parameterget_stream_url() to accept stream_type parameterget_active_streams() to return composite keysactive_streams[camera_serial] replaced with helper calls2. streaming/handlers/eufy_stream_handler.py
Updated:
stream_type: str = 'sub' to _build_ll_hls_publish()stream_type to build_ll_hls_output_publish_params()3. streaming/handlers/reolink_stream_handler.py
Updated:
stream_type: str = 'sub' to _build_ll_hls_publish()stream_type to build_ll_hls_output_publish_params()4. streaming/handlers/unifi_stream_handler.py
Updated:
stream_type: str = 'sub' to _build_ll_hls_publish()stream_type to build_ll_hls_output_publish_params()5. streaming/handlers/amcrest_stream_handler.py
No changes needed - doesn’t use LL_HLS publishing path.
Working:
stream_type='main'Broken:
Evidence from logs:
# Working cameras show proper stream type propagation:
INFO:streaming.stream_manager:Started LL-HLS publisher for Living Room (sub)
INFO:streaming.stream_manager:Started LL-HLS publisher for Kids Room (sub)
INFO:streaming.stream_manager:Started LL-HLS publisher for LAUNDRY ROOM (sub)
# But several cameras stuck loading with no error messages
1. Incomplete Handler Updates (SUSPECTED)
Some handlers may not properly propagate stream_type through the entire pipeline:
build_ll_hls_output_publish_params() function signaturebuild_rtsp_output_params() function signatureffmpeg_params.pyInvestigation needed: Check streaming/ffmpeg_params.py for:
grep -n "def build_ll_hls_output_publish_params" ~/0_NVR/streaming/ffmpeg_params.py
grep -n "def build_rtsp_output_params" ~/0_NVR/streaming/ffmpeg_params.py
Verify these functions accept and use stream_type parameter.
2. Missing Stream Type in Some Code Paths
Possible locations where stream_type might not be passed:
_wait_for_playlist() - may need stream_type for composite key lookupget_stream_url() - may return wrong URL format3. Frontend-Backend Stream Type Mismatch
Frontend might be requesting wrong stream type or not properly specifying it:
stream.js fullscreen code for stream type parameter/api/stream/start/<camera_id>?stream_type=main endpointImmediate Investigation Required:
Check Backend Logs for Specific Cameras Failing:
docker logs unified-nvr --tail 200 | grep -E "ERROR|Exception|Failed|<failing_camera_name>"
Verify ffmpeg_params.py Functions Accept stream_type:
view ~/0_NVR/streaming/ffmpeg_params.py
Look for:
build_ll_hls_output_publish_params(camera_config, stream_type, vendor_prefix)build_rtsp_output_params(stream_type, camera_config, vendor_prefix)If missing stream_type parameter, add it and update function body to use it.
/api/stream/start/<camera_id> requestVerify app.py Route Handles stream_type:
grep -A 10 "def start_stream" ~/0_NVR/app.py
Ensure Flask route extracts stream_type from request and passes to stream_manager.start_stream()
Test Individual Camera Startup:
# In container, check if FFmpeg commands are actually running
docker exec unified-nvr ps aux | grep ffmpeg | grep <failing_camera_serial>
If ffmpeg_params.py Missing stream_type Support:
Update these functions to accept and use the parameter:
def build_ll_hls_output_publish_params(
camera_config: Dict,
stream_type: str = 'sub', # ← Add this
vendor_prefix: str = "eufy"
) -> List[str]:
# Inside function, select resolution based on stream_type:
if stream_type == 'main':
resolution = camera_config.get('resolution_main', '1280x720')
else:
resolution = camera_config.get('resolution_sub', '320x240')
# ... rest of function
If app.py Route Missing stream_type Handling:
Update Flask route:
@app.route('/api/stream/start/<camera_id>', methods=['POST'])
def start_stream(camera_id):
stream_type = request.args.get('stream_type', 'sub') # ← Add this
url = stream_manager.start_stream(camera_id, stream_type=stream_type)
# ... rest of route
Once Fixes Applied:
ps aux | grep ffmpeg shows TWO processes for that cameraWatchdog Behavior:
sub streams (grid view)Storage Manager Interaction:
camera_serial without stream_typeMediaMTX Path Naming:
/hls/<camera_serial>/index.m3u8/hls/<camera_serial>_main/index.m3u8 and /hls/<camera_serial>_sub/index.m3u8What Went Wrong:
What Went Right:
Corrective Actions:
High Priority:
streaming/ffmpeg_params.py - Verify stream_type propagationapp.py - Check Flask route extracts stream_type from requestsstatic/js/stream.js - Verify frontend passes stream_type parameterMedium Priority:
streaming/handlers/*_stream_handler.py - Verify all use stream_type correctlyContainer Status: Running with refactored code Cameras Working: ~60% (exact count TBD from user screenshot analysis) Cameras Broken: ~40% (black screens, no error messages visible) Backend Health: Services running, no crashes Frontend Health: UI functional, health monitor active
Critical Files Locations:
~/0_NVR/streaming/stream_manager.py~/0_NVR/streaming/ffmpeg_params.py~/0_NVR/app.py~/0_NVR/static/js/stream.js~/0_NVR/streaming/handlers/*_stream_handler.pyQuick Recovery If Total Failure:
# Restore from backup (if available)
cp ~/0_NVR/streaming/stream_manager.py.backup ~/0_NVR/streaming/stream_manager.py
./deploy.sh
# Or revert handlers:
git checkout streaming/handlers/eufy_stream_handler.py
git checkout streaming/handlers/reolink_stream_handler.py
git checkout streaming/handlers/unifi_stream_handler.py
Continued debugging from Nov 22-23 sessions. Multiple LL_HLS cameras (HALLWAY, STAIRS, OFFICE KITCHEN, Terrace Shed, Kids Room) showing black screens despite FFmpeg processes running successfully.
Initial Finding - Audio Buffer Error: Browser console showed:
HLS fatal error: {type: 'mediaError', parent: 'audio', details: 'bufferAppendError', sourceBufferName: 'audio'}
User had enabled "audio": { "enabled": true } in cameras.json. Disabled audio for all cameras.
Second Finding - Video Buffer Error: After disabling audio, error shifted:
HLS fatal error: {type: 'mediaError', parent: 'main', details: 'bufferAppendError', sourceBufferName: 'video'}
Key Observations:
ps aux confirmed PID active)ERROR:streaming.stream_manager:No process handler for HALLWAY appearingThe composite key refactoring (camera_serial:stream_type) touched 7+ interconnected files:
streaming/stream_manager.py - Core key managementstreaming/ffmpeg_params.py - Resolution parameter handlingstreaming/handlers/eufy_stream_handler.pystreaming/handlers/reolink_stream_handler.pystreaming/handlers/unifi_stream_handler.pystatic/js/streaming/hls-stream.jsstatic/js/streaming/stream.jsThe key format change needed to propagate consistently through every handoff point in the data flow:
Frontend request → app.py → stream_manager → handler → ffmpeg_params → MediaMTX → back to frontend
Treating symptoms in isolation (health checks, key lookups, etc.) failed to address the systemic mismatch across all touchpoints.
Decision: Revert all streaming-related files to pre-refactoring state.
Revert Commit: 7333d12 (Nov 15, 2025)
Command Used:
git checkout 7333d12 -- streaming/stream_manager.py streaming/ffmpeg_params.py streaming/handlers/eufy_stream_handler.py streaming/handlers/reolink_stream_handler.py streaming/handlers/unifi_stream_handler.py static/js/streaming/hls-stream.js static/js/streaming/stream.js
New Branch: NOV_21_RETRIEVAL_on_nov_24_after_fucked_up_refactor_for_sub_and_main
Grid-view sub-resolution and fullscreen main-resolution will need a different architectural approach. The composite key pattern itself is sound, but implementation requires:
TBD: Alternative architecture for dual-stream support.
streaming/stream_manager.pystreaming/ffmpeg_params.pystreaming/handlers/eufy_stream_handler.pystreaming/handlers/reolink_stream_handler.pystreaming/handlers/unifi_stream_handler.pystatic/js/streaming/hls-stream.jsstatic/js/streaming/stream.js