Server Monitor Dashboard

⚡ Quick Health Check

Primary

Get instant health status across all portals or specific ones

Monitor all portals

Specific Portals (comma-separated portal names)

Results Limit

Error Threshold

ℹ️ What this does

Command: python manage.py manual_health health
Computes pages fetched, errors, parse rates, and field fill rates
Error threshold determines healthy/unhealthy status
Returns ranked list sorted by error rate
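
The ranking described above can be sketched in a few lines. This is only an illustration, not the command's actual code: the `PortalStats` class, the `rank_portals` function, the worst-first sort order, and the 15% default threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PortalStats:
    """Hypothetical per-portal counters; not the real model."""
    name: str
    pages_fetched: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Portals with no fetched pages get a rate of 0 to avoid dividing by zero.
        return self.errors / self.pages_fetched if self.pages_fetched else 0.0


def rank_portals(stats, error_threshold=0.15, limit=None):
    """Return (name, error_rate, status) tuples, highest error rate first.

    Assumption: a portal is unhealthy when its error rate exceeds the threshold.
    """
    ranked = sorted(stats, key=lambda s: s.error_rate, reverse=True)
    rows = [
        (s.name, s.error_rate,
         "unhealthy" if s.error_rate > error_threshold else "healthy")
        for s in ranked
    ]
    return rows if limit is None else rows[:limit]
```

Here `limit` plays the role of the Results Limit field and `error_threshold` the role of the Error Threshold field.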

🔍 Full Portal Check

Detailed

Deep diagnostics: network, parser, and freshness health for a single portal

Portal Name

Lookback Window (hours): analyze data from the last N hours

ℹ️ Diagnostic Layers

Network: Fetch health, empty HTML, duplicates, error streaks
Parser: Parse rate, field completeness, anomalies
Freshness: Processing lag, backlog, throughput
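
As an illustration of the network layer, the sketch below computes the share of empty and duplicate HTML among a batch of fetched pages. The function name and the definition of "duplicate" (a page whose content hash was already seen earlier in the batch) are assumptions for this example; the real check may define them differently.

```python
import hashlib


def network_ratios(html_pages):
    """Return empty and duplicate HTML percentages for a list of HTML strings."""
    total = len(html_pages)
    if total == 0:
        return {"empty_pct": 0.0, "duplicate_pct": 0.0}

    empty = 0
    duplicates = 0
    seen = set()
    for html in html_pages:
        # Treat None, "" and whitespace-only bodies as empty HTML.
        if not html or not html.strip():
            empty += 1
            continue
        digest = hashlib.sha256(html.encode("utf-8")).hexdigest()
        if digest in seen:
            duplicates += 1  # same content as an earlier page in the batch
        seen.add(digest)

    return {
        "empty_pct": 100 * empty / total,
        "duplicate_pct": 100 * duplicates / total,
    }
```

A very high duplicate share usually means every request is receiving the same captcha or block page.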

🔧 Debug Dump

Advanced

Export HTML, selectors, and error bundles for investigation

Dump all portals

Specific Portals

Only unparsed pages: ❌ No ad extracted
Only pages with errors: ⚠️ Has error records
Check selector hits (slower): 🐌 Tests all selectors

Page Limit (⚠️ Higher = Slower): default is 10 pages. With check-selectors ON, expect ~20-30 seconds per page.

Output Directory (relative to project root): creates extractly/debug_dumps/

ℹ️ Bundle Contents & Filter Explanation

Each bundle contains:

page.html - Raw or sliced HTML content
selectors.json - Portal selector configuration
info.json - Metadata, errors, selector hit counts

Filter Options - Real Scenarios:

❌ Only unparsed pages (has_ad_manual = false)

When to use: Parser success rate is LOW (e.g., 1000 pages → 10 ads = 1% success)
What it shows: The 990 FAILED pages where parser ran but created NO ad
Use case: "Why did 990 pages fail? Missing required fields? Broken selectors? Site structure changed?"

⚠️ Only pages with errors (NetworkPageError records exist)

When to use: Error rate is HIGH in logs (lots of 404s, timeouts, exceptions)
What it shows: ONLY pages with logged errors (network failures, parse crashes)
Use case: "What's causing errors? Same pages failing? One portal broken? Need to fix scraper logic?"

🐌 Check selector hits (SLOW): tests all selectors with BeautifulSoup

When to use: Parser creates ads but fields are EMPTY (lat:0, lon:0, price:null)
What it shows: Hit count for EVERY selector (gas:1, lat:0, lon:0, price:1...)
Use case: "Which selectors work vs broken? lat/lon selectors broken? Need to fix CSS selectors?"
⚠️ Speed: 20-30 seconds per page (10 pages ≈ 3-5 minutes)
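
To budget time before raising the page limit, the per-page figure can be turned into a rough range. This is a back-of-the-envelope sketch; the function name is hypothetical and the estimate ignores any fixed startup or export overhead.

```python
def estimate_dump_seconds(pages, per_page_range=(20, 30)):
    """Return a (low, high) estimate in seconds for a check-selectors run."""
    low, high = per_page_range
    return pages * low, pages * high


low, high = estimate_dump_seconds(10)
print(f"10 pages: {low}-{high} s")  # 200-300 s, i.e. about 3.5 to 5 minutes
```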

💡 Pro Tip: Combine filters! Check "unparsed + check selectors" to see exactly which selectors fail on failed pages.
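
How the two page filters might combine can be sketched as follows. The `Page` class and `select_pages` function are hypothetical stand-ins for the real models, and the sketch assumes that enabling both filters narrows the selection (logical AND) rather than widening it.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a fetched page record."""
    url: str
    has_ad_manual: bool  # False when the parser ran but produced no ad
    error_count: int     # number of NetworkPageError records for this page


def select_pages(pages, only_unparsed=False, only_errors=False, limit=10):
    """Pick up to `limit` pages that pass every enabled filter."""
    selected = []
    for page in pages:
        if only_unparsed and page.has_ad_manual:
            continue  # page produced an ad, so it is not "unparsed"
        if only_errors and page.error_count == 0:
            continue  # page has no logged errors
        selected.append(page)
        if len(selected) >= limit:
            break
    return selected
```

Selector checking is not a filter in this sketch; it would run afterwards on whichever pages were selected, which is why narrowing the selection first keeps the slow step short.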

✅ Selector Linter

Validation

Validate selector configurations against model schema

Lint all portals

Specific Portals

ℹ️ What gets validated

Unknown field paths (not in AdsManual model)
Invalid configuration properties
Reports issues per portal for quick fixes
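
A minimal sketch of this kind of validation is shown below. The shape of the selector configuration, the `KNOWN_FIELDS` subset, and the `ALLOWED_PROPERTIES` set are all assumptions made for illustration; the real linter checks against the actual AdsManual model.

```python
# Hypothetical subset of AdsManual field names (taken from the examples on this page).
KNOWN_FIELDS = {"price", "lat", "lon", "gas"}

# Hypothetical set of properties a selector entry may carry.
ALLOWED_PROPERTIES = {"selector", "attr", "required"}


def lint_selectors(config):
    """Return a list of human-readable issues for one portal's selector config.

    `config` is assumed to map field name -> dict of selector properties.
    """
    issues = []
    for field, props in config.items():
        if field not in KNOWN_FIELDS:
            issues.append(f"unknown field path: {field}")
        for key in props:
            if key not in ALLOWED_PROPERTIES:
                issues.append(f"{field}: invalid property '{key}'")
    return issues
```

An empty list means the portal's configuration passed both checks.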

📚 System Documentation

Health Check Thresholds

Network Layer

Empty HTML: <10% healthy, >20% warning
Duplicate HTML: >80% critical (captcha/block)
Error streak: <10 healthy, >5 warning

Parser Layer

Parse rate: >70% healthy, <50% critical
Error rate: <15% healthy, >30% critical
Critical fields: >75% healthy, <60% warning

Freshness Layer

Lag: <4h healthy, >4h critical
Backlog: <1000 healthy, >1000 warning
Throughput: >5 pages/hour expected
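
Some of the thresholds above can be written as small classification helpers. The function names are hypothetical, and two details are assumptions: the 50-70% parse-rate band, which the thresholds leave unnamed, is labeled "borderline" here, and values exactly on a boundary (4 hours, 1000 pages) fall on the unhealthy side.

```python
def classify_parse_rate(parse_rate_pct):
    """Parse rate: >70% healthy, <50% critical."""
    if parse_rate_pct > 70:
        return "healthy"
    if parse_rate_pct < 50:
        return "critical"
    return "borderline"  # assumption: the 50-70% band has no documented label


def classify_lag(lag_hours):
    """Lag: <4h healthy, otherwise critical (exactly 4h treated as critical)."""
    return "healthy" if lag_hours < 4 else "critical"


def classify_backlog(backlog_pages):
    """Backlog: <1000 healthy, otherwise warning (exactly 1000 treated as warning)."""
    return "healthy" if backlog_pages < 1000 else "warning"
```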

CLI Commands

python manage.py manual_health health --all

Quick health snapshot for all portals

python manage.py manual_health run --portal otodom

Full diagnostics for a specific portal

python manage.py manual_health debug-dump --name otodom --check-selectors

Create debug bundles with selector validation

python manage.py manual_health lint --all

Validate selector configurations
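
To run these commands from a script (for example in a scheduled job), the argument lists can be built programmatically. This is a sketch only: `build_command` is a hypothetical helper, it uses just the subcommands and flags listed above, and it assumes the script runs from the project root where manage.py lives.

```python
import sys


def build_command(subcommand, *args):
    """Build the argv list for a manual_health subcommand."""
    return [sys.executable, "manage.py", "manual_health", subcommand, *args]


# The four commands listed above, as argument lists.
COMMANDS = {
    "health": build_command("health", "--all"),
    "run": build_command("run", "--portal", "otodom"),
    "debug-dump": build_command("debug-dump", "--name", "otodom", "--check-selectors"),
    "lint": build_command("lint", "--all"),
}
```

Each list can then be passed to `subprocess.run(cmd, check=True)`, which raises an exception if the command exits with a non-zero status.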

Data Models

SourceManual: Portal configuration and selectors
NetworkMonitoredPage: Fetched pages with HTML content
NetworkPageError: Network and page-level errors
AdsManual: Parsed advertisement data

HOUSLY
