69 Commits

Author SHA1 Message Date
Rene Fichtmueller
f1fe96132f fix: version strings all updated to v0.6.9 (masthead, footer, terminal) 2026-04-04 23:48:30 +02:00
Rene Fichtmueller
f6168f1329 feat: resilience score, route leak detection, data provenance, MCP server
- Resilience Score (1-10): weighted 4-factor model (transit diversity 30%,
  peering breadth 25%, IXP presence 20%, path redundancy 25%), hard cap at
  5.0 on single transit provider. Confidence: HIGH (cross-validated data).
- Route Leak Detection: heuristic Tier-1 sandwich/downstream pattern check.
  Confidence: MEDIUM — pattern-based, not real-time, false positives flagged.
- Data Provenance System: every API response field includes source, validation
  method and confidence level. UI shows green/orange provenance badges.
- MCP Server: exposes PeerCortex as Claude Desktop/Code tools (lookup_asn,
  compare_networks, get_health_report, search_network, get_resilience_score).
2026-04-04 23:46:36 +02:00
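Note on f6168f1329: the weighted 4-factor model with its single-transit cap can be illustrated with a minimal Python sketch. Only the four weights and the 5.0 cap come from the commit message; the function name, the input shape, and the assumption that each factor is already scored on a 1-10 scale are hypothetical and are not taken from the PeerCortex code.

```python
# Weights and cap are quoted from the commit message; everything else is assumed.
WEIGHTS = {
    "transit_diversity": 0.30,
    "peering_breadth": 0.25,
    "ixp_presence": 0.20,
    "path_redundancy": 0.25,
}
SINGLE_TRANSIT_CAP = 5.0


def resilience_score(factors: dict, transit_providers: int) -> float:
    """Combine four sub-scores (assumed to be on a 1-10 scale) into one score.

    `factors` maps each key of WEIGHTS to its sub-score (hypothetical shape).
    A network with exactly one transit provider is capped at 5.0.
    """
    score = sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)
    if transit_providers == 1:
        score = min(score, SINGLE_TRANSIT_CAP)
    return round(score, 1)
```

With this sketch, a network that scores well on every factor still cannot exceed 5.0 while it depends on a single transit provider.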
Rene Fichtmueller
a5335257a7 docs: add quality audit results + daily audit cron to v0.6.8 changelog 2026-04-03 01:57:50 +02:00
Rene Fichtmueller
9038e280fa fix: bgp.he.net name+country fallback for unregistered ASNs
For ASNs with no PeeringDB entry and no RIPE Stat holder (e.g. reserved
or unannounced ASNs), extract name from bgp.he.net page title and
country code from the /country/XX href. Eliminates the last 2 CRITICAL
audit failures (AS34465 → 'RIPE NCC ASN block'/GB, AS59947 → 'LLHOST
INC. SRL'/RO). Audit result: 80/82 PERFECT, 0 CRITICAL. v0.6.8.
2026-04-03 01:42:56 +02:00
Rene Fichtmueller
9012d2931f fix: RIR+Country empty (RIPE Stat .location field), RDAP parallel race (v0.6.7) 2026-04-02 23:08:54 +00:00
Rene Fichtmueller
9be247410c fix: IXP picker wrong data path + move Facilities card + IX capacity stat 2026-04-02 21:50:12 +00:00
Rene Fichtmueller
d417aa46c6 chore: gitignore runtime caches and large files 2026-04-02 21:40:35 +00:00
Rene Fichtmueller
32bb279c1d feat: add RS column, contacts, timing panel, JSON export, city (v0.6.6) 2026-04-02 21:39:28 +00:00
Rene Fichtmueller
6fb0eb86af feat: add local PeeringDB SQLite integration via peeringdb-py
- Install better-sqlite3 for zero-latency local queries
- queryPeeringDBLocal() handles all major PDB API paths locally:
  /net?asn=X, /netixlan, /netfac, /fac?id__in=, /ixfac, /ix, /ixlan
- fetchPeeringDB() now tries local SQLite first, falls back to live API
- Eliminates rate limits and reduces P99 response times dramatically
- Local DB synced daily at 03:48 via peeringdb-py cron on Erik
- Graceful fallback: if SQLite missing/corrupt, live API used transparently
2026-03-30 21:49:51 +02:00
Rene Fichtmueller
96b6ef2d4a feat: MANRS HTML scraping, AS relationships endpoint, rebrand to ASN News
- MANRS: replace broken Observatory API with public participants page scraping
  (www.manrs.org/netops/participants/), 24h cache, returns pass/fail with member count
- /api/validate: add 'relationships' field (upstreams/downstreams/top_peers)
  sourced from RIPE Stat asn-neighbours, no extra API calls needed
- /api/relationships?asn=X: new dedicated endpoint with resolved AS names,
  full upstream/downstream/peer lists sorted by power score, 10min cache
- editorial: rebrand 'The ASN Newspaper' → 'The ASN News' across index-editorial.html
2026-03-30 21:23:42 +02:00
Rene Fichtmueller
8f51f32dc3 fix: never cache null responses + increase RIPE Stat timeout for large carriers
Root cause of neighbour=0 for large carriers (AS9002, AS3491, AS12956):
1. RIPE Stat asn-neighbours returns 5000+ entries for Tier-1 carriers,
   exceeding the 30s timeout → fetchJSON returns null
2. null was cached in ripeStatCache for 15 minutes (the endpoint TTL)
3. All subsequent requests hit the null cache → perpetual 0 neighbours

Fixes:
- Never cache null results in ripeStatCache (only successful responses)
- Never persist null entries to disk cache
- Increase RIPE Stat timeout from 30s to 45s for prefix/neighbour queries
- Increase RIPE Stat semaphore from 10 to 15 concurrent requests

Verified: AS9002 up=146 down=2702, AS3491 up=90 down=710
2026-03-30 07:58:24 +02:00
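Note on 8f51f32dc3: the "never cache null" rule can be sketched as follows. This is a Python illustration of the idea only; the real ripeStatCache lives in server.js (JavaScript), and the class and method names here are hypothetical.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Tiny TTL cache that refuses to store failed (None) results."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expiry timestamp, value)

    def get_or_fetch(self, key: str, fetch: Callable[[], Optional[Any]]) -> Optional[Any]:
        hit = self._entries.get(key)
        if hit is not None and hit[0] > self.clock():
            return hit[1]
        value = fetch()
        if value is not None:
            # Only successful responses are cached, so a timeout is retried
            # on the next request instead of being served for the whole TTL.
            self._entries[key] = (self.clock() + self.ttl, value)
        return value
```

A failed fetch returns None to the caller but leaves the cache untouched, which avoids the perpetual "0 neighbours" state described above.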
Rene Fichtmueller
9bc1292bac fix: add rate-limiting semaphores to audit script
The audit script was flooding RIPE Stat and PeeringDB with unthrottled
parallel requests, causing 429 rate-limits that resulted in auth=0
false negatives (inflating the failure count).

Changes:
- Added threading.Semaphore for RIPE Stat (max 3) and PeeringDB (max 2)
- Added retry logic to _fetch_ripe (was fire-and-forget)
- Increased PDB retries from 2 to 3 with longer backoff (2s, 4s, 6s)
- Increased ASN stagger from 2s to 3s

Results: Accuracy 84% -> 87% (trend: 77% -> 87%, +10%)
2026-03-30 07:22:09 +02:00
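Note on 9bc1292bac: a minimal sketch of semaphore-throttled fetching with retry, using the limits and delays quoted in the commit message. The helper name `throttled_fetch` and its signature are hypothetical; the actual _fetch_ripe and PDB helpers in audit.py may be structured differently.

```python
import threading
import time

RIPE_SEM = threading.Semaphore(3)  # max 3 concurrent RIPE Stat requests
PDB_SEM = threading.Semaphore(2)   # max 2 concurrent PeeringDB requests
PDB_BACKOFF = (2, 4, 6)            # seconds to wait before each of 3 retries


def throttled_fetch(sem, fetch, backoff=PDB_BACKOFF, sleep=time.sleep):
    """Call fetch() under the semaphore, retrying while it returns None."""
    with sem:
        result = fetch()
    for delay in backoff:
        if result is not None:
            break
        # Sleep outside the semaphore so waiting does not block other workers.
        sleep(delay)
        with sem:
            result = fetch()
    return result
```

Passing `sleep` as a parameter is a design choice made here so the retry schedule can be exercised without real delays.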
Rene Fichtmueller
69650c1875 chore: add CHANGELOG_PENDING for 2026-03-30 session 2026-03-30 06:05:47 +02:00
Rene Fichtmueller
35e0b69442 fix: enrich - skip disambiguation pages, try first-word fallback for compound names 2026-03-30 06:04:34 +02:00
Rene Fichtmueller
0cebb1973f fix: add PeeringDB semaphore (max 5 concurrent) to prevent 429 rate-limits
Previously PDB requests fired in parallel without throttling, causing
rate-limit cascades under audit load. Now all fetchPeeringDB calls
go through a counting semaphore (max 5 concurrent requests).

Results:
- Zero 429 errors in clean test
- AS6939 HE: 327 IX connections (was 0), 338 facilities (was 2)
- AS13335 CF: 413 IX, 221 facilities, 5600 prefixes (94% RPKI valid)
- Audit: 84% accuracy (82% -> 84%, +2%), trend positive
2026-03-30 06:04:24 +02:00
Rene Fichtmueller
a0abfb3a62 fix: WHOIS defensive HTML response check, prevent Unexpected token error 2026-03-30 05:55:43 +02:00
Rene Fichtmueller
96950992df feat: add company enrichment, ASPA timeout guard, map side panel, OIM telecoms
- /api/enrich: Wikipedia + website meta scraping with redirect following
- ASPA /api/aspa: 18s hard timeout guard + 8s per-call limit
- WHOIS: defensive null check
- Map: replace popups with left side panel
- Map: OIM Telecoms fiber layer (OpenInfraMap vector tiles)
- Map layer toggles: fix source-exists early-return bug
- Provider graph: fix text colors for light background
- Network Health: defensive HTML response check
2026-03-30 05:42:38 +02:00
Rene Fichtmueller
df2e176b35 feat: 3-layer data validation cache — local ROA store, PDB cache, RIPE Stat throttling
- Phase 1: Parse ~400k ROAs from Cloudflare RPKI feed into local store
  Eliminates ALL per-prefix RIPE Stat API calls (was 2000+ per lookup)
  Binary search validation in <0.1ms instead of 1-20s HTTP roundtrip
  Disk persistence (.roa-cache.json) for fast restart

- Phase 2: PeeringDB source cache (L2) for net/netixlan/netfac
  6h TTL with LRU eviction (max 5000 entries per type)
  Disk persistence (.pdb-source-cache.json) every 30min + SIGTERM

- Phase 3: RIPE Stat semaphore (max 10 concurrent) + response cache
  Endpoint-specific TTLs (15min-24h based on change rate)
  Max 2000 cached responses, disk persistence

- Phase 4: Extended /api/health with cache status, ASPA adoption metrics
  Version bump to 0.6.0
  Jittered refresh timers to prevent thundering herd
  Graceful shutdown saves all caches

Expected: Audit accuracy 82% -> 95%+, lookup time 90s -> <8s
2026-03-30 05:18:31 +02:00
Rene Fichtmueller
08e9b8d962 fix: auto-start feedback wizard on boot, fix shell show [n] rendering 2026-03-29 16:23:34 +02:00
Rene Fichtmueller
990c989fa3 feat: terminal auto-opens on load at 75% opacity, 50% on hover 2026-03-29 16:19:15 +02:00
Rene Fichtmueller
e302c425c7 fix: move shell.peercortex.org routing before generic / handler 2026-03-29 15:49:19 +02:00
Rene Fichtmueller
58bf76fa82 feat: add terminal feedback widget + admin shell
- index-editorial.html: floating $_ terminal button (bottom-right)
  - macOS-style title bar (traffic light dots), backdrop blur 18px
  - Guided wizard: category → message → name → submit
  - POST /api/feedback with ASN context auto-filled
  - Safe DOM output builder (no innerHTML on user data)

- server.js: feedback API endpoints
  - POST /api/feedback — stores entries to feedback.json
  - GET /api/feedback?token=... — admin read (token-protected)
  - OPTIONS preflight for CORS
  - FEEDBACK_TOKEN + FEEDBACK_FILE constants from .env
  - Host routing: shell.peercortex.org → shell.html

- public/shell.html: full-screen admin terminal
  - login command → token auth via API
  - list / list [category] — tabular overview
  - show <n> — full entry detail
  - stats — bar chart by category + top ASNs
  - export — JSON file download
  - refresh, logout, clear, help
2026-03-29 15:38:24 +02:00
Rene Fichtmueller
22f219c82e feat: rebrand v2 as 'PeerCortex — The ASN Newspaper' 2026-03-29 15:27:06 +02:00
Rene Fichtmueller
6391823579 feat: add v2.peercortex.org editorial design + Host-based routing 2026-03-29 15:22:25 +02:00
Rene Fichtmueller
fae091801c feat: replace Leaflet map with MapLibre GL + global infrastructure overlays
- Upgrade from Leaflet to MapLibre GL JS 4.7.1 with OpenFreeMap dark base
- Add submarine cable layer (TeleGeography via /api/submarine-cables proxy, 24h cache)
- Add global datacenter layer (PeeringDB all facilities via /api/global-infra proxy)
- Layer toggles: ASN PoPs | Submarine Cables | Global Datacenters
- Dark-themed popup styling matching PeerCortex UI
- Server-side caching for both new data sources (24h TTL)
2026-03-29 08:37:55 +02:00
Rene Fichtmueller
e7dd9a09ce fix(rpki): replace 825k local ROA index with on-demand API + LRU cache
Root cause of 2.7GB RAM usage and 20+ OOM restart loops:
- server loaded all 825k ROAs from Cloudflare RPKI feed into a JS Map
- Every 10min refresh caused double-memory spike (old + new data) -> OOM kill

Solution:
- Remove rpkiRoaIndex Map, addRoaToIndex(), validateRPKILocal(), ipv4ToInt()
- fetchRpkiAspaFeed() now only loads ASPA objects (~1484, negligible RAM)
- Add validateRPKIWithCache(): calls RIPE Stat API per-prefix with a
  5000-entry LRU cache (6h TTL) — same API already used by fetchRPKIPerPrefix()
- Update all 4 call sites: sync .map() -> await Promise.all()

Result: 2.7GB -> ~96MB RAM, no more OOM restarts
2026-03-28 22:29:39 +08:00
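Note on e7dd9a09ce: the bounded cache that replaced the in-memory ROA index can be sketched as an LRU cache with per-entry expiry. The 5000-entry limit and 6h TTL are from the commit message; the class below is a hypothetical Python illustration, not the JavaScript used by validateRPKIWithCache().

```python
import time
from collections import OrderedDict


class LRUTTLCache:
    """LRU cache with a fixed size limit and a per-entry time to live."""

    def __init__(self, max_entries=5000, ttl_seconds=6 * 3600, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self.clock = clock
        self._data = OrderedDict()  # key -> (expiry timestamp, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires, value = entry
        if expires <= self.clock():
            del self._data[key]  # expired: drop it and report a miss
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self.clock() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the least recently used
```

Memory stays bounded by `max_entries` regardless of how many prefixes are looked up, which is the property the fix relies on.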
Rene Fichtmueller
4b2c6774fa perf(rpki): increase refresh intervals to reduce memory pressure
RPKI feed refresh: 10min -> 4h (RPKI data is stable, RIRs publish once/day)
Atlas probe refresh: 1h -> 12h (probe list rarely changes)

Frequent 825k-ROA reloads caused memory spikes on server with no swap,
triggering OOM kills and PM2 restart loops.
2026-03-28 22:29:03 +08:00
Rene Fichtmueller
f8578a2176 fix(server): catch invalid URL in HTTP handler to prevent XSS-probe crashes
new URL() throws ERR_INVALID_URL on malformed inputs like XSS probe
requests (e.g. //brusEYkk%22%3E%3Cscript%3E...). Uncaught exception
caused memory leak and process restarts. Return HTTP 400 instead.
2026-03-28 22:28:21 +08:00
Rene Fichtmueller
98b5cb1843 fix: prevent rate-limit 0-values under concurrent load
server.js: fetchPeeringDBWithRetry now makes 3 attempts (2 retries, backing
off 2s then 5s) instead of a single retry after 1.5s. Under audit load (9+
concurrent PDB requests), the longer delays let rate limits clear.

audit.py: stagger ASN submissions by 2s so PeerCortex's internal PDB
requests don't all fire simultaneously. Nightly audit takes ~8min
instead of 5min — acceptable for a midnight cron job.
2026-03-28 18:26:22 +13:00
Rene Fichtmueller
a9ee94466e perf: reduce audit concurrency to 3 to avoid PDB hammering 2026-03-28 15:16:42 +13:00
Rene Fichtmueller
711b89a09e feat: persistent known_issues tracking in ASN registry
When the same field fails in 2+ consecutive audit runs, a known_issue
entry is written into the ASN's registry profile with:
- field name, description of what's wrong
- first_seen / last_seen dates, occurrence count
- last authoritative (auth) vs PeerCortex (PC) values
- status: open (stays until PeerCortex data matches)

Report shows KNOWN ISSUES section (all open issues across registry).
Issues auto-resolve when the ASN passes, or partially resolve when
individual fields are fixed. Also stores ASN name in registry.
2026-03-28 14:02:33 +13:00
Rene Fichtmueller
87ce2ed36a fix: audit.py — distinguish PDB fetch failure from 'not in PDB'
- pdb_present=True/False/None three-state (None = fetch failed)
- Skip IX/fac comparison when PDB fetch failed (avoid false positives)
- Add retry with backoff to _fetch_pdb (2 retries, 1.5s/3s delays)
- Fix datetime.utcnow() deprecation warning
- Report PDB fetch failures separately in summary
2026-03-28 13:22:25 +13:00
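Note on 87ce2ed36a: the three-state pdb_present flag can be illustrated with a short sketch. Only the True/False/None meaning comes from the commit message; the function name, its parameters, and the returned labels are hypothetical.

```python
from typing import Optional


def compare_ix_count(pdb_present: Optional[bool], auth_ix: int, pc_ix: int) -> str:
    """Compare IX counts, honouring the three-state pdb_present flag.

    True  -> the ASN is in PeeringDB, so compare the two counts
    False -> the ASN is not in PeeringDB, so a PeerCortex count of 0 is correct
    None  -> the PeeringDB fetch failed, so there is nothing to compare against
    """
    if pdb_present is None:
        return "skipped"
    if pdb_present is False:
        return "ok" if pc_ix == 0 else "mismatch"
    return "ok" if auth_ix == pc_ix else "mismatch"
```

Checking `is None` before the boolean tests matters here, because a plain `if not pdb_present` would treat a failed fetch the same as a confirmed absence.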
Rene Fichtmueller
2b0ba18e40 feat: daily accuracy audit system with rotating ASN batches
- audit/audit.py: nightly audit runs at midnight via cron
  * Rotates through all tracked ASNs (priority: errors > never > oldest)
  * Compares PeerCortex against RIPE Stat + PeeringDB (authoritative)
  * Uses PeeringDB API key (no rate limits)
  * Marks ASNs without PeeringDB entry as peeringdb_absent (fac=0 correct)
  * Self-heal retry on timeout
  * Tracks accuracy trend over time
  * JSON registry + daily reports + human-readable latest_report.txt
- audit/deploy_audit.sh: one-shot setup script (PM2 env fix + cron)
- .gitignore: exclude ecosystem.config.js (contains env secrets)
2026-03-28 12:50:52 +13:00
Rene Fichtmueller
461021a2c7 fix: remove invalid netfac local_asn fallback (returned all records) 2026-03-28 10:58:56 +13:00
Rene Fichtmueller
e63723c2b0 fix: reliable data — retry PeeringDB/RIPE Stat, limit=1000 for IX, fallback when netId=null
- Add fetchPeeringDBWithRetry: 1 retry with 1.5s delay on null response
- Add fetchJSONWithRetry: 1 retry for RIPE Stat prefixes + neighbours
- Log HTTP 429 from PeeringDB instead of silently swallowing it
- Add &limit=1000 to netixlan/netfac queries (prevents truncation at 250)
- Fall back to asn= / local_asn= queries when PeeringDB net lookup fails
  (previously: netId=null → IX=0, fac=0 for ~22 ASNs)
2026-03-28 10:54:39 +13:00
Rene Fichtmueller
036ca861ae fix: bgp.he.net scraper + peering recommendations
bgp.he.net scraper:
- Fixed prefix regex: "Prefixes Originated (v4): 147" format
- Fixed peer regex: "BGP Peers Observed (all): 274" format
- Added prefixes_all field
- AS6830: v4=147, v6=9, peers=274 (was all unavailable)
- Prefix cross-check now works: RIPE 151 vs HE 156 = 97% agreement

Peering Recommendations:
- Now filters out already-established peering sessions
- 3 categories: New Opportunities, Already Peering, No Shared IXP
- Uses BGP neighbour data to detect existing sessions
- Shows "Already peering with all top networks" when applicable
2026-03-28 02:32:50 +13:00
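Note on 036ca861ae: a sketch of parsing the two text formats quoted in the commit message, plus one possible way to compute the cross-check agreement. The regular expressions match only the quoted strings; the "(v6)" variant, the output keys other than prefixes_all, and the min/max agreement formula are assumptions, although that formula does reproduce the 97% figure for 151 vs 156.

```python
import re

# Patterns for lines such as "Prefixes Originated (v4): 147" and
# "BGP Peers Observed (all): 274"; a "(v6)" line is assumed to look the same.
PREFIX_RE = re.compile(r"Prefixes Originated \((v4|v6|all)\):\s*([\d,]+)")
PEERS_RE = re.compile(r"BGP Peers Observed \((v4|v6|all)\):\s*([\d,]+)")


def parse_counts(text: str) -> dict:
    """Extract prefix and peer counts from page text into a flat dict."""
    counts = {}
    for kind, number in PREFIX_RE.findall(text):
        counts[f"prefixes_{kind}"] = int(number.replace(",", ""))
    for kind, number in PEERS_RE.findall(text):
        counts[f"peers_{kind}"] = int(number.replace(",", ""))
    return counts


def agreement(a: int, b: int) -> float:
    """Ratio of the smaller count to the larger one (assumed formula)."""
    largest = max(a, b)
    return min(a, b) / largest if largest else 1.0
```

For example, `round(agreement(151, 156) * 100)` gives 97.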
Rene Fichtmueller
f21a8bbba6 feat: Score Breakdown section + fix URL parsing crash
Dashboard: Added "Score Breakdown — Why X/100?" section showing:
- Per-check weight, earned points, and reason
- Total calculation with formula explanation
- Data source attribution
- "info" status excluded from scoring (e.g. MANRS API auth)

Security: try-catch around new URL() parser — malformed URLs from
scanner bots (XSS attempts) now return 400 instead of crashing server.
Was causing repeated crashes from automated vulnerability scanners.
2026-03-28 02:24:51 +13:00
Rene Fichtmueller
5e375fd33d fix: route server threshold, rDNS sample size, IX query reliability
- Route Server: threshold lowered from 20 to 10 IX for "bilateral policy" pass.
  3-9 IX without RS = "info" (not warning). <3 IX = warning.
  AS212635: 19 IX → pass (was warning)
- rDNS: sample size increased from 5 to min(20, total_prefixes)
  Better coverage for large networks (AS13335: was 5/5621 = 0.09%)
- IX Route Server: always use asn= query (more reliable than net_id when PDB rate-limits)
  AS212635: 0 → 19 IX connections correctly detected

AS212635 score: 98 → 100/100
2026-03-28 02:18:56 +13:00
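Note on 5e375fd33d: the new route server thresholds amount to a small decision rule, sketched below. The IX-count boundaries are from the commit message; the function name and the assumption that any network using a route server passes outright are hypothetical.

```python
def route_server_status(ix_count: int, uses_route_server: bool) -> str:
    """Classify the route server check from IX count and RS usage."""
    if uses_route_server:
        return "pass"  # assumption: RS usage always satisfies the check
    if ix_count >= 10:
        return "pass"  # many IXes without RS: treated as bilateral policy
    if ix_count >= 3:
        return "info"  # 3-9 IXes without RS: neutral, not a warning
    return "warning"   # fewer than 3 IXes and no RS
```

Under this rule a network with 19 IX connections and no route server passes, as reported for AS212635.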
Rene Fichtmueller
0eaad0034f fix: 6 validation improvements from user feedback (AS212635)
1. MANRS: API requires auth → now shows "info" (unable to verify)
   instead of false "not a participant". Excludes from scoring.
2. BGP Visibility: switched from broken visibility API to
   routing-status API. AS212635: 0/0 → 327/327 v4, 319/320 v6
3. Reverse DNS: fixed response parsing (object vs array format).
   AS212635: 0% → 100% coverage
4. ASPA: upstream vs peer classification using power heuristic.
   >10% of max power = likely_upstream, rest = likely_peer.
   AS212635: 53 "providers" → 6 likely_upstream + 47 likely_peer
5. Geolocation: global networks properly detected
6. Score: "info" status excluded from scoring (neutral)

AS212635 score: ~70 → 98/100
2026-03-28 01:49:00 +13:00
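Note on 0eaad0034f: the power heuristic from item 4 can be sketched as a threshold relative to the strongest neighbour. The 10% cutoff and the two labels come from the commit message; the function name and the input mapping of ASN to power value are hypothetical.

```python
def classify_neighbours(power_by_asn: dict, threshold: float = 0.10) -> dict:
    """Label each neighbour by its power relative to the strongest one.

    Neighbours above `threshold` times the maximum power are labelled
    "likely_upstream"; all others are labelled "likely_peer".
    """
    if not power_by_asn:
        return {}
    cutoff = threshold * max(power_by_asn.values())
    return {
        asn: "likely_upstream" if power > cutoff else "likely_peer"
        for asn, power in power_by_asn.items()
    }
```

Because the cutoff scales with the maximum, the split adapts to networks of different sizes, but it remains a heuristic rather than a verified provider relationship.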
Rene Fichtmueller
fd7b2cdb64 fix: validation accuracy for global/anycast networks
- Geolocation: global networks (5+ facility countries) now get pass
  even when MaxMind has no data (was warning)
- Route Server: uses ASN fallback when PeeringDB net_id unavailable
  (was showing "0 IX connections" due to rate limiting)
- IX geocode fallback: CITY_COORDS map + IX_CITY_MAP for 70+ cities

AS49544 (i3D.net/Ubisoft): 100 IX connections correctly detected,
bilateral peering policy recognized, 27-country global presence pass
2026-03-28 01:16:36 +13:00
Rene Fichtmueller
d1825fe327 fix: missing closing brace in renderNetworkMap broke all JS
renderNetworkMap() was missing its closing } after the setTimeout(50)
callback. This caused a SyntaxError that prevented the entire script
from parsing — doLookup was undefined, Lookup button did nothing.

Also added deploy.sh backup script on Erik (auto-backup before restart,
keeps last 20 versions of server.js + index.html).
2026-03-28 01:00:51 +13:00
Rene Fichtmueller
404aef5085 feat: IX location geocode fallback for Network Footprint Map
IXPs without PeeringDB facility coordinates now get geocoded via:
1. City name extraction from IX name (e.g. "France-IX Paris" → Paris)
2. Hard-coded IX ID → city map for 15 well-known IXPs (SwissIX→Zurich etc.)
3. 70+ major networking cities with lat/lon coordinates

AS8283 Coloclue: 9 → 12 IX locations (5 cities: AMS, FRA, Paris, Zurich, Meppel)
AS49544 i3D.net: 100 connections → 20 locations (16 cities worldwide)
2026-03-28 00:52:07 +13:00
Rene Fichtmueller
33d6a84d47 fix: map tiles + PeeringDB rate limit resilience
- Leaflet map: double requestAnimationFrame after display:none removal
  ensures container has real dimensions before L.map() init
- PeeringDB org cache: 24h disk cache (.pdb-org-cache.json) prevents
  hammering PeeringDB API on server restarts (was causing 175 restarts)
- Check HTTP status before JSON.parse on PDB responses
2026-03-27 23:31:32 +13:00
Rene Fichtmueller
f8784bbcec fix: Leaflet map tile rendering in collapsed containers
invalidateSize() + refitBounds after 200ms delay fixes tiles only loading
in top-left corner when map card was initially hidden
2026-03-27 23:19:57 +13:00
Rene Fichtmueller
9aeffda8d1 feat: interactive network footprint map with Leaflet.js
- Leaflet.js (CDN) with CartoDB Dark Matter tiles matching Tokyo Night theme
- Cyan markers: facility/datacenter locations with name + city popup
- Orange markers: IX presence with IX name + speed popup
- Purple connecting lines between facilities in the same country
- Coordinates from PeeringDB facility API (batch lookup, chunked)
- IX locations via ixfac association + facility geocoding
- Auto-fit bounds, graceful degradation if no coordinates
- Collapsible card, XSS-safe popups via DOM API
2026-03-27 11:28:14 +13:00
Rene Fichtmueller
13c5152bf9 feat: multi-source data validation with confidence scoring
- RPKI cross-check: Cloudflare RPKI feed + RIPE NCC Validator API (5 sample prefixes)
- Prefix cross-check: RIPE Stat vs bgp.he.net count comparison
- Neighbour cross-check: RIPE Stat vs bgp.he.net peer data
- Data Quality badge in dashboard (High/Medium/Low confidence)
- Hover tooltip: "Data Quality Report" with per-source agreement breakdown
- Added BETA tag to site header and version string (v0.5.0-beta)
- All UI text in English
2026-03-27 10:22:10 +13:00
Rene Fichtmueller
6fdda92757 fix: critical data accuracy fixes from NOG community feedback
RPKI Validation:
- Validate ALL prefixes (not sample of 10) using local Cloudflare RPKI feed
- Covers all 5 RIRs globally (RIPE, APNIC, ARIN, LACNIC, AFRINIC)
- Indexed ROA lookup (O(bucket) not O(824K)) for instant validation
- AS4739 now correctly shows 446/446 prefixes checked

ASPA Provider Detection:
- Only RIPE Stat "left" neighbours (verified upstreams) used as providers
- AS-path analysis used for frequency confirmation only, not as provider source
- Fixes false provider detection that included peers alongside upstreams

Multi-RIR Support:
- WHOIS/IRR queries all 5 RIR databases via RDAP in parallel
- RPSL validation checks RIPE + APNIC/ARIN/LACNIC/AFRINIC
- AS4739 (APNIC) now correctly found via rdap.apnic.net

Geolocation:
- Anycast/CDN networks (5+ facility countries or Content/NSP type) not flagged
- Only small networks with geo anomalies get warnings

Route Server Scoring:
- Networks with 20+ IX connections and no RS scored as "pass" (bilateral policy)
- Only small networks without RS get warnings

Error Handling:
- ASPA endpoints gracefully handle timeouts (show fallback instead of HTML parse error)
- Frontend checks Content-Type before JSON.parse

Reported by Philip Smith, Richard Steenbergen, Jared Mauch, Chris Malayter
2026-03-27 10:06:17 +13:00
Rene Fichtmueller
3adc34c42b feat: Lia's Paradise country data fix + file upload
- Fix org→country mapping: pre-cache 20k+ PeeringDB orgs at startup
- 22k networks now have country codes (was 0 before)
- Add file upload: CSV/TXT/PDF/XLS/DOC → ASN extraction → probe check
- Export file results as printable PDF
2026-03-27 01:40:03 +13:00
Rene Fichtmueller
41af8be7f4 feat: Lia's Paradise, bug fixes, company descriptions
- Add /lia Easter egg page: RIPE Atlas coverage explorer showing
  34k+ networks grouped by country with probe/no-probe status,
  RIR filtering, search, and PDF export
- Add /api/lia/coverage endpoint combining PeeringDB + Atlas data
- Fix Provider Relationship Graph (renamed var to avoid shadowing)
- Fix ROV/ASPA double-value display (show worst single status)
- Add fallback: render provider graph from lookup data when ASPA fails
- Add company description (org_name) to Network Overview
- Add worstStatus() helper for frontend badge normalization
2026-03-27 01:32:30 +13:00
Rene Fichtmueller
dee5871609 fix: PeeringDB API key + User-Agent WAF fix + bgproutes.io visibility fallback 2026-03-27 00:22:08 +13:00