27 Commits

Author SHA1 Message Date
Rene Fichtmueller
e302c425c7 fix: move shell.peercortex.org routing before generic / handler 2026-03-29 15:49:19 +02:00
Rene Fichtmueller
58bf76fa82 feat: add terminal feedback widget + admin shell
- index-editorial.html: floating \$_ terminal button (bottom-right)
  - macOS-style title bar (traffic light dots), backdrop blur 18px
  - Guided wizard: category → message → name → submit
  - POST /api/feedback with ASN context auto-filled
  - Safe DOM output builder (no innerHTML on user data)

- server.js: feedback API endpoints
  - POST /api/feedback — stores entries to feedback.json
  - GET /api/feedback?token=... — admin read (token-protected)
  - OPTIONS preflight for CORS
  - FEEDBACK_TOKEN + FEEDBACK_FILE constants from .env
  - Host routing: shell.peercortex.org → shell.html

- public/shell.html: full-screen admin terminal
  - login command → token auth via API
  - list / list [category] — tabular overview
  - show <n> — full entry detail
  - stats — bar chart by category + top ASNs
  - export — JSON file download
  - refresh, logout, clear, help
2026-03-29 15:38:24 +02:00
Rene Fichtmueller
6391823579 feat: add v2.peercortex.org editorial design + Host-based routing 2026-03-29 15:22:25 +02:00
Rene Fichtmueller
fae091801c feat: replace Leaflet map with MapLibre GL + global infrastructure overlays
- Upgrade from Leaflet to MapLibre GL JS 4.7.1 with OpenFreeMap dark base
- Add submarine cable layer (TeleGeography via /api/submarine-cables proxy, 24h cache)
- Add global datacenter layer (PeeringDB all facilities via /api/global-infra proxy)
- Layer toggles: ASN PoPs | Submarine Cables | Global Datacenters
- Dark-themed popup styling matching PeerCortex UI
- Server-side caching for both new data sources (24h TTL)
2026-03-29 08:37:55 +02:00
Rene Fichtmueller
e7dd9a09ce fix(rpki): replace 825k local ROA index with on-demand API + LRU cache
Root cause of 2.7GB RAM usage and 20+ OOM restart loops:
- server loaded all 825k ROAs from Cloudflare RPKI feed into a JS Map
- Every 10min refresh caused double-memory spike (old + new data) -> OOM kill

Solution:
- Remove rpkiRoaIndex Map, addRoaToIndex(), validateRPKILocal(), ipv4ToInt()
- fetchRpkiAspaFeed() now only loads ASPA objects (~1484, negligible RAM)
- Add validateRPKIWithCache(): calls RIPE Stat API per-prefix with a
  5000-entry LRU cache (6h TTL) — same API already used by fetchRPKIPerPrefix()
- Update all 4 call sites: sync .map() -> await Promise.all()

Result: 2.7GB -> ~96MB RAM, no more OOM restarts
2026-03-28 22:29:39 +08:00
Rene Fichtmueller
4b2c6774fa perf(rpki): increase refresh intervals to reduce memory pressure
RPKI feed refresh: 10min -> 4h (RPKI data is stable, RIRs publish once/day)
Atlas probe refresh: 1h -> 12h (probe list rarely changes)

Frequent 825k-ROA reloads caused memory spikes on server with no swap,
triggering OOM kills and PM2 restart loops.
2026-03-28 22:29:03 +08:00
Rene Fichtmueller
f8578a2176 fix(server): catch invalid URL in HTTP handler to prevent XSS-probe crashes
new URL() throws ERR_INVALID_URL on malformed inputs like XSS probe
requests (e.g. //brusEYkk%22%3E%3Cscript%3E...). Uncaught exception
caused memory leak and process restarts. Return HTTP 400 instead.
2026-03-28 22:28:21 +08:00
Rene Fichtmueller
98b5cb1843 fix: prevent rate-limit 0-values under concurrent load
server.js: fetchPeeringDBWithRetry now does 3 attempts with exponential
backoff (2s, 5s) instead of 1 retry at 1.5s. Under audit load (9+
concurrent PDB requests), the longer delays let rate limits clear.

audit.py: stagger ASN submissions by 2s so PeerCortex's internal PDB
requests don't all fire simultaneously. Nightly audit takes ~8min
instead of 5min — acceptable for a midnight cron job.
2026-03-28 18:26:22 +13:00
Rene Fichtmueller
461021a2c7 fix: remove invalid netfac local_asn fallback (returned all records) 2026-03-28 10:58:56 +13:00
Rene Fichtmueller
e63723c2b0 fix: reliable data — retry PeeringDB/RIPE Stat, limit=1000 for IX, fallback when netId=null
- Add fetchPeeringDBWithRetry: 1 retry with 1.5s delay on null response
- Add fetchJSONWithRetry: 1 retry for RIPE Stat prefixes + neighbours
- Log HTTP 429 from PeeringDB instead of silently swallowing it
- Add &limit=1000 to netixlan/netfac queries (prevents truncation at 250)
- Fall back to asn= / local_asn= queries when PeeringDB net lookup fails
  (previously: netId=null → IX=0, fac=0 for ~22 ASNs)
2026-03-28 10:54:39 +13:00
Rene Fichtmueller
036ca861ae fix: bgp.he.net scraper + peering recommendations
bgp.he.net scraper:
- Fixed prefix regex: "Prefixes Originated (v4): 147" format
- Fixed peer regex: "BGP Peers Observed (all): 274" format
- Added prefixes_all field
- AS6830: v4=147, v6=9, peers=274 (was all unavailable)
- Prefix cross-check now works: RIPE 151 vs HE 156 = 97% agreement

Peering Recommendations:
- Now filters out already-established peering sessions
- 3 categories: New Opportunities, Already Peering, No Shared IXP
- Uses BGP neighbour data to detect existing sessions
- Shows "Already peering with all top networks" when applicable
2026-03-28 02:32:50 +13:00
Rene Fichtmueller
f21a8bbba6 feat: Score Breakdown section + fix URL parsing crash
Dashboard: Added "Score Breakdown — Why X/100?" section showing:
- Per-check weight, earned points, and reason
- Total calculation with formula explanation
- Data source attribution
- "info" status excluded from scoring (e.g. MANRS API auth)

Security: try-catch around new URL() parser — malformed URLs from
scanner bots (XSS attempts) now return 400 instead of crashing server.
Was causing repeated crashes from automated vulnerability scanners.
2026-03-28 02:24:51 +13:00
Rene Fichtmueller
5e375fd33d fix: route server threshold, rDNS sample size, IX query reliability
- Route Server: threshold lowered from 20 to 10 IX for "bilateral policy" pass.
  3-9 IX without RS = "info" (not warning). <3 IX = warning.
  AS212635: 19 IX → pass (was warning)
- rDNS: sample size increased from 5 to min(20, total_prefixes)
  Better coverage for large networks (AS13335: was 5/5621 = 0.09%)
- IX Route Server: always use asn= query (more reliable than net_id when PDB rate-limits)
  AS212635: 0 → 19 IX connections correctly detected

AS212635 score: 98 → 100/100
2026-03-28 02:18:56 +13:00
Rene Fichtmueller
0eaad0034f fix: 6 validation improvements from user feedback (AS212635)
1. MANRS: API requires auth → now shows "info" (unable to verify)
   instead of false "not a participant". Excludes from scoring.
2. BGP Visibility: switched from broken visibility API to
   routing-status API. AS212635: 0/0 → 327/327 v4, 319/320 v6
3. Reverse DNS: fixed response parsing (object vs array format).
   AS212635: 0% → 100% coverage
4. ASPA: upstream vs peer classification using power heuristic.
   >10% of max power = likely_upstream, rest = likely_peer.
   AS212635: 53 "providers" → 6 likely_upstream + 47 likely_peer
5. Geolocation: global networks properly detected
6. Score: "info" status excluded from scoring (neutral)

AS212635 score: ~70 → 98/100
2026-03-28 01:49:00 +13:00
Rene Fichtmueller
fd7b2cdb64 fix: validation accuracy for global/anycast networks
- Geolocation: global networks (5+ facility countries) now get pass
  even when MaxMind has no data (was warning)
- Route Server: uses ASN fallback when PeeringDB net_id unavailable
  (was showing "0 IX connections" due to rate limiting)
- IX geocode fallback: CITY_COORDS map + IX_CITY_MAP for 70+ cities

AS49544 (i3D.net/Ubisoft): 100 IX connections correctly detected,
bilateral peering policy recognized, 27-country global presence pass
2026-03-28 01:16:36 +13:00
Rene Fichtmueller
d1825fe327 fix: missing closing brace in renderNetworkMap broke all JS
renderNetworkMap() was missing its closing } after the setTimeout(50)
callback. This caused a SyntaxError that prevented the entire script
from parsing — doLookup was undefined, Lookup button did nothing.

Also added deploy.sh backup script on Erik (auto-backup before restart,
keeps last 20 versions of server.js + index.html).
2026-03-28 01:00:51 +13:00
Rene Fichtmueller
404aef5085 feat: IX location geocode fallback for Network Footprint Map
IXPs without PeeringDB facility coordinates now get geocoded via:
1. City name extraction from IX name (e.g. "France-IX Paris" → Paris)
2. Hard-coded IX ID → city map for 15 well-known IXPs (SwissIX→Zurich etc.)
3. 70+ major networking cities with lat/lon coordinates

AS8283 Coloclue: 9 → 12 IX locations (5 cities: AMS, FRA, Paris, Zurich, Meppel)
AS49544 i3D.net: 100 connections → 20 locations (16 cities worldwide)
2026-03-28 00:52:07 +13:00
Rene Fichtmueller
33d6a84d47 fix: map tiles + PeeringDB rate limit resilience
- Leaflet map: double requestAnimationFrame after display:none removal
  ensures container has real dimensions before L.map() init
- PeeringDB org cache: 24h disk cache (.pdb-org-cache.json) prevents
  hammering PeeringDB API on server restarts (was causing 175 restarts)
- Check HTTP status before JSON.parse on PDB responses
2026-03-27 23:31:32 +13:00
Rene Fichtmueller
9aeffda8d1 feat: interactive network footprint map with Leaflet.js
- Leaflet.js (CDN) with CartoDB Dark Matter tiles matching Tokyo Night theme
- Cyan markers: facility/datacenter locations with name + city popup
- Orange markers: IX presence with IX name + speed popup
- Purple connecting lines between facilities in the same country
- Coordinates from PeeringDB facility API (batch lookup, chunked)
- IX locations via ixfac association + facility geocoding
- Auto-fit bounds, graceful degradation if no coordinates
- Collapsible card, XSS-safe popups via DOM API
2026-03-27 11:28:14 +13:00
Rene Fichtmueller
13c5152bf9 feat: multi-source data validation with confidence scoring
- RPKI cross-check: Cloudflare RPKI feed + RIPE NCC Validator API (5 sample prefixes)
- Prefix cross-check: RIPE Stat vs bgp.he.net count comparison
- Neighbour cross-check: RIPE Stat vs bgp.he.net peer data
- Data Quality badge in dashboard (High/Medium/Low confidence)
- Hover tooltip: "Data Quality Report" with per-source agreement breakdown
- Added BETA tag to site header and version string (v0.5.0-beta)
- All UI text in English
2026-03-27 10:22:10 +13:00
Rene Fichtmueller
6fdda92757 fix: critical data accuracy fixes from NOG community feedback
RPKI Validation:
- Validate ALL prefixes (not sample of 10) using local Cloudflare RPKI feed
- Covers all 5 RIRs globally (RIPE, APNIC, ARIN, LACNIC, AFRINIC)
- Indexed ROA lookup (O(bucket) not O(824K)) for instant validation
- AS4739 now correctly shows 446/446 prefixes checked

ASPA Provider Detection:
- Only RIPE Stat "left" neighbours (verified upstreams) used as providers
- AS-path analysis used for frequency confirmation only, not as provider source
- Fixes false provider detection that included peers alongside upstreams

Multi-RIR Support:
- WHOIS/IRR queries all 5 RIR databases via RDAP in parallel
- RPSL validation checks RIPE + APNIC/ARIN/LACNIC/AFRINIC
- AS4739 (APNIC) now correctly found via rdap.apnic.net

Geolocation:
- Anycast/CDN networks (5+ facility countries or Content/NSP type) not flagged
- Only small networks with geo anomalies get warnings

Route Server Scoring:
- Networks with 20+ IX connections and no RS scored as "pass" (bilateral policy)
- Only small networks without RS get warnings

Error Handling:
- ASPA endpoints gracefully handle timeouts (show fallback instead of HTML parse error)
- Frontend checks Content-Type before JSON.parse

Reported by Philip Smith, Richard Steenbergen, Jared Mauch, Chris Malayter
2026-03-27 10:06:17 +13:00
Rene Fichtmueller
976bdb48e4 perf: optimize compare endpoint + add caching everywhere
- Compare: all API calls in single parallel batch (was sequential)
- Compare: RPKI sample reduced to 3+3 prefixes with 5s timeout cap
- Compare: response caching (5min TTL)
- Compare: AS name resolution parallel with 3s timeout
- Result: Compare from timeout (>20s) to ~5s first call, <1s cached
2026-03-26 12:54:22 +13:00
Rene Fichtmueller
267943b647 feat: performance fixes + deploy directory with live dashboard
- Add response caching (5min TTL for lookups, 10min for ASPA)
- Add 8s timeout to all external API fetches
- RPKI validation: sample max 10 prefixes (5 v4 + 5 v6) instead of 50
- Run all PeeringDB + RIPE Stat calls in single parallel batch
- Resolve AS names in parallel with 3s timeout cap
- Add deploy/ directory with production server.js + index.html dashboard
- Landing page: Tokyo Night dark theme, interactive ASN search
- 15 API endpoints: lookup, aspa, aspa/verify, bgproutes, validate,
  compare, peers/find, prefix/detail, ix/detail, topology, whois, health
- Features: RPKI per-prefix, RIPE Atlas probes, Network Health Report,
  ASPA RFC verification engine, Provider Relationship Graph
2026-03-26 12:50:54 +13:00
Rene Fichtmueller
967a0a827b fix: resolve AS names via RIPE Stat AS overview API 2026-03-26 11:26:58 +13:00
Rene Fichtmueller
405bfd01c7 fix: resolve double ASN display in ASPA provider badges 2026-03-26 11:20:02 +13:00
Rene Fichtmueller
cdf21b9e8e feat: add RIPE Atlas probe integration to dashboard
- Query RIPE Atlas API for probes in the looked-up ASN
- Display probe count, connected/disconnected status, anchors
- Expandable probe detail table with links to atlas.ripe.net
- Connection ratio progress bar
- "Host a probe?" prompt for networks without Atlas presence
2026-03-26 11:14:41 +13:00
Rene Fichtmueller
fc58394555 feat: complete dashboard with ASPA, bgproutes.io, enhanced RPKI
- Full network intelligence dashboard (777-line HTML)
- ASPA Intelligence: provider detection, object generator, path analysis
- bgproutes.io integration: 3293 vantage points, RIB queries, ROV+ASPA status
- Enhanced RPKI: per-prefix validation, coverage percentage, expandable details
- Enhanced Compare: common upstreams, RPKI coverage comparison
- API endpoints: /api/lookup, /api/aspa, /api/bgproutes, /api/compare, /api/health
- All data sources queried in parallel for speed
- Tokyo Night dark theme, responsive, loading states
2026-03-26 10:23:44 +13:00