387 Commits

Author SHA1 Message Date
Rene Fichtmueller
653824f23b fix: Cisco line card URL mapping (8800/84/86 → 8000 family page, skip ASR9K logo-only) 2026-04-21 00:49:32 +02:00
Rene Fichtmueller
c9333ab5ea fix: MikroTik hardcoded slug map for + models (crs305/312/317/326) 2026-04-21 00:45:41 +02:00
Rene Fichtmueller
9618a4f0e0 fix: Cisco 8000 builder URL + MikroTik lowercase + new vendor builders
URL builder fixes:
- Cisco 8000: update to new /site/us/en/ URL scheme (family page, not per-model)
- MikroTik: fix to lowercase+underscore format (was uppercase, caused 404)
- Fortinet: set to null — JS-rendered pages, all redirect to generic page
- Alcatel-Lucent Enterprise slug added to dispatcher (was missing, caused 0 hits)
- Add Quanta, Allied Telesis, Ufispace, Netgear URL builders
- NVIDIA: skip ConnectX/BlueField non-switch models

Migration 044:
- Clear 35 wrong NCS-5500 URLs from Cisco 8000-series models
- Pre-set correct 8000-series family URL for 21 models without images
2026-04-21 00:41:31 +02:00
Rene Fichtmueller
f340482752 fix: monitor-erik.sh — correct Erik SSH target + fix awk header skip 2026-04-21 00:34:28 +02:00
Rene Fichtmueller
9e6be570a3 feat: more switch image coverage + system health metrics + Erik monitor
switch-image-fetcher:
- Add Fortinet URL builder (11 FortiSwitch models)
- Add Quanta Cloud Technology, Allied Telesis, Ufispace, Netgear URL builders
- Fix alcatel-lucent-enterprise slug missing from URL_BUILDERS dispatcher
- Fix NVIDIA builder to skip ConnectX/BlueField adapters (not switches)
- Add aruba slug alias for hpe-aruba

health endpoint:
- Add system metrics: CPU load (1/5/15m), memory usage, disk usage
- Add load_status indicator (ok/busy/overloaded)
- Expose process RSS memory
- Used by external monitors

scripts/monitor-erik.sh:
- Cron-ready health check script for Claudi (.82) and Raspberry Pis
- Checks TIP API health endpoint (load, memory, disk, DB latency)
- Checks PM2 process state via SSH (errored/stopped detection)
- ntfy.sh push notifications (set NTFY_TOPIC env var)
- Includes systemd service + timer unit comments for auto-install
2026-04-21 00:31:43 +02:00
Rene Fichtmueller
823b64bd24 perf: load-aware scraper guard + higher rate limits + /tmp crawlee storage 2026-04-20 23:35:02 +02:00
Rene Fichtmueller
a2492d833b feat: Flexoptix order section per switch + reject generic/logo images 2026-04-20 23:31:36 +02:00
Rene Fichtmueller
0a60d821fb fix: expand compatibility.verification_method CHECK to include vendor_compat + spec_match 2026-04-20 23:23:23 +02:00
Rene Fichtmueller
4f277df703 fix(migration-041): use 'manual' doc_type instead of config_guide (CHECK constraint) 2026-04-20 23:17:41 +02:00
Rene Fichtmueller
dfadc8eb4e fix: add certifications column to switches table (migration 039a) 2026-04-20 23:16:52 +02:00
Rene Fichtmueller
5737ae0362 ui: update Finder quick examples to use actual seeded switch models
Replace placeholder models (93180YC-FX3, 7280R3A, QFX5120 — not in DB) with
the real seeded switches: N9K-C9364C, 93600CD-GX, 7060CX2-32S, QFX5130-32CD, SN5600
2026-04-20 23:00:36 +02:00
Rene Fichtmueller
ab059c2fd1 fix(community-issues): scrapeTransceiverCompatIssues falls back to ports_config when no compat entries 2026-04-20 23:00:00 +02:00
Rene Fichtmueller
8cf19e9b78 data: migration 041 — seed switch datasheets (Cisco, Arista, Juniper, NVIDIA)
Seeds known-good official datasheet PDFs and config guide URLs for the 24 initial
seeded switches. Directly visible in switch detail panel → Datasheets & Manuals.
2026-04-20 22:58:18 +02:00
Rene Fichtmueller
21f5250353 feat: switch facts — migration 040 seeds power/weight/certifications + dashboard shows them
- Migration 040: seeds rack_units, typical_power_w, max_power_w, weight_kg, certifications
  for 23 initial switches (Cisco Nexus/Catalyst, Arista, Juniper, NVIDIA, Edgecore/Celestica/Asterfusion)
- Dashboard: Specifications section now shows Typical Power, Weight, Certifications (colored pills)
2026-04-20 22:56:53 +02:00
Rene Fichtmueller
307a5ea38a chore: update changelog for 2026-04-20 switch image + compat features 2026-04-20 22:53:24 +02:00
Rene Fichtmueller
c0cd0dc1ca feat: compatibility panel — verification_method, competitor prices, spec-match collapsible
- API: getCompatibleTransceivers() returns verification_method, orders vendor_compat first
- Dashboard: Flexoptix section splits vendor-tested vs spec-match (collapsed)
- Dashboard: Competitor section shows vendor-tested with prices, spec-match as chips
2026-04-20 22:52:49 +02:00
Rene Fichtmueller
4bf5c95824 feat: Flexoptix compatibility scraper + transceiver issue scanner
- Add flexoptix-compat.ts: maps switch models to compatible Flexoptix transceivers
  via search API (vendor_compat) with form-factor fallback (spec_match)
  Scheduled daily at 09:00 UTC as scrape:compat:flexoptix
- Enhance community-issues.ts: add vendor advisory sources (Cisco Field Notices,
  Juniper KB, SONiC GitHub Issues) + new scrapeTransceiverCompatIssues() that
  searches for switch+transceiver combination problems specifically
- Scheduler: 59 schedules, 78 workers
2026-04-20 22:50:57 +02:00
Rene Fichtmueller
a0a7a97d83 feat: switch image fetcher + og:image scheduler job + dashboard thumbnail column
- Add switch-image-fetcher.ts: og:image-based image discovery for all 86 seeded switches
  (covers Cisco, Arista, Juniper, NVIDIA, Edgecore, Celestica, Asterfusion, Dell,
   HPE/Aruba, Huawei, Nokia, Extreme, MikroTik, Ubiquiti, FS.COM, Supermicro)
- Wire fetchSwitchImages() into scheduler as scrape:images:switches (daily 08:30 UTC)
- Dashboard: add 48px thumbnail column to switch table (lazy img with gear icon fallback)
2026-04-20 22:44:08 +02:00
Rene Fichtmueller
aa91798e8d fix(vcelink): resolve TS 5.9 narrowing quirk with explicit cast in dead code
price?: number narrowing via typeof/!== undefined does not work for
arithmetic comparisons in TypeScript 5.9 dead code paths; use 'as number'
cast to keep the dead code compilable while the early-return guard above
prevents runtime execution entirely.
2026-04-20 22:18:13 +02:00
Rene Fichtmueller
1aba912a15 fix(scrapers): fix ATGBics theme migration, NADDOD URL, disable VCELink
- ATGBics: update HTML parser from old card--product theme to new
  card__info theme (Shopify template changed April 2026); name now
  extracted from href link text instead of aria-label
- NADDOD: correct ensureVendor shop URL from /collections/transceivers
  (404) to /collection/optical-transceivers
- VCELink: disable scraper — site pivoted from optical transceivers to
  audio/video/cable products; all collection URLs return 404
2026-04-20 22:11:24 +02:00
Rene Fichtmueller
ca943f1f86 ui: comprehensive DEMO/MODELL tagging across all dashboard sections with synthetic data
- Stock tab nav: ⚠DEMO badge
- Stock section subtitle: clarify prices=real vs. Lager/Verkauf=DEMO
- Stat cards: DE-Lager, Global-Lager, Nachlieferung labels tagged [DEMO]
- Recently Restocked header: DEMO DATA badge
- Stock detail lookup: [demo] inline on all warehouse/units_sold fields
- Top Sellers: already tagged (previous commit)
- Procurement > Reorder Signals: DEMO DATA banner (based on synthetic ABC data)
- Procurement > ABC Classification: DEMO DATA banner
- Hype Cycle: MODELL badge on header (Norton-Bass = mathematical estimate)
- Hype Cycle table: Adoption/Peak/To Plateau columns tagged [M] = Modell
- Hype Cycle legend: explains [M] vs real data
- Market Intelligence + Lifecycle Events: no tag (real scraped data)
2026-04-20 21:52:10 +02:00
Rene Fichtmueller
9f3cd46f9c ui: mark Top Sellers widget data as DEMO (synthetic seed data, not real sales) 2026-04-20 21:44:33 +02:00
Rene Fichtmueller
0fb4850dfa fix: price-comparison SKU lookup — wrong column refs (so.stock_level, search_url_template) 2026-04-19 00:12:18 +02:00
Rene Fichtmueller
b0ed54f386 feat: register fiber24 + fibermall in index, move atgbics to fetch-only section 2026-04-18 22:50:52 +02:00
Rene Fichtmueller
cb5a587d7e feat: rewrite ATGBICS scraper — static HTML, correct collection handles, GBP cookie
- Replaces Playwright with pure fetch() — static HTML has prices
- Correct collection handles (compatible-transceivers-sfpp-10g etc.)
- Cookie: cart_currency=GBP forces GBP pricing from any geo-IP
- Handles 35+ pages per category × 24 products = 840+ SFP+ products
- No IP-blocking with static HTML (Playwright was the trigger)
- Adds scripts/run-atgbics-mac.sh for Mac-side runner if needed
2026-04-18 22:48:29 +02:00
Rene Fichtmueller
785a6731ab fix: fiber24 stockLevel on_request (was unknown — violated DB constraint) 2026-04-18 22:26:45 +02:00
Rene Fichtmueller
d4ad9f4641 fix: ShopFiber24 sitemap-based scraping + Fibermall image extraction
ShopFiber24 (fiber24.ts):
- Complete rewrite: was using JS-rendered catalog (all prices = 0)
- New strategy: fetch sitemap_0.xml.gz → 310 product DE-URLs
- Each product page has Schema.org microdata: itemprop=price, sku, image
- Extracts: price (minPrice), SKU, image_url, name, specs
- Rate: 1 req/1.5s, no Playwright needed

FiberMall (fibermall.ts):
- Add imageUrl to Product interface
- Extract first fibermall.com/photo/*.jpg from product listing card
- Write image_url to transceivers table (has_image=true) on upsert
- SKU variants share parent product image
- 304 FiberMall transceivers will get images on next scraper run
2026-04-18 22:20:57 +02:00
Rene Fichtmueller
446ac667b0 feat: side-by-side competitor comparison + fix 1.6T speed_gbps
- Fix OSFP-DR8-1.6T-FL and OSFP-2FR4-1.6T-FL: speed_gbps was 200, now 1600
  → FS.com 1.6T products now correctly match as comparables for Flexoptix O.1316T.C.05.M
- API: extend comparable price query to return comp_form_factor, comp_speed_gbps,
  comp_reach_meters, comp_reach_label, comp_fiber_type, comp_wavelengths
- Dashboard: replace plain comparable price row with side-by-side spec comparison card
  showing Flexoptix vs. competitor: Form Factor, Speed, Reach, Fiber, Wavelengths
  with color coding (green=match, orange=mismatch) and savings badge (−45% günstiger)
2026-04-18 21:51:41 +02:00
Rene Fichtmueller
62d97a783c feat: add claude-code LLM provider + update dashboard to fo-blog-v5
- client.ts: add claude-code provider routing BLOG_LLM_PROVIDER=claude-code
  to claude-bridge (flat-rate, no API billing via Claude Code subscription)
- checkHealth() now pings /health on claude-bridge for real availability check
- Default OLLAMA_LLM_MODEL changed from qwen2.5:14b to fo-blog-v5
- Dashboard: add claude-code card (EMPFOHLEN), rename fo-blog-v3 → fo-blog-v5
- loadBlogLLMStatus() handles all 3 providers: claude-code/anthropic/ollama
- Grid expanded from 3 to 4 columns to accommodate new card
- ecosystem.config.js + .env on Erik: OLLAMA_LLM_MODEL=fo-blog-v5 confirmed
2026-04-18 20:45:14 +02:00
Rene Fichtmueller
74b83de6e9 docs: changelog — tunnel DNS fix, image backfill, OSFP coverage 2026-04-18 13:23:57 +02:00
Rene Fichtmueller
1da4abc488 fix: FS.com price extraction — DOM-based prices + shipping-context exclusion
- All 247 FS.com prices were €79 (shipping threshold, not product prices)
- Root cause: 'Gratis Versand ab 79 € (ohne MwSt.)' banner matched first
- Fix 1: DOM price extraction in page.evaluate with bad-parent skip list
- Fix 2: bodyText qualified patterns skip matches near shipping keywords
- Fix 3: waitForSelector for price DOM element before evaluate
- Fix 4: Deleted 247 invalid €79 observations from DB

Also included from previous session:
- db.ts: set has_image=true on image writes (fix 632 desync rows)
- spec-updater.ts: DR/FR/LR/ER/ZR → SMF, SR → MMF fiber type inference
2026-04-18 13:10:35 +02:00
Rene Fichtmueller
f8a1d27e79 fix: add missing auth header to blog generate fetches
Both generateBlog() and generateBlogManual() were calling
POST /api/blog/generate without an Authorization: Bearer header.
The requireAuth middleware correctly returned 401, which appeared
as 'Unauthorized — please log in' toast in the dashboard.

Fix: read loadToken() before each fetch and include the token in
the Authorization header. Also add r.status===401 guard to redirect
to login page when token expires, instead of showing error toast.
2026-04-18 08:03:39 +02:00
Rene Fichtmueller
6a33a17bca chore: changelog — Crawlee queue wipe, ATGBICS fix, Optcore skip 2026-04-18 05:42:37 +02:00
Rene Fichtmueller
48adcd3fc9 fix: skip Optcore on Erik — Cloudflare blocks datacenter IP
optcore.net blocks Erik's IP (82.165.222.127) via Cloudflare WAF.
WP REST API returns HTML block page instead of JSON → 0 product URLs
→ 0 scraped pages every run. Add SKIP_OPTCORE_SCRAPER guard matching
the existing SKIP_FS_SCRAPER pattern. Set in ecosystem.config.js on
Erik. Residential IP (Mac launchd) would be needed to use this scraper.
2026-04-18 05:41:56 +02:00
Rene Fichtmueller
e11e351f5e fix: crawlee-config clear request queue on each run
Crawlee's FileSystemStorage marks request URLs as HANDLED (state=4,
orderNo=null) after processing. With purgeOnStart=false these entries
persist, so on the next run crawler.run(startUrls) deduplicates them
→ requestsTotal=0 → immediate finish with 0 scraped pages.

Fix: rmSync request_queues/default/ before each makeCrawleeConfig()
call. Safe: session pool state lives in key_value_stores/, not in
request_queues/. Affects all Crawlee-based scrapers (ATGBICS, Optcore,
Switch-assets, etc.).
2026-04-18 05:37:45 +02:00
Rene Fichtmueller
1378a9bee8 chore: changelog — 10Gtek scraper fix (sfpcables.com, 49 prices) 2026-04-18 05:32:33 +02:00
Rene Fichtmueller
fcdd258369 fix: 10Gtek scraper now fetches prices from sfpcables.com
10gtek.com main site only exposes technical spec tables with no prices.
sfpcables.com is 10Gtek's own retail store and has both Model numbers
and USD prices in standard Magento product listings.

Changes:
- Switch scraping target from www.10gtek.com to sfpcables.com
- Parse Model: <part> + US.XX per product block (Magento structure)
- XFP fallback: extract part number from title after '|' separator
- Add fetchAllPages() with Magento loop-detection via seen-part dedup
- Remove QSFP-DD category (not available on sfpcables.com)
- Drop XFP-less categories from old 10gtek.com spec-table parser

Verified: 10/10 SFP prices, 10/10 SFP+ prices, 4/4 XFP prices on live site.
2026-04-18 05:27:49 +02:00
Rene Fichtmueller
2a6ec90ecd fix: fs-com Phase 1+2 crawler.run() ENOENT guard — Crawlee catches and re-throws the post-run _isTaskReadyFunction ENOENT internally, which rejected crawler.run() and aborted Phase 2 before it could start. Wrap both crawler.run() calls in try/catch to swallow ENOENT from request_queues paths; all processing is already complete at this point. 2026-04-18 03:52:49 +02:00
Rene Fichtmueller
b3eff15fc4 chore: changelog — daemon stability, ATGBICS Playwright, health monitor accuracy 2026-04-18 03:25:02 +02:00
Rene Fichtmueller
93d825dc04 fix: daemon stability + health monitor accuracy
- Add global unhandledRejection handler in scheduler daemon to swallow
  Crawlee's benign post-run ENOENT lock-file races (prevents process.exit(1))
- Add SKIP_FS_SCRAPER env var: skip FS.com worker on Erik where Cloudflare
  WAF blocks datacenter IPs (Mac launchd handles FS.com from residential IP)
- Remove FS.COM from health monitor EXPECTED_VENDORS (skipped on Erik)
- Health monitor: extend pg-boss lookup from 12h → 26h, add completed-job
  map; if job ran OK in last 26h + vendor has historical prices → mark
  STABLE instead of CRITICAL (fixes ATGBICS/Fluxlight hash-dedup false positives)
- Install Playwright Chromium on Erik (fixes ATGBICS BrowserLaunchError)
- Create missing Crawlee storage dirs on Erik (storage-fs-phase1/2,
  storage-ebay-transceivers) to prevent ENOENT on first Crawlee run
2026-04-18 03:16:59 +02:00
Rene Fichtmueller
8391b194a5 fix: GBICS scraper — fall back to aria-label-first pattern when href-first finds no priced products
Pattern 1 (href→aria-label) finds 127 navigation links on GBICS BigCommerce
pages — none contain GBP prices. Pattern 2 (aria-label→href) correctly
finds 16-30 product links per category page with £XX.XX prices in aria-labels.
The fallback from P1 to P2 now triggers when P1 finds results but none
contain '£', rather than only when P1 finds 0 total results.
2026-04-18 03:02:39 +02:00
Rene Fichtmueller
0e4e6ff6b2 chore: changelog — FS.com ENOENT fix, PID lock, health monitor tiered alerts 2026-04-18 02:55:18 +02:00
Rene Fichtmueller
24ff9822ac fix: improve scraper health monitor — tiered alerts, suppress stable-price false positives
Previous logic fired an alert whenever prices_6h=0, even when prices
were genuinely stable (content hash dedup prevents duplicate inserts).
This caused Flexoptix, ATGBICS and others to trigger alerts every 3h
despite their scrapers running successfully.

New logic:
  🔴 CRITICAL: last price > 7 days (genuine failure)
  🟡 WARNING:  last price 48h–7 days (possibly stale)
   STABLE:   last price ≤48h, 0 new (prices unchanged, scraper OK)

Also shows pg-boss job state/time alongside each vendor for faster
root-cause diagnosis. Trimmed EXPECTED_VENDORS to vendors with actual
scraper implementations (removed never-scraped placeholders).
2026-04-18 02:54:28 +02:00
Rene Fichtmueller
e552e08015 fix: suppress Crawlee post-run ENOENT unhandledRejection in fs-com.ts
After PlaywrightCrawler.run() resolves, Crawlee's internal task loop
schedules one final _isTaskReadyFunction call that tries to read a
request queue .json file already cleaned up during processing. This
ENOENT fires as an unhandledRejection and calls process.exit(1),
aborting Phase 2 before prices are written to the database.

Added a targeted unhandledRejection handler in the require.main block
that swallows ENOENT errors from request_queues paths (benign Crawlee
cleanup race) while re-raising all other rejections.
2026-04-18 02:51:00 +02:00
Rene Fichtmueller
6be2c131d3 fix: add PID lock to run-fs-scraper-mac.sh — prevent simultaneous instances
Adds /tmp/tip-fs-scraper.lock PID file to prevent launchd from running
a second instance while the previous one is still active (e.g. 2am
schedule fires, runs past 10am when launchd fires again). Without this,
concurrent instances caused rmSync(storage-fs-phase1) in one instance
to delete the Crawlee storage dir while the other was still using it,
resulting in ENOENT crashes.
2026-04-18 02:43:28 +02:00
Rene Fichtmueller
681fd5ced6 chore: gitignore all storage-* Crawlee dirs + local credentials 2026-04-18 02:40:34 +02:00
Rene Fichtmueller
c5e7e7d7f6 fix: remove POSTGRES_PASSWORD export from run-fs-scraper-mac.sh — sourced from ~/.tip/.env only 2026-04-18 02:37:42 +02:00
Rene Fichtmueller
fc3224be76 fix: remove hardcoded POSTGRES_PASSWORD from run-fs-scraper-mac.sh — use ~/.tip/.env 2026-04-18 02:37:05 +02:00
Rene Fichtmueller
696127366c chore: changelog — Playwright headless shell fix, withIsolatedStorage race fix, FS.com launchd fix 2026-04-18 02:35:55 +02:00
Rene Fichtmueller
419af4a24e fix: remove all withIsolatedStorage wrappers, add makeCrawleeConfig to remaining Crawlee scrapers
- scheduler.ts: remove withIsolatedStorage from ALL scrapers (atgbics,
  optcore, ufispace, edgecore, ebay-*, market-intel, community-issues,
  cisco, juniper, sonic, 10gtek, prolabs, switch-assets, fs)
  eliminates global CRAWLEE_STORAGE_DIR race condition entirely
- fs-com.ts: replace purgeDefaultStorages() with rmSync on isolated
  storage dirs (fs-phase1, fs-phase2); pass makeCrawleeConfig to both
  PlaywrightCrawler instances
- switch-assets-crawler.ts: add makeCrawleeConfig('switch-assets')
- switch-assets-playwright.ts: add makeCrawleeConfig('switch-assets-playwright')
- naddod.ts: restore clean error logging (remove debug instrumentation)
2026-04-18 02:19:53 +02:00