238 Commits

Author SHA1 Message Date
Rene Fichtmueller
2742141c8b feat: Playwright image scraper for bot-blocked vendors (Arista/Dell/Edgecore/Fortinet/Extreme) 2026-04-21 06:16:05 +02:00
Rene Fichtmueller
892da2bcf5 fix: Cisco line card URL mapping (8800/84/86 → 8000 family page, skip ASR9K logo-only) 2026-04-21 00:49:32 +02:00
Rene Fichtmueller
e20bb6cb45 fix: MikroTik hardcoded slug map for + models (crs305/312/317/326) 2026-04-21 00:45:41 +02:00
Rene Fichtmueller
c4585caada fix: Cisco 8000 builder URL + MikroTik lowercase + new vendor builders
URL builder fixes:
- Cisco 8000: update to new /site/us/en/ URL scheme (family page, not per-model)
- MikroTik: fix to lowercase+underscore format (was uppercase, caused 404)
- Fortinet: set to null — JS-rendered pages, all redirect to generic page
- Alcatel-Lucent Enterprise slug added to dispatcher (was missing, caused 0 hits)
- Add Quanta, Allied Telesis, Ufispace, Netgear URL builders
- NVIDIA: skip ConnectX/BlueField non-switch models

Migration 044:
- Clear 35 wrong NCS-5500 URLs from Cisco 8000-series models
- Pre-set correct 8000-series family URL for 21 models without images
2026-04-21 00:41:31 +02:00
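The "lowercase+underscore" MikroTik fix above, combined with the later hardcoded map for `+` models, could look roughly like the sketch below. The function name `mikrotikSlug` and the `SLUG_OVERRIDES` table are hypothetical, and the table is left empty because the log does not list the real mappings.

```typescript
// Hypothetical override table for models whose product slug does not
// follow the generic rule. Left empty: the real entries are not in the log.
const SLUG_OVERRIDES: Record<string, string> = {};

// Lowercase the model name and collapse every run of non-alphanumeric
// characters (including "+" and "-") into a single underscore.
function mikrotikSlug(model: string): string {
  const override = SLUG_OVERRIDES[model];
  if (override !== undefined) return override;
  return model
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

const slugExample = mikrotikSlug("CRS305-1G-4S+IN"); // "crs305_1g_4s_in"
```

Checking the override table first keeps the generic rule simple while still allowing per-model exceptions.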
Rene Fichtmueller
2b72e1089f fix: monitor-erik.sh — correct Erik SSH target + fix awk header skip 2026-04-21 00:34:28 +02:00
Rene Fichtmueller
ea6ef606d3 feat: more switch image coverage + system health metrics + Erik monitor
switch-image-fetcher:
- Add Fortinet URL builder (11 FortiSwitch models)
- Add Quanta Cloud Technology, Allied Telesis, Ufispace, Netgear URL builders
- Fix alcatel-lucent-enterprise slug missing from URL_BUILDERS dispatcher
- Fix NVIDIA builder to skip ConnectX/BlueField adapters (not switches)
- Add aruba slug alias for hpe-aruba

health endpoint:
- Add system metrics: CPU load (1/5/15m), memory usage, disk usage
- Add load_status indicator (ok/busy/overloaded)
- Expose process RSS memory
- Used by external monitors

scripts/monitor-erik.sh:
- Cron-ready health check script for Claudi (.82) and Raspberry Pis
- Checks TIP API health endpoint (load, memory, disk, DB latency)
- Checks PM2 process state via SSH (errored/stopped detection)
- ntfy.sh push notifications (set NTFY_TOPIC env var)
- Includes systemd service + timer unit comments for auto-install
2026-04-21 00:31:43 +02:00
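The `load_status` indicator (ok/busy/overloaded) from the health endpoint change above might be derived from the 1-minute load average as sketched here. The per-core thresholds 0.7 and 1.0 and the name `loadStatus` are assumptions; the log does not state the real cut-offs.

```typescript
type LoadStatus = "ok" | "busy" | "overloaded";

// Thresholds are assumptions: load per core below 0.7 is "ok",
// below 1.0 is "busy", anything at or above 1.0 is "overloaded".
function loadStatus(load1m: number, cpuCount: number): LoadStatus {
  const perCore = load1m / Math.max(cpuCount, 1);
  if (perCore < 0.7) return "ok";
  if (perCore < 1.0) return "busy";
  return "overloaded";
}

const idleBox = loadStatus(1.0, 4); // "ok" (0.25 per core)
const hotBox = loadStatus(8.0, 4);  // "overloaded" (2.0 per core)
```

Normalising by core count lets the same thresholds apply to both a multi-core server and a Raspberry Pi.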
Rene Fichtmueller
5d09b954f5 perf: load-aware scraper guard + higher rate limits + /tmp crawlee storage 2026-04-20 23:35:02 +02:00
Rene Fichtmueller
1d50fd1c8f feat: Flexoptix order section per switch + reject generic/logo images 2026-04-20 23:31:36 +02:00
Rene Fichtmueller
8de025cbcd fix: expand compatibility.verification_method CHECK to include vendor_compat + spec_match 2026-04-20 23:23:23 +02:00
Rene Fichtmueller
51af82bd14 fix(migration-041): use 'manual' doc_type instead of config_guide (CHECK constraint) 2026-04-20 23:17:41 +02:00
Rene Fichtmueller
983cd1bc0d fix: add certifications column to switches table (migration 039a) 2026-04-20 23:16:52 +02:00
Rene Fichtmueller
450e26a758 ui: update Finder quick examples to use actual seeded switch models
Replace placeholder models (93180YC-FX3, 7280R3A, QFX5120 — not in DB) with
the real seeded switches: N9K-C9364C, 93600CD-GX, 7060CX2-32S, QFX5130-32CD, SN5600
2026-04-20 23:00:36 +02:00
Rene Fichtmueller
b7e146c026 fix(community-issues): scrapeTransceiverCompatIssues falls back to ports_config when no compat entries 2026-04-20 23:00:00 +02:00
Rene Fichtmueller
02c8554f56 data: migration 041 — seed switch datasheets (Cisco, Arista, Juniper, NVIDIA)
Seeds known-good official datasheet PDFs and config guide URLs for the 24 initial
seeded switches. Directly visible in switch detail panel → Datasheets & Manuals.
2026-04-20 22:58:18 +02:00
Rene Fichtmueller
ddc49e4e2b feat: switch facts — migration 040 seeds power/weight/certifications + dashboard shows them
- Migration 040: seeds rack_units, typical_power_w, max_power_w, weight_kg, certifications
  for 23 initial switches (Cisco Nexus/Catalyst, Arista, Juniper, NVIDIA, Edgecore/Celestica/Asterfusion)
- Dashboard: Specifications section now shows Typical Power, Weight, Certifications (colored pills)
2026-04-20 22:56:53 +02:00
Rene Fichtmueller
fd36bd21c5 chore: update changelog for 2026-04-20 switch image + compat features 2026-04-20 22:53:24 +02:00
Rene Fichtmueller
15b5eba644 feat: compatibility panel — verification_method, competitor prices, spec-match collapsible
- API: getCompatibleTransceivers() returns verification_method, orders vendor_compat first
- Dashboard: Flexoptix section splits vendor-tested vs spec-match (collapsed)
- Dashboard: Competitor section shows vendor-tested with prices, spec-match as chips
2026-04-20 22:52:49 +02:00
Rene Fichtmueller
1ea73112c6 feat: Flexoptix compatibility scraper + transceiver issue scanner
- Add flexoptix-compat.ts: maps switch models to compatible Flexoptix transceivers
  via search API (vendor_compat) with form-factor fallback (spec_match)
  Scheduled daily at 09:00 UTC as scrape:compat:flexoptix
- Enhance community-issues.ts: add vendor advisory sources (Cisco Field Notices,
  Juniper KB, SONiC GitHub Issues) + new scrapeTransceiverCompatIssues() that
  searches for switch+transceiver combination problems specifically
- Scheduler: 59 schedules, 78 workers
2026-04-20 22:50:57 +02:00
Rene Fichtmueller
6cf1b188d8 feat: switch image fetcher + og:image scheduler job + dashboard thumbnail column
- Add switch-image-fetcher.ts: og:image-based image discovery for all 86 seeded switches
  (covers Cisco, Arista, Juniper, NVIDIA, Edgecore, Celestica, Asterfusion, Dell,
   HPE/Aruba, Huawei, Nokia, Extreme, MikroTik, Ubiquiti, FS.COM, Supermicro)
- Wire fetchSwitchImages() into scheduler as scrape:images:switches (daily 08:30 UTC)
- Dashboard: add 48px thumbnail column to switch table (lazy img with gear icon fallback)
2026-04-20 22:44:08 +02:00
Rene Fichtmueller
043fee46fc fix(vcelink): resolve TS 5.9 narrowing quirk with explicit cast in dead code
price?: number narrowing via typeof/!== undefined does not work for
arithmetic comparisons in TypeScript 5.9 dead code paths; use 'as number'
cast to keep the dead code compilable while the early-return guard above
prevents runtime execution entirely.
2026-04-20 22:18:13 +02:00
Rene Fichtmueller
2021651de2 fix(scrapers): fix ATGBics theme migration, NADDOD URL, disable VCELink
- ATGBics: update HTML parser from old card--product theme to new
  card__info theme (Shopify template changed April 2026); name now
  extracted from href link text instead of aria-label
- NADDOD: correct ensureVendor shop URL from /collections/transceivers
  (404) to /collection/optical-transceivers
- VCELink: disable scraper — site pivoted from optical transceivers to
  audio/video/cable products; all collection URLs return 404
2026-04-20 22:11:24 +02:00
Rene Fichtmueller
5ee9904b04 ui: comprehensive DEMO/MODELL tagging across all dashboard sections with synthetic data
- Stock tab nav: ⚠DEMO badge
- Stock section subtitle: clarify prices=real vs. Lager/Verkauf (stock/sales)=DEMO
- Stat cards: DE-Lager (DE stock), Global-Lager (global stock), Nachlieferung (follow-up delivery) labels tagged [DEMO]
- Recently Restocked header: DEMO DATA badge
- Stock detail lookup: [demo] inline on all warehouse/units_sold fields
- Top Sellers: already tagged (previous commit)
- Procurement > Reorder Signals: DEMO DATA banner (based on synthetic ABC data)
- Procurement > ABC Classification: DEMO DATA banner
- Hype Cycle: MODELL badge on header (Norton-Bass = mathematical estimate)
- Hype Cycle table: Adoption/Peak/To Plateau columns tagged [M] = Modell
- Hype Cycle legend: explains [M] vs real data
- Market Intelligence + Lifecycle Events: no tag (real scraped data)
2026-04-20 21:52:10 +02:00
Rene Fichtmueller
577407ced6 ui: mark Top Sellers widget data as DEMO (synthetic seed data, not real sales) 2026-04-20 21:44:33 +02:00
Rene Fichtmueller
eeb96cb2ab fix: price-comparison SKU lookup — wrong column refs (so.stock_level, search_url_template) 2026-04-19 00:12:18 +02:00
Rene Fichtmueller
5405685d24 feat: register fiber24 + fibermall in index, move atgbics to fetch-only section 2026-04-18 22:50:52 +02:00
Rene Fichtmueller
017ed78d2b feat: rewrite ATGBICS scraper — static HTML, correct collection handles, GBP cookie
- Replaces Playwright with pure fetch() — static HTML has prices
- Correct collection handles (compatible-transceivers-sfpp-10g etc.)
- Cookie: cart_currency=GBP forces GBP pricing from any geo-IP
- Handles 35+ pages per category × 24 products = 840+ SFP+ products
- No IP-blocking with static HTML (Playwright was the trigger)
- Adds scripts/run-atgbics-mac.sh for Mac-side runner if needed
2026-04-18 22:48:29 +02:00
Rene Fichtmueller
29b9724bb4 fix: fiber24 stockLevel on_request (was unknown — violated DB constraint) 2026-04-18 22:26:45 +02:00
Rene Fichtmueller
d2993ba698 fix: ShopFiber24 sitemap-based scraping + Fibermall image extraction
ShopFiber24 (fiber24.ts):
- Complete rewrite: was using JS-rendered catalog (all prices = 0)
- New strategy: fetch sitemap_0.xml.gz → 310 product DE-URLs
- Each product page has Schema.org microdata: itemprop=price, sku, image
- Extracts: price (minPrice), SKU, image_url, name, specs
- Rate: 1 req/1.5s, no Playwright needed

FiberMall (fibermall.ts):
- Add imageUrl to Product interface
- Extract first fibermall.com/photo/*.jpg from product listing card
- Write image_url to transceivers table (has_image=true) on upsert
- SKU variants share parent product image
- 304 FiberMall transceivers will get images on next scraper run
2026-04-18 22:20:57 +02:00
Rene Fichtmueller
7718356327 feat: side-by-side competitor comparison + fix 1.6T speed_gbps
- Fix OSFP-DR8-1.6T-FL and OSFP-2FR4-1.6T-FL: speed_gbps was 200, now 1600
  → FS.com 1.6T products now correctly match as comparables for Flexoptix O.1316T.C.05.M
- API: extend comparable price query to return comp_form_factor, comp_speed_gbps,
  comp_reach_meters, comp_reach_label, comp_fiber_type, comp_wavelengths
- Dashboard: replace plain comparable price row with side-by-side spec comparison card
  showing Flexoptix vs. competitor: Form Factor, Speed, Reach, Fiber, Wavelengths
  with color coding (green=match, orange=mismatch) and savings badge (−45% günstiger, i.e. "cheaper")
2026-04-18 21:51:41 +02:00
Rene Fichtmueller
2ebba07bb0 feat: add claude-code LLM provider + update dashboard to fo-blog-v5
- client.ts: add claude-code provider routing BLOG_LLM_PROVIDER=claude-code
  to claude-bridge (flat-rate, no API billing via Claude Code subscription)
- checkHealth() now pings /health on claude-bridge for real availability check
- Default OLLAMA_LLM_MODEL changed from qwen2.5:14b to fo-blog-v5
- Dashboard: add claude-code card (EMPFOHLEN, i.e. "recommended"), rename fo-blog-v3 → fo-blog-v5
- loadBlogLLMStatus() handles all 3 providers: claude-code/anthropic/ollama
- Grid expanded from 3 to 4 columns to accommodate new card
- ecosystem.config.js + .env on Erik: OLLAMA_LLM_MODEL=fo-blog-v5 confirmed
2026-04-18 20:45:14 +02:00
Rene Fichtmueller
b9bdcd6fc6 docs: changelog — tunnel DNS fix, image backfill, OSFP coverage 2026-04-18 13:23:57 +02:00
Rene Fichtmueller
ff0cee2e80 fix: FS.com price extraction — DOM-based prices + shipping-context exclusion
- All 247 FS.com prices were €79 (shipping threshold, not product prices)
- Root cause: 'Gratis Versand ab 79 € (ohne MwSt.)' banner matched first
- Fix 1: DOM price extraction in page.evaluate with bad-parent skip list
- Fix 2: bodyText qualified patterns skip matches near shipping keywords
- Fix 3: waitForSelector for price DOM element before evaluate
- Fix 4: Deleted 247 invalid €79 observations from DB
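
The shipping-context exclusion in Fix 2 can be illustrated with the sketch below, which skips any euro amount that appears near a shipping keyword. The keyword list, the 30-character window, and the name `extractPrice` are assumptions for illustration, not the actual fs-com code.

```typescript
// Keywords that mark a shipping banner rather than a product price
// (illustrative list; the real skip list is not shown in the log).
const SHIPPING_KEYWORDS = ["versand", "shipping"];

// Return the first euro amount whose surrounding text does not mention
// shipping, or null if every match looks like a banner.
function extractPrice(bodyText: string, contextChars = 30): number | null {
  const pattern = /(\d+(?:[.,]\d{2})?)\s*€/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(bodyText)) !== null) {
    const start = Math.max(0, match.index - contextChars);
    const end = match.index + match[0].length + contextChars;
    const context = bodyText.slice(start, end).toLowerCase();
    const nearShipping = SHIPPING_KEYWORDS.some((k) => context.indexOf(k) !== -1);
    if (nearShipping) continue; // e.g. the "Gratis Versand ab 79 €" banner
    return parseFloat(match[1].replace(",", "."));
  }
  return null;
}

const pageText =
  "Gratis Versand ab 79 € (ohne MwSt.) | QSFP28-LR4-100G optical transceiver module | 249,00 €";
const parsedPrice = extractPrice(pageText); // 249
```

A text-window check like this is only a fallback; the DOM-based extraction in Fix 1 is the more reliable of the two.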

Also included from previous session:
- db.ts: set has_image=true on image writes (fix 632 desync rows)
- spec-updater.ts: DR/FR/LR/ER/ZR → SMF, SR → MMF fiber type inference
2026-04-18 13:10:35 +02:00
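The fiber type inference rule from the spec-updater.ts line above (DR/FR/LR/ER/ZR → SMF, SR → MMF) can be sketched as a small token matcher. The function name `inferFiberType` and the way part names are tokenised are assumptions.

```typescript
type FiberType = "SMF" | "MMF";

// Infer the fiber type from a reach code embedded in a part name.
// Returns null when no known reach code is present.
function inferFiberType(partName: string): FiberType | null {
  // Split on anything that is not a letter or digit, e.g. "-" and ".".
  const tokens = partName.toUpperCase().split(/[^A-Z0-9]+/);
  for (const token of tokens) {
    // Allow a lane count before or after the code, as in "2FR4" or "SR4".
    const m = /^\d*(DR|FR|LR|ER|ZR|SR)\d*$/.exec(token);
    if (m === null) continue;
    return m[1] === "SR" ? "MMF" : "SMF";
  }
  return null;
}

const dr8 = inferFiberType("OSFP-DR8-1.6T-FL"); // "SMF"
const sr4 = inferFiberType("QSFP28-100G-SR4");  // "MMF"
```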
Rene Fichtmueller
0e91e8b11c fix: add missing auth header to blog generate fetches
Both generateBlog() and generateBlogManual() were calling
POST /api/blog/generate without an Authorization: Bearer header.
The requireAuth middleware correctly returned 401, which appeared
as 'Unauthorized — please log in' toast in the dashboard.

Fix: read loadToken() before each fetch and include the token in
the Authorization header. Also add r.status===401 guard to redirect
to login page when token expires, instead of showing error toast.
2026-04-18 08:03:39 +02:00
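The auth fix described above (read the token before each fetch, send it as a Bearer header, redirect on 401) can be sketched as follows. `FetchLike`, `authHeaders`, `postBlogGenerate`, and `onUnauthorized` are hypothetical names; only `loadToken()` and the `/api/blog/generate` route come from the commit message.

```typescript
// Minimal fetch-like signature so the sketch does not depend on DOM types.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ status: number }>;

// Build request headers, adding the Bearer token only when one is stored.
function authHeaders(token: string | null): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (token) headers["Authorization"] = `Bearer ${token}`;
  return headers;
}

// POST to the generate endpoint; call onUnauthorized (for example a
// redirect to the login page) when the server answers 401.
async function postBlogGenerate(
  fetchFn: FetchLike,
  loadToken: () => string | null,
  payload: unknown,
  onUnauthorized: () => void,
): Promise<number> {
  const res = await fetchFn("/api/blog/generate", {
    method: "POST",
    headers: authHeaders(loadToken()),
    body: JSON.stringify(payload),
  });
  if (res.status === 401) onUnauthorized();
  return res.status;
}
```

Reading the token inside the call, rather than once at page load, means a token refreshed after login is picked up by the next request.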
Rene Fichtmueller
e1e390231f chore: changelog — Crawlee queue wipe, ATGBICS fix, Optcore skip 2026-04-18 05:42:37 +02:00
Rene Fichtmueller
6b39bb0930 fix: skip Optcore on Erik — Cloudflare blocks datacenter IP
optcore.net blocks Erik's IP (82.165.222.127) via Cloudflare WAF.
WP REST API returns HTML block page instead of JSON → 0 product URLs
→ 0 scraped pages every run. Add SKIP_OPTCORE_SCRAPER guard matching
the existing SKIP_FS_SCRAPER pattern. Set in ecosystem.config.js on
Erik. Residential IP (Mac launchd) would be needed to use this scraper.
2026-04-18 05:41:56 +02:00
Rene Fichtmueller
1d79094872 fix: crawlee-config clear request queue on each run
Crawlee's FileSystemStorage marks request URLs as HANDLED (state=4,
orderNo=null) after processing. With purgeOnStart=false these entries
persist, so on the next run crawler.run(startUrls) deduplicates them
→ requestsTotal=0 → immediate finish with 0 scraped pages.

Fix: rmSync request_queues/default/ before each makeCrawleeConfig()
call. Safe: session pool state lives in key_value_stores/, not in
request_queues/. Affects all Crawlee-based scrapers (ATGBICS, Optcore,
Switch-assets, etc.).
2026-04-18 05:37:45 +02:00
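The deduplication problem described above can be reproduced with a toy model. This is not Crawlee's implementation; `PersistentQueue` is a hypothetical stand-in that only mimics "handled URLs persist between runs unless the queue directory is removed".

```typescript
// Toy stand-in for a persistent request queue: it remembers every URL it
// has handled, the way on-disk queue entries survive between runs.
class PersistentQueue {
  private handled: Record<string, boolean> = {};

  // Returns how many of the start URLs were actually processed.
  run(startUrls: string[]): number {
    let processed = 0;
    for (const url of startUrls) {
      if (this.handled[url]) continue; // already handled: deduplicated
      this.handled[url] = true;
      processed += 1;
    }
    return processed;
  }

  // Plays the role of removing request_queues/default/ before a run.
  clear(): void {
    this.handled = {};
  }
}

const queue = new PersistentQueue();
const startUrls = ["https://example.com/a", "https://example.com/b"];
const firstRun = queue.run(startUrls);  // 2
const secondRun = queue.run(startUrls); // 0, everything deduplicated
queue.clear();
const thirdRun = queue.run(startUrls);  // 2 again after clearing
```

The second run processing nothing corresponds to the `requestsTotal=0` symptom in the commit message.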
Rene Fichtmueller
19ff1a779b chore: changelog — 10Gtek scraper fix (sfpcables.com, 49 prices) 2026-04-18 05:32:33 +02:00
Rene Fichtmueller
eed599cc2c fix: 10Gtek scraper now fetches prices from sfpcables.com
10gtek.com main site only exposes technical spec tables with no prices.
sfpcables.com is 10Gtek's own retail store and has both Model numbers
and USD prices in standard Magento product listings.

Changes:
- Switch scraping target from www.10gtek.com to sfpcables.com
- Parse Model: <part> + US.XX per product block (Magento structure)
- XFP fallback: extract part number from title after '|' separator
- Add fetchAllPages() with Magento loop-detection via seen-part dedup
- Remove QSFP-DD category (not available on sfpcables.com)
- Drop XFP-less categories from old 10gtek.com spec-table parser

Verified: 10/10 SFP prices, 10/10 SFP+ prices, 4/4 XFP prices on live site.
2026-04-18 05:27:49 +02:00
Rene Fichtmueller
582965ecb5 fix: fs-com Phase 1+2 crawler.run() ENOENT guard
Crawlee catches and re-throws the post-run _isTaskReadyFunction ENOENT
internally, which rejected crawler.run() and aborted Phase 2 before it
could start. Wrap both crawler.run() calls in try/catch to swallow ENOENT
from request_queues paths; all processing is already complete at this point.
2026-04-18 03:52:49 +02:00
Rene Fichtmueller
2304a65227 chore: changelog — daemon stability, ATGBICS Playwright, health monitor accuracy 2026-04-18 03:25:02 +02:00
Rene Fichtmueller
71936784fc fix: daemon stability + health monitor accuracy
- Add global unhandledRejection handler in scheduler daemon to swallow
  Crawlee's benign post-run ENOENT lock-file races (prevents process.exit(1))
- Add SKIP_FS_SCRAPER env var: skip FS.com worker on Erik where Cloudflare
  WAF blocks datacenter IPs (Mac launchd handles FS.com from residential IP)
- Remove FS.COM from health monitor EXPECTED_VENDORS (skipped on Erik)
- Health monitor: extend pg-boss lookup from 12h → 26h, add completed-job
  map; if job ran OK in last 26h + vendor has historical prices → mark
  STABLE instead of CRITICAL (fixes ATGBICS/Fluxlight hash-dedup false positives)
- Install Playwright Chromium on Erik (fixes ATGBICS BrowserLaunchError)
- Create missing Crawlee storage dirs on Erik (storage-fs-phase1/2,
  storage-ebay-transceivers) to prevent ENOENT on first Crawlee run
2026-04-18 03:16:59 +02:00
Rene Fichtmueller
4797fccd7f fix: GBICS scraper — fall back to aria-label-first pattern when href-first finds no priced products
Pattern 1 (href→aria-label) finds 127 navigation links on GBICS BigCommerce
pages — none contain GBP prices. Pattern 2 (aria-label→href) correctly
finds 16-30 product links per category page with £XX.XX prices in aria-labels.
The fallback from P1 to P2 now triggers when P1 finds results but none
contain '£', rather than only when P1 finds 0 total results.
2026-04-18 03:02:39 +02:00
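The changed fallback condition above (fall back when Pattern 1 returns links but none contains "£", not only when it returns nothing) can be sketched generically. `ProductLink` and `extractWithFallback` are hypothetical names, and the two extractors are passed in as parameters rather than reproduced.

```typescript
interface ProductLink {
  href: string;
  label: string;
}

// Run the primary extractor; if it returns nothing, or returns links but
// none carries a "£" price, use the secondary extractor instead.
function extractWithFallback(
  html: string,
  primary: (html: string) => ProductLink[],
  secondary: (html: string) => ProductLink[],
): ProductLink[] {
  const candidates = primary(html);
  const hasPrice = candidates.some((p) => p.label.indexOf("£") !== -1);
  return hasPrice ? candidates : secondary(html);
}
```

Because `some` on an empty array is false, the old "zero results" case is covered by the same check.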
Rene Fichtmueller
f191ece0e4 chore: changelog — FS.com ENOENT fix, PID lock, health monitor tiered alerts 2026-04-18 02:55:18 +02:00
Rene Fichtmueller
84eb6e3149 fix: improve scraper health monitor — tiered alerts, suppress stable-price false positives
Previous logic fired an alert whenever prices_6h=0, even when prices
were genuinely stable (content hash dedup prevents duplicate inserts).
This caused Flexoptix, ATGBICS and others to trigger alerts every 3h
despite their scrapers running successfully.

New logic:
  🔴 CRITICAL: last price > 7 days (genuine failure)
  🟡 WARNING:  last price 48h–7 days (possibly stale)
   STABLE:   last price ≤48h, 0 new (prices unchanged, scraper OK)

Also shows pg-boss job state/time alongside each vendor for faster
root-cause diagnosis. Trimmed EXPECTED_VENDORS to vendors with actual
scraper implementations (removed never-scraped placeholders).
2026-04-18 02:54:28 +02:00
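The tiered alert logic above maps onto a small classification function. The 7-day and 48-hour boundaries come from the commit message; the name `classifyVendor` and the extra "OK" state for vendors with fresh prices are assumptions.

```typescript
type VendorHealth = "CRITICAL" | "WARNING" | "STABLE" | "OK";

// hoursSinceLastPrice: age of the newest stored price for the vendor.
// newPrices6h: number of price rows inserted in the last six hours.
function classifyVendor(hoursSinceLastPrice: number, newPrices6h: number): VendorHealth {
  if (hoursSinceLastPrice > 7 * 24) return "CRITICAL"; // genuine failure
  if (hoursSinceLastPrice > 48) return "WARNING";      // possibly stale
  // Within 48h: zero new rows just means prices were unchanged.
  return newPrices6h === 0 ? "STABLE" : "OK";          // "OK" is an assumed label
}

const unchangedVendor = classifyVendor(20, 0); // "STABLE"
const deadVendor = classifyVendor(200, 0);     // "CRITICAL"
```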
Rene Fichtmueller
4d94aa20ba fix: suppress Crawlee post-run ENOENT unhandledRejection in fs-com.ts
After PlaywrightCrawler.run() resolves, Crawlee's internal task loop
schedules one final _isTaskReadyFunction call that tries to read a
request queue .json file already cleaned up during processing. This
ENOENT fires as an unhandledRejection and calls process.exit(1),
aborting Phase 2 before prices are written to the database.

Added a targeted unhandledRejection handler in the require.main block
that swallows ENOENT errors from request_queues paths (benign Crawlee
cleanup race) while re-raising all other rejections.
2026-04-18 02:51:00 +02:00
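The targeted handler described above needs a predicate that tells the benign cleanup race apart from real failures. A minimal sketch, assuming the rejection reason is a Node-style error with `code` and `path` fields; `isBenignQueueEnoent` is a hypothetical name.

```typescript
// True only for ENOENT errors that point into a request_queues path.
function isBenignQueueEnoent(reason: unknown): boolean {
  if (typeof reason !== "object" || reason === null) return false;
  const err = reason as { code?: unknown; path?: unknown; message?: unknown };
  if (err.code !== "ENOENT") return false;
  const where = `${String(err.path ?? "")} ${String(err.message ?? "")}`;
  return where.indexOf("request_queues") !== -1;
}

// Wiring in Node (kept as a comment so the sketch stays dependency-free):
// process.on("unhandledRejection", (reason) => {
//   if (isBenignQueueEnoent(reason)) return; // swallow the cleanup race
//   throw reason;                            // re-raise everything else
// });
```

Matching on both the error code and the path keeps the handler narrow, so an ENOENT from an unrelated file still surfaces.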
Rene Fichtmueller
1c5805ab96 fix: add PID lock to run-fs-scraper-mac.sh — prevent simultaneous instances
Adds /tmp/tip-fs-scraper.lock PID file to prevent launchd from running
a second instance while the previous one is still active (e.g. 2am
schedule fires, runs past 10am when launchd fires again). Without this,
concurrent instances caused rmSync(storage-fs-phase1) in one instance
to delete the Crawlee storage dir while the other was still using it,
resulting in ENOENT crashes.
2026-04-18 02:43:28 +02:00
Rene Fichtmueller
306f329d5a chore: gitignore all storage-* Crawlee dirs + local credentials 2026-04-18 02:40:34 +02:00
Rene Fichtmueller
53836a14a8 fix: remove POSTGRES_PASSWORD export from run-fs-scraper-mac.sh — sourced from ~/.tip/.env only 2026-04-18 02:37:42 +02:00
Rene Fichtmueller
7675a939a1 fix: remove hardcoded POSTGRES_PASSWORD from run-fs-scraper-mac.sh — use ~/.tip/.env 2026-04-18 02:37:05 +02:00
Rene Fichtmueller
a1db0a7a90 chore: changelog — Playwright headless shell fix, withIsolatedStorage race fix, FS.com launchd fix 2026-04-18 02:35:55 +02:00