225 Commits

Author SHA1 Message Date
Rene Fichtmueller
aa91798e8d fix(vcelink): resolve TS 5.9 narrowing quirk with explicit cast in dead code
price?: number narrowing via typeof/!== undefined does not work for
arithmetic comparisons in TypeScript 5.9 dead code paths; use 'as number'
cast to keep the dead code compilable while the early-return guard above
prevents runtime execution entirely.
2026-04-20 22:18:13 +02:00
Rene Fichtmueller
1aba912a15 fix(scrapers): fix ATGBICS theme migration, NADDOD URL, disable VCELink
- ATGBICS: update HTML parser from old card--product theme to new
  card__info theme (Shopify template changed April 2026); name now
  extracted from href link text instead of aria-label
- NADDOD: correct ensureVendor shop URL from /collections/transceivers
  (404) to /collection/optical-transceivers
- VCELink: disable scraper — site pivoted from optical transceivers to
  audio/video/cable products; all collection URLs return 404
2026-04-20 22:11:24 +02:00
Rene Fichtmueller
ca943f1f86 ui: comprehensive DEMO/MODELL tagging across all dashboard sections with synthetic data
- Stock tab nav: ⚠DEMO badge
- Stock section subtitle: clarify prices=real vs. Lager/Verkauf (stock/sales)=DEMO
- Stat cards: DE-Lager, Global-Lager, Nachlieferung (DE stock, global stock, backorder) labels tagged [DEMO]
- Recently Restocked header: DEMO DATA badge
- Stock detail lookup: [demo] inline on all warehouse/units_sold fields
- Top Sellers: already tagged (previous commit)
- Procurement > Reorder Signals: DEMO DATA banner (based on synthetic ABC data)
- Procurement > ABC Classification: DEMO DATA banner
- Hype Cycle: MODELL badge on header (Norton-Bass = mathematical estimate)
- Hype Cycle table: Adoption/Peak/To Plateau columns tagged [M] = Modell
- Hype Cycle legend: explains [M] vs real data
- Market Intelligence + Lifecycle Events: no tag (real scraped data)
2026-04-20 21:52:10 +02:00
Rene Fichtmueller
9f3cd46f9c ui: mark Top Sellers widget data as DEMO (synthetic seed data, not real sales) 2026-04-20 21:44:33 +02:00
Rene Fichtmueller
0fb4850dfa fix: price-comparison SKU lookup — wrong column refs (so.stock_level, search_url_template) 2026-04-19 00:12:18 +02:00
Rene Fichtmueller
b0ed54f386 feat: register fiber24 + fibermall in index, move atgbics to fetch-only section 2026-04-18 22:50:52 +02:00
Rene Fichtmueller
cb5a587d7e feat: rewrite ATGBICS scraper — static HTML, correct collection handles, GBP cookie
- Replaces Playwright with pure fetch() — static HTML has prices
- Correct collection handles (compatible-transceivers-sfpp-10g etc.)
- Cookie: cart_currency=GBP forces GBP pricing from any geo-IP
- Handles 35+ pages per category × 24 products = 840+ SFP+ products
- No IP-blocking with static HTML (Playwright was the trigger)
- Adds scripts/run-atgbics-mac.sh for Mac-side runner if needed
2026-04-18 22:48:29 +02:00
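The fetch-based approach above can be sketched roughly as follows. This is a minimal illustration, not the scraper's actual code: `buildCollectionRequest`, `fetchAllPages`, the `page` query parameter, and the base URL passed in by the caller are assumptions; only the `cart_currency=GBP` cookie and the collection handle come from the commit message.

```typescript
interface CollectionRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: builds the URL and headers for one collection page.
// The /collections/<handle>?page=N layout is assumed for illustration.
function buildCollectionRequest(
  baseUrl: string,
  handle: string,
  page: number,
): CollectionRequest {
  const url = new URL(`/collections/${handle}`, baseUrl);
  url.searchParams.set("page", String(page));
  return {
    url: url.toString(),
    // cart_currency=GBP asks the shop for GBP prices regardless of geo-IP.
    headers: { Cookie: "cart_currency=GBP" },
  };
}

// Walks pages until one fails or yields no products.
// parseProducts is supplied by the caller (the HTML parser is out of scope here).
async function fetchAllPages<T>(
  baseUrl: string,
  handle: string,
  parseProducts: (html: string) => T[],
  maxPages = 50,
): Promise<T[]> {
  const all: T[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const { url, headers } = buildCollectionRequest(baseUrl, handle, page);
    const res = await fetch(url, { headers });
    if (!res.ok) break;
    const products = parseProducts(await res.text());
    if (products.length === 0) break;
    all.push(...products);
  }
  return all;
}
```

Stopping on the first empty page avoids hard-coding the "35+ pages" figure, which may change as the catalog grows.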
Rene Fichtmueller
785a6731ab fix: fiber24 stockLevel on_request (was unknown — violated DB constraint) 2026-04-18 22:26:45 +02:00
Rene Fichtmueller
d4ad9f4641 fix: ShopFiber24 sitemap-based scraping + Fibermall image extraction
ShopFiber24 (fiber24.ts):
- Complete rewrite: was using JS-rendered catalog (all prices = 0)
- New strategy: fetch sitemap_0.xml.gz → 310 product DE-URLs
- Each product page has Schema.org microdata: itemprop=price, sku, image
- Extracts: price (minPrice), SKU, image_url, name, specs
- Rate: 1 req/1.5s, no Playwright needed

FiberMall (fibermall.ts):
- Add imageUrl to Product interface
- Extract first fibermall.com/photo/*.jpg from product listing card
- Write image_url to transceivers table (has_image=true) on upsert
- SKU variants share parent product image
- 304 FiberMall transceivers will get images on next scraper run
2026-04-18 22:20:57 +02:00
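A minimal sketch of the microdata extraction described for ShopFiber24, assuming the values are exposed as `content` attributes on tags carrying `itemprop` (the real pages may use other attributes, for example `src` on an image tag). The names `readItemprop` and `parseProductPage` are hypothetical.

```typescript
interface ProductMicrodata {
  price: number | null;
  sku: string | null;
  image: string | null;
}

// Reads <... itemprop="<prop>" ... content="value">.
// Assumes itemprop appears before content within the same tag.
function readItemprop(html: string, prop: string): string | null {
  const re = new RegExp(
    `itemprop=["']${prop}["'][^>]*?\\scontent=["']([^"']*)["']`,
    "i",
  );
  return re.exec(html)?.[1] ?? null;
}

function parseProductPage(html: string): ProductMicrodata {
  const rawPrice = readItemprop(html, "price");
  // Treat a missing or empty price as "no price" rather than 0.
  const price =
    rawPrice === null || rawPrice.trim() === "" ? NaN : Number(rawPrice);
  return {
    price: Number.isFinite(price) ? price : null,
    sku: readItemprop(html, "sku"),
    image: readItemprop(html, "image"),
  };
}
```

Returning `null` instead of `0` for a missing price keeps the "all prices = 0" failure mode of the old JS-rendered catalog from silently reappearing.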
Rene Fichtmueller
446ac667b0 feat: side-by-side competitor comparison + fix 1.6T speed_gbps
- Fix OSFP-DR8-1.6T-FL and OSFP-2FR4-1.6T-FL: speed_gbps was 200, now 1600
  → FS.com 1.6T products now correctly match as comparables for Flexoptix O.1316T.C.05.M
- API: extend comparable price query to return comp_form_factor, comp_speed_gbps,
  comp_reach_meters, comp_reach_label, comp_fiber_type, comp_wavelengths
- Dashboard: replace plain comparable price row with side-by-side spec comparison card
  showing Flexoptix vs. competitor: Form Factor, Speed, Reach, Fiber, Wavelengths
  with color coding (green=match, orange=mismatch) and savings badge (−45% günstiger, i.e. 45% cheaper)
2026-04-18 21:51:41 +02:00
Rene Fichtmueller
62d97a783c feat: add claude-code LLM provider + update dashboard to fo-blog-v5
- client.ts: add claude-code provider routing BLOG_LLM_PROVIDER=claude-code
  to claude-bridge (flat-rate, no API billing via Claude Code subscription)
- checkHealth() now pings /health on claude-bridge for real availability check
- Default OLLAMA_LLM_MODEL changed from qwen2.5:14b to fo-blog-v5
- Dashboard: add claude-code card (EMPFOHLEN, i.e. recommended), rename fo-blog-v3 → fo-blog-v5
- loadBlogLLMStatus() handles all 3 providers: claude-code/anthropic/ollama
- Grid expanded from 3 to 4 columns to accommodate new card
- ecosystem.config.js + .env on Erik: OLLAMA_LLM_MODEL=fo-blog-v5 confirmed
2026-04-18 20:45:14 +02:00
Rene Fichtmueller
1da4abc488 fix: FS.com price extraction — DOM-based prices + shipping-context exclusion
- All 247 FS.com prices were €79 (shipping threshold, not product prices)
- Root cause: 'Gratis Versand ab 79 € (ohne MwSt.)' banner (free shipping from €79, excl. VAT) matched first
- Fix 1: DOM price extraction in page.evaluate with bad-parent skip list
- Fix 2: bodyText qualified patterns skip matches near shipping keywords
- Fix 3: waitForSelector for price DOM element before evaluate
- Fix 4: Deleted 247 invalid €79 observations from DB

Also included from previous session:
- db.ts: set has_image=true on image writes (fix 632 desync rows)
- spec-updater.ts: DR/FR/LR/ER/ZR → SMF, SR → MMF fiber type inference
2026-04-18 13:10:35 +02:00
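The shipping-context exclusion from Fix 2 could look roughly like the sketch below. The keyword list, the 40-character context window, and the function name `extractProductPrices` are assumptions; the commit only states that matches near shipping keywords are skipped.

```typescript
// Assumed keyword list; the commit only says "shipping keywords".
const SHIPPING_KEYWORDS = ["versand", "shipping"];

function extractProductPrices(bodyText: string, contextChars = 40): number[] {
  // German-style amounts such as "1.234,56 €" or "79 €".
  const pricePattern = /(?<![\d.,])(\d{1,3}(?:\.\d{3})+|\d+)(,\d{2})?\s*€/g;
  const prices: number[] = [];
  for (const match of bodyText.matchAll(pricePattern)) {
    const index = match.index ?? 0;
    // Look at the text around the match, not just the match itself.
    const context = bodyText
      .slice(Math.max(0, index - contextChars), index + match[0].length + contextChars)
      .toLowerCase();
    if (SHIPPING_KEYWORDS.some((k) => context.includes(k))) continue;
    const whole = (match[1] ?? "").replace(/\./g, "");
    const cents = (match[2] ?? "").replace(",", ".");
    prices.push(Number(whole + cents));
  }
  return prices;
}
```

With this filter the "Gratis Versand ab 79 €" banner no longer wins simply because it appears first in the page text.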
Rene Fichtmueller
f8a1d27e79 fix: add missing auth header to blog generate fetches
Both generateBlog() and generateBlogManual() were calling
POST /api/blog/generate without an Authorization: Bearer header.
The requireAuth middleware correctly returned 401, which appeared
as 'Unauthorized — please log in' toast in the dashboard.

Fix: read loadToken() before each fetch and include the token in
the Authorization header. Also add r.status===401 guard to redirect
to login page when token expires, instead of showing error toast.
2026-04-18 08:03:39 +02:00
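As a rough illustration of the fix, the header construction and the 401 guard can be separated into two small pure functions. `authHeaders` and `classifyResponse` are hypothetical names; the dashboard's real `loadToken()` and redirect code are not shown in the commit.

```typescript
// Builds request headers, adding the bearer token only when one is present.
function authHeaders(token: string | null): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (token) headers.Authorization = `Bearer ${token}`;
  return headers;
}

type GenerateOutcome = "ok" | "redirect-to-login" | "error";

// 401 means the token is missing or expired: send the user to the login
// page instead of showing a generic error toast.
function classifyResponse(status: number): GenerateOutcome {
  if (status === 401) return "redirect-to-login";
  return status >= 200 && status < 300 ? "ok" : "error";
}
```

A caller would pass `authHeaders(loadToken())` to `fetch` and branch on `classifyResponse(r.status)`.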
Rene Fichtmueller
48adcd3fc9 fix: skip Optcore on Erik — Cloudflare blocks datacenter IP
optcore.net blocks Erik's IP (82.165.222.127) via Cloudflare WAF.
WP REST API returns HTML block page instead of JSON → 0 product URLs
→ 0 scraped pages every run. Add SKIP_OPTCORE_SCRAPER guard matching
the existing SKIP_FS_SCRAPER pattern. Set in ecosystem.config.js on
Erik. Residential IP (Mac launchd) would be needed to use this scraper.
2026-04-18 05:41:56 +02:00
Rene Fichtmueller
e11e351f5e fix: crawlee-config clear request queue on each run
Crawlee's FileSystemStorage marks request URLs as HANDLED (state=4,
orderNo=null) after processing. With purgeOnStart=false these entries
persist, so on the next run crawler.run(startUrls) deduplicates them
→ requestsTotal=0 → immediate finish with 0 scraped pages.

Fix: rmSync request_queues/default/ before each makeCrawleeConfig()
call. Safe: session pool state lives in key_value_stores/, not in
request_queues/. Affects all Crawlee-based scrapers (ATGBICS, Optcore,
Switch-assets, etc.).
2026-04-18 05:37:45 +02:00
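A minimal sketch of the cleanup step, assuming the on-disk layout named in the commit (`request_queues/default/` next to `key_value_stores/`). `clearRequestQueue` is a hypothetical helper name, and the demo runs against a throwaway temp directory rather than a real Crawlee storage dir.

```typescript
import { existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Remove only the default request queue so that URLs marked HANDLED in a
// previous run are not deduplicated away on the next one.
function clearRequestQueue(storageDir: string): void {
  // force: true makes this a no-op when the directory does not exist yet.
  rmSync(join(storageDir, "request_queues", "default"), {
    recursive: true,
    force: true,
  });
}

// Demo: build a fake storage dir that mimics the layout described above.
const storageDir = mkdtempSync(join(tmpdir(), "crawlee-demo-"));
const queueDir = join(storageDir, "request_queues", "default");
const kvDir = join(storageDir, "key_value_stores", "default");
mkdirSync(queueDir, { recursive: true });
mkdirSync(kvDir, { recursive: true });
writeFileSync(join(queueDir, "handled.json"), "{}");
writeFileSync(join(kvDir, "SDK_SESSION_POOL_STATE.json"), "{}");

clearRequestQueue(storageDir);
const queueGone = !existsSync(queueDir);
const sessionStateKept = existsSync(join(kvDir, "SDK_SESSION_POOL_STATE.json"));
```

Deleting only the queue directory is what keeps the session pool state intact, which is the safety argument made in the commit message.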
Rene Fichtmueller
fcdd258369 fix: 10Gtek scraper now fetches prices from sfpcables.com
10gtek.com main site only exposes technical spec tables with no prices.
sfpcables.com is 10Gtek's own retail store and has both Model numbers
and USD prices in standard Magento product listings.

Changes:
- Switch scraping target from www.10gtek.com to sfpcables.com
- Parse Model: <part> + US$XX.XX per product block (Magento structure)
- XFP fallback: extract part number from title after '|' separator
- Add fetchAllPages() with Magento loop-detection via seen-part dedup
- Remove QSFP-DD category (not available on sfpcables.com)
- Drop XFP-less categories from old 10gtek.com spec-table parser

Verified: 10/10 SFP prices, 10/10 SFP+ prices, 4/4 XFP prices on live site.
2026-04-18 05:27:49 +02:00
Rene Fichtmueller
2a6ec90ecd fix: fs-com Phase 1+2 crawler.run() ENOENT guard
Crawlee catches and re-throws the post-run _isTaskReadyFunction ENOENT
internally, which rejected crawler.run() and aborted Phase 2 before it
could start. Wrap both crawler.run() calls in try/catch to swallow ENOENT
from request_queues paths; all processing is already complete at this point.
2026-04-18 03:52:49 +02:00
Rene Fichtmueller
93d825dc04 fix: daemon stability + health monitor accuracy
- Add global unhandledRejection handler in scheduler daemon to swallow
  Crawlee's benign post-run ENOENT lock-file races (prevents process.exit(1))
- Add SKIP_FS_SCRAPER env var: skip FS.com worker on Erik where Cloudflare
  WAF blocks datacenter IPs (Mac launchd handles FS.com from residential IP)
- Remove FS.COM from health monitor EXPECTED_VENDORS (skipped on Erik)
- Health monitor: extend pg-boss lookup from 12h → 26h, add completed-job
  map; if job ran OK in last 26h + vendor has historical prices → mark
  STABLE instead of CRITICAL (fixes ATGBICS/Fluxlight hash-dedup false positives)
- Install Playwright Chromium on Erik (fixes ATGBICS BrowserLaunchError)
- Create missing Crawlee storage dirs on Erik (storage-fs-phase1/2,
  storage-ebay-transceivers) to prevent ENOENT on first Crawlee run
2026-04-18 03:16:59 +02:00
Rene Fichtmueller
8391b194a5 fix: GBICS scraper — fall back to aria-label-first pattern when href-first finds no priced products
Pattern 1 (href→aria-label) finds 127 navigation links on GBICS BigCommerce
pages — none contain GBP prices. Pattern 2 (aria-label→href) correctly
finds 16-30 product links per category page with £XX.XX prices in aria-labels.
The fallback from P1 to P2 now triggers when P1 finds results but none
contain '£', rather than only when P1 finds 0 total results.
2026-04-18 03:02:39 +02:00
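The changed fallback condition can be expressed in a few lines. This is a sketch with hypothetical names (`ProductLink`, `chooseProductLinks`); it assumes both patterns have already been run and that the price, when present, appears in the link's label text.

```typescript
interface ProductLink {
  href: string;
  label: string;
}

// Prefer the href-first results only if at least one of them carries a GBP
// price; otherwise fall back to the aria-label-first results. This covers
// both "P1 found nothing" and "P1 found only navigation links".
function chooseProductLinks(
  hrefFirst: ProductLink[],
  ariaFirst: ProductLink[],
): ProductLink[] {
  const hasGbpPrice = hrefFirst.some((link) => link.label.includes("£"));
  return hasGbpPrice ? hrefFirst : ariaFirst;
}
```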
Rene Fichtmueller
24ff9822ac fix: improve scraper health monitor — tiered alerts, suppress stable-price false positives
Previous logic fired an alert whenever prices_6h=0, even when prices
were genuinely stable (content hash dedup prevents duplicate inserts).
This caused Flexoptix, ATGBICS and others to trigger alerts every 3h
despite their scrapers running successfully.

New logic:
  🔴 CRITICAL: last price > 7 days (genuine failure)
  🟡 WARNING:  last price 48h–7 days (possibly stale)
   STABLE:   last price ≤48h, 0 new (prices unchanged, scraper OK)

Also shows pg-boss job state/time alongside each vendor for faster
root-cause diagnosis. Trimmed EXPECTED_VENDORS to vendors with actual
scraper implementations (removed never-scraped placeholders).
2026-04-18 02:54:28 +02:00
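The tiering above maps naturally onto a small classification function. The sketch below follows the thresholds from the commit message; the function name `classifyVendor` and the extra `"OK"` state (recent price and new observations) are assumptions added for completeness.

```typescript
type VendorHealth = "CRITICAL" | "WARNING" | "STABLE" | "OK";

function classifyVendor(
  lastPriceAt: Date,
  newPrices6h: number,
  now: Date = new Date(),
): VendorHealth {
  const ageHours = (now.getTime() - lastPriceAt.getTime()) / 3_600_000;
  if (ageHours > 7 * 24) return "CRITICAL"; // last price older than 7 days
  if (ageHours > 48) return "WARNING"; // between 48h and 7 days
  // Within 48h: zero new rows just means prices did not change
  // (content-hash dedup), so this is not an alert condition.
  return newPrices6h === 0 ? "STABLE" : "OK";
}
```

Keying the alert on the age of the last price rather than on `prices_6h` is what removes the false positives for vendors with unchanged prices.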
Rene Fichtmueller
e552e08015 fix: suppress Crawlee post-run ENOENT unhandledRejection in fs-com.ts
After PlaywrightCrawler.run() resolves, Crawlee's internal task loop
schedules one final _isTaskReadyFunction call that tries to read a
request queue .json file already cleaned up during processing. This
ENOENT fires as an unhandledRejection and calls process.exit(1),
aborting Phase 2 before prices are written to the database.

Added a targeted unhandledRejection handler in the require.main block
that swallows ENOENT errors from request_queues paths (benign Crawlee
cleanup race) while re-raising all other rejections.
2026-04-18 02:51:00 +02:00
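A targeted handler of the kind described might look like the following sketch. The predicate name `isBenignCrawleeEnoent` and the exact matching rules (error code `ENOENT` plus `request_queues` in the path or message) are assumptions based on the commit text, not the file's actual contents.

```typescript
// True only for ENOENT errors that point into a request_queues directory.
function isBenignCrawleeEnoent(reason: unknown): boolean {
  if (typeof reason !== "object" || reason === null) return false;
  const err = reason as { code?: unknown; path?: unknown; message?: unknown };
  if (err.code !== "ENOENT") return false;
  const path = typeof err.path === "string" ? err.path : "";
  const message = typeof err.message === "string" ? err.message : "";
  return `${path} ${message}`.includes("request_queues");
}

process.on("unhandledRejection", (reason) => {
  if (isBenignCrawleeEnoent(reason)) {
    console.warn("Ignoring benign post-run ENOENT from request queue cleanup");
    return;
  }
  // Anything else is re-raised so that real failures still surface.
  throw reason;
});
```

Keeping the predicate narrow matters: a blanket handler would also hide genuine missing-file errors elsewhere in the scraper.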
Rene Fichtmueller
419af4a24e fix: remove all withIsolatedStorage wrappers, add makeCrawleeConfig to remaining Crawlee scrapers
- scheduler.ts: remove withIsolatedStorage from ALL scrapers (atgbics,
  optcore, ufispace, edgecore, ebay-*, market-intel, community-issues,
  cisco, juniper, sonic, 10gtek, prolabs, switch-assets, fs)
  eliminates global CRAWLEE_STORAGE_DIR race condition entirely
- fs-com.ts: replace purgeDefaultStorages() with rmSync on isolated
  storage dirs (fs-phase1, fs-phase2); pass makeCrawleeConfig to both
  PlaywrightCrawler instances
- switch-assets-crawler.ts: add makeCrawleeConfig('switch-assets')
- switch-assets-playwright.ts: add makeCrawleeConfig('switch-assets-playwright')
- naddod.ts: restore clean error logging (remove debug instrumentation)
2026-04-18 02:19:53 +02:00
Rene Fichtmueller
d9e5331161 debug: widen NADDOD error slice to 300 chars, add pre-insert logging 2026-04-18 02:00:03 +02:00
Rene Fichtmueller
24481b09e6 fix: eBay enricher Crawlee isolation + ephemeral queues
- Add makeCrawleeConfig isolation to CheerioCrawler instances
- Switch from named persistent RequestQueue to ephemeral null queues:
  named queues retain 'handled' state and skip all URLs on re-runs,
  causing 0 observations on every run after the first.
- Applies to both enrichSwitchFromEbay and enrichTransceiversFromEbay.
2026-04-18 01:42:08 +02:00
Rene Fichtmueller
c7d7456de9 fix: instance-level Crawlee storage isolation + eBay vendor type
- Add utils/crawlee-config.ts: makeCrawleeConfig(name) returns a
  Crawlee Configuration with isolated localDataDirectory per scraper.
  Uses storageClientOptions (not global CRAWLEE_STORAGE_DIR) so
  concurrent pg-boss workers in the same process don't race on
  the shared env var.

- Apply makeCrawleeConfig to all 6 Crawlee-based scrapers:
  optcore (PlaywrightCrawler), atgbics (PlaywrightCrawler),
  community-issues (CheerioCrawler + RequestQueue),
  edgecore (CheerioCrawler), ufispace (CheerioCrawler),
  market-intelligence (CheerioCrawler).

- scheduler.ts: add withIsolatedStorage for optcore and market-intel
  workers (was missing, caused storage-fs path bleed from fs scraper).

- ebay-enricher.ts: fix vendor type 'marketplace' -> 'reseller' to
  satisfy vendors_type_check constraint
  ['manufacturer','distributor','oem','reseller','compatible'].
2026-04-18 01:35:57 +02:00
Rene Fichtmueller
4b751a771b fix: NADDOD stockLevel 'unknown' → 'on_request' — invalid value for price_observations check constraint 2026-04-18 01:21:31 +02:00
Rene Fichtmueller
2b770aa1a9 chore: cleanup — rename digikey→mouser, remove orphan files, gitignore Crawlee artifacts
- Rename scrapers/digikey.ts → scrapers/mouser.ts: export scrapeMouser()
  (file was Mouser API implementation mislabeled from task origin)
- Fix scheduler.ts mouser-oem worker: import scrapeMouser from ./scrapers/mouser
- Delete switch-seed-smb.ts (unreferenced, no CLI flag, no scheduler job)
- Add storage/, storage-fs/, .crawlee/ to .gitignore (Crawlee runtime artifacts)
2026-04-18 01:09:10 +02:00
Rene Fichtmueller
1c8dec52c9 feat: Price Comparison dashboard + Eoptolink OEM scraper
- Add public /api/price-comparison API (summary, top-50, per-SKU detail)
  — no auth required, 3 Express routes, DISTINCT ON latest-price logic
- Add '💲 Price Comparison' dashboard tab: stat cards, form-factor
  breakdown, top-50 SKU table (clickable rows → SKU detail), per-vendor
  price + stock + spread% lookup panel
- Add Eoptolink OEM catalog scraper (93 product-solution pages,
  part-number regex EOLO-*/EOLQ-* etc., no prices, seeds transceivers
  table as manufacturer entries)
- Register scrape:catalog:eoptolink in scheduler: schedule every 4h
  (40 */4 * * *), lazy-import worker, added to known-jobs array
2026-04-18 01:02:08 +02:00
Rene Fichtmueller
e9fcda2811 feat: wire finder.ts + switch-docs + Ollama LLM tools to MCP server
MCP Server (packages/mcp-server/src/index.ts):
- Register registerSwitchDocTools (switch-docs.ts) — switch documentation lookup
- Register finderTools dynamically (finder.ts) — find_flexoptix_for_switch, get_competitor_alerts
- Add analyze_market_with_llm tool: qwen2.5:14b via Ollama, enriched with live hype cycle + pricing + news
- Add generate_blog_post tool: fo-blog-v5 (fine-tuned) with qwen2.5:14b fallback, enriched with live pricing data
- OLLAMA_BASE_URL env var (default: https://ollama.fichtmueller.org)

Also includes scraper improvements (ascentoptics, atgbics, gbics, skylane, ebay-enricher),
API route updates (blog, blog-sll, health, hot-topics, transceivers, queries),
and dashboard hot-topics refresh.
2026-04-18 00:21:58 +02:00
Rene Fichtmueller
b88a6e28cf feat: /api/hype-cycle/analysis endpoint — DB-backed Bass-fitted results from hype_cycle_analysis table 2026-04-18 00:11:08 +02:00
Rene Fichtmueller
9d3019d0c0 feat: Norton-Bass Hype Cycle Engine — market_metrics seed + Bass fitting + Gartner phase detection 2026-04-18 00:09:08 +02:00
Rene Fichtmueller
75cea9fe90 feat: Mouser Electronics API scraper for OEM reference prices (Juniper/Cisco/Arista PIDs) 2026-04-18 00:04:35 +02:00
Rene Fichtmueller
861243ea3f feat: stock confidence badges, multi-vendor price comparison, expanded Cisco TMG + Juniper HCT
Stock API & Dashboard:
- /api/stock/summary: vendor_breakdown adds avg_confidence, currencies, conf_per_warehouse/aggregated/boolean
- /api/stock/summary: new price_comparison endpoint (multi-vendor SKUs, min/max/avg price)
- /api/stock/summary: totals adds multi_vendor_skus count
- Dashboard: 6th stat card (Multi-Vendor SKUs), confidence badge column (🟢 L3 / 🟡 L2 /  L1)
- Dashboard: price comparison table with vendor-by-vendor price breakdown
- Dashboard: subtitle updated to include QSFPTEK + NADDOD
- Dashboard: top sellers link to product URLs

Cisco TMG improvements:
- Added 5 new platform families: 8000 Series, NCS5500, NCS540, NCS560, NCS1000
- Per-device query strategy: iterates all switch model IDs from family filter
  instead of getting only 1 switch per family → 58 switches per N9300 run
- Graceful error handling per device with rate limiting (1s between requests)

Juniper HCT: ran manually → 475 Juniper-brand transceivers seeded
2026-04-17 23:33:31 +02:00
Rene Fichtmueller
5393f73c17 feat: stock quality schema + QSFPTEK/NADDOD v2 scrapers with real-time stock counts
- Migration 028 (retroactive): document warehouse columns added to stock_observations
- Migration 037: composite indexes for DISTINCT ON (transceiver_id, source_vendor_id) queries
- Migration 038: add stock_confidence (1/2/3), price_currency, price_includes_tax,
  stock_vendor_ts to stock_observations + TRUNCATE test-run data

db.ts: upsertStockObservation now accepts stockConfidence, priceCurrency,
priceIncludesTax, stockVendorTs; delta detection includes quantity_available

fs-com.ts: passes stockConfidence=3 + priceCurrency=EUR + priceIncludesTax=false

qsfptek.ts v2: Phase 1 API listing + Phase 2 detail-page stock extraction
- Parses 'X in real-time stock, DATE' from product detail pages
- Writes stock_observations with confidence=2 + stockVendorTs
- Up to 500 detail pages/run at 2s rate limit

naddod.ts v2: complete rewrite from WooCommerce to Astro sitemap-based
- Discovers products via /sitemaps/products.xml (600+ products)
- URL format: /products/XXXXX.html
- Extracts 'In Stock: X' exact counts from SSR HTML
- Writes both price + stock observations (confidence 1 or 2)
2026-04-17 22:54:40 +02:00
Rene Fichtmueller
5b35b2b8be feat(scraper+api): warehouse stock data pipeline — FS.com v2, SmartOptics v2, Stock API
Scraper changes:
- fs-com.ts v2: Playwright stealth patches + www.fs.com/de/ URL fix (de.fs.com DNS NXDOMAIN).
  Extracts DE-Lager, Global-Lager, Nachlieferung, units_sold, compatible_brands, price_net.
  Mac-side runner (run-fs-scraper-mac.sh) via SSH tunnel for residential IP access.
  Fast-fail connectivity check on datacenter IPs that are blocked by Cloudflare.
- smartoptics.ts v2: WooCommerce REST API fallback + 8 catalog categories + relative URL fix.
  Was finding only 8 products, now discovers 18+ with multi-category crawl.

DB layer:
- db.ts: add upsertStockObservation() — writes 10 new stock_observations columns
  (warehouse_de_qty, warehouse_global_qty, backorder_qty, units_sold, compatible_brands,
  price_net, product_url, delivery dates) with dedup check.

API:
- routes/stock.ts: GET /api/stock, /api/stock/summary, /api/stock/:id
  Warehouse breakdowns per transceiver/vendor with top-sellers and vendor summary.
- routes/review.ts: equivalence review queue (approve/reject/bulk-approve).
- index.ts: register /api/stock and /api/review routes.

Dashboard:
- index.html: 🏭 Stock tab with stat cards (DE-Lager, Global-Lager, Nachlieferung totals),
  top-sellers table, vendor breakdown, recently-restocked events, part-number lookup.

SQL migrations:
- 034: blog-review-tag, 035: price-observations is_anomalous, 036: transceiver-equivalences.
2026-04-17 10:45:59 +02:00
Rene Fichtmueller
662cd1f90b fix(scraper): FiberMall URL schema + price parser + Flexoptix EUR comma bug
FiberMall:
- Correct /store-XXXXX-name.htm category URLs (was /c/xxx/ → HTTP 404)
- Parser: split on new_proList_mainListLi, price from data-price on
  currency_price span — fix 0.00 false-match from SKU variant items
- Also scrape SKU brand variant links from .sku_item divs
- Result: 3,410 prices now in DB (was 0)

Flexoptix:
- Fix extractPrice regex for EUR thousand-separator format
  (2,921.60 EUR was parsed as 2 EUR)
- Add OSFP224 / 1.6T search queries (4 new, form factor was missing)
- Fix O.138HG2.C.05 stale price 3009.60→2921.60 EUR

Schema: competitor_verified + competitor_verified_at columns
added via ALTER TABLE (were referenced in code but missing in DB)

CHANGELOG: added 6 entries for 2026-04-12
2026-04-12 04:26:35 +02:00
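The Flexoptix thousand-separator bug can be illustrated with a small parser. This is a sketch, not the repository's `extractPrice`: it assumes prices use a comma as thousands separator and a dot as decimal separator, as in the `2,921.60 EUR` example from the commit.

```typescript
// Parses amounts such as "2,921.60 EUR" or "49.90 EUR".
// A naive /\d+/ match would stop at the comma and return 2.
function extractPrice(text: string): number | null {
  const m = /(\d{1,3}(?:,\d{3})+|\d+)(\.\d{1,2})?\s*(?:EUR|€)/.exec(text);
  if (!m) return null;
  const whole = (m[1] ?? "").replace(/,/g, ""); // drop thousands separators
  const fraction = m[2] ?? "";
  return Number(whole + fraction);
}
```

Listing the grouped form first in the alternation ensures that `2,921` is consumed as one number before the plain-digits branch can match only the leading `2`.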
Rene Fichtmueller
cdb8ef6e61 feat(scraper): add FiberMall/VCELink/OpticsBay scrapers, fix QSFPTEK API migration
- New scrapers: fibermall.ts (WooCommerce), vcelink.ts (Shopify), opticsbay.ts (WooCommerce)
- QSFPTEK rewritten to use /mall/commodity/list API (old OpenCart /c/*.html paths gone 404)
  - New: attribute-based filtering by data rate (1G/10G/25G/40G/100G/200G/400G/800G)
  - Scrapes HTML fragments, extracts US$ prices and product URLs
- scheduler.ts: +3 queues/schedules/workers (fibermall, vcelink, opticsbay) → 61 total workers
- index-pi.ts: Pi fleet picks up all 3 new scrapers
2026-04-11 19:13:36 +02:00
Rene Fichtmueller
148d2e1000 fix(scraper): set CRAWLEE_PURGE_ON_START=1 in withIsolatedStorage
Crawlee's SessionPool throws 'Could not find SDK_SESSION_POOL_STATE.json'
when initializing against a freshly-created isolated storage dir.
Setting CRAWLEE_PURGE_ON_START=1 tells Crawlee to start fresh instead
of trying to load non-existent session state — fixes FS.com and ATGBICS
crashes at the start of every 2h cycle after the dirs were cleaned up.
2026-04-11 07:27:24 +02:00
Rene Fichtmueller
45c48755e4 feat(scraper): add NADDOD/QSFPTEK/AddOn to scheduler, fix pre-existing TS build errors
- Register scrape:pricing:naddod (48 */2), qsfptek (52 */2), addon (55 */2) in pg-boss
- Add boss.work() handlers for all three (fetch-based, run on Erik)
- Fix findOrCreateScrapedTransceiver callers: remove invalid `name`/`url` params,
  fix `t.id` → `t` (function already returns string ID)
- Fix ebay-enricher: remove invalid `extractType` option, use extraction.standard_name
  instead of non-existent `.description`, fix cheerio type incompatibility
- Fix community-issues: description → summary, publishedDate → published_at
- Startup zombie cleanup already deployed (index.ts) — no changes needed
- ProLabs rewritten to fetch-based catalog scraper (no Playwright, bypasses WAF)
2026-04-11 03:17:33 +02:00
Rene Fichtmueller
6febb9c88e refactor(prolabs): replace Playwright+Firefox with fetch-based catalog scraper
ProLabs uses B2B quote model - prices require reseller account and are
not shown publicly (schema.org always shows price=0.00). Fighting
CloudFront WAF with Firefox automation is pointless.

New approach:
- Sitemap-driven: downloads all 14 sitemaps to collect product URLs
- fetch-based: curl-compatible HTTP requests bypass CloudFront TLS detection
- catalog-only: writes part numbers + specs to transceivers table
- Rate-limited: 300ms between requests (~3 req/sec)
- No proxy needed: Pi nodes no longer consumed for ProLabs
2026-04-11 02:57:13 +02:00
Rene Fichtmueller
7af5b32b3f ui: redesign LLM panel for light theme readability
Replace hard-coded purple/green colors with theme CSS variables.
Dark code blocks (#1e1e1e bg), orange accent for active borders/badges,
dark green for status text, amber for warnings — all readable on white.
2026-04-09 21:20:43 +02:00
Rene Fichtmueller
bf626f9de6 fix: route Pi-destined scrapers exclusively to Pi worker fleet
Remove boss.work() registrations for lightweight fetch/cheerio scrapers
from Erik's scheduler. Pis are now the SOLE consumers of these queues:
fluxlight, gbics, optcore, champion-one, sfpcables, blueoptics, fiber24,
tscom, skylane, ascentoptics, gaotek, smartoptics, hubersuhner, news,
market-intel.
2026-04-09 20:50:57 +02:00
Rene Fichtmueller
c898f52bbe feat: add LLM model selector panel to Blog Engine tab
Shows active model (fo-blog-v3-qwen7b / claude-sonnet-4-6 / qwen2.5:14b),
live status from /api/blog/llm/status, ratings, config instructions,
and highlights which model is currently active.
2026-04-09 20:42:03 +02:00
Rene Fichtmueller
cddc92c9d2 feat: TIP audit fixes — Qdrant init, switches columns, verification fix, crawler live status, demo data badges
- Migration 032: add system_type, is_linecard, chassis_model, slot_type, flexbox_* to switches table
- Migration 032: fix compute_transceiver_verification() to count seed data as details_verified (100% now)
- Migration 032: add is_demo_data flag to reorder_signals, abc_classification, market_intelligence, stock_snapshots
- Cisco 8000: insert 8812, 8818, 8800-LC-36FH, 8800-LC-48H with correct vendor slug 'cisco'
- API: add /api/scrapers/jobs endpoint exposing pg-boss job queue (active/recent/queues)
- Dashboard: live job queue panel in Crawler Intelligence tab (active jobs + recent 4h completions)
- Dashboard: DEMO DATA badge now uses is_demo_data column (was checking wrong field is_demo)
- Blog engine: configured fo-blog-v3-qwen7b fine-tuned model via tip-api ecosystem.config.js
- Qdrant: all 6 collections created, seeded (2135 products, 29 FAQs, 39 news, 20 troubleshooting)
2026-04-09 20:29:46 +02:00
Rene Fichtmueller
cf75eee8ad feat: linecard system support, Cisco 8000 accuracy, price anomaly detection
API/finder:
- Add modular chassis support: sibling linecards fetched when is_linecard=true
- Add chassis linecards when system_type=modular
- Extend switch response: system_type, is_linecard, chassis_model, slot_type,
  flexbox_compat_mode, flexbox_notes, description, switching_capacity_tbps,
  total_ports, category, lifecycle_status, features, use_cases, linecards[]

API/transceivers:
- Filter price_observations with COALESCE(is_anomalous, false) = false
  (direct prices + comparable market prices)

Scraper/db:
- Add PRICE_BOUNDS map (per form-factor min/max USD sanity bounds)
- Add isPriceAnomalous() — marks DB price_observations as is_anomalous=true
- Add competitor_verified flag: set true when valid competitor price stored
- upsertPriceObservation: skip prices outside sanity bounds, set competitor_verified

Scraper/hash:
- contentHash() now accepts Record<string,unknown> | string (union type)
  to support both structured objects and legacy string callers

Scrapers (skylane, tscom, wiitek):
- Fix contentHash() call signature: pass objects not JSON.stringify strings
- Fix wiitek: remove invalid 'name' param, fix t.id → transceiverId

Migrations:
- Add is_anomalous, competitor_verified, competitor_verified_at,
  image_primary columns
- Recreate sync_fully_verified trigger to include competitor_verified
- Add is_linecard, chassis_model, system_type, slot_type,
  flexbox_compat_mode, flexbox_notes to switches table
2026-04-09 09:06:22 +02:00
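The sanity-bounds check can be sketched as a lookup plus a range test. The bound values below are placeholders chosen for illustration only; the commit does not list the real `PRICE_BOUNDS`, and treating unknown form factors as non-anomalous is an assumption.

```typescript
interface Bounds {
  min: number;
  max: number;
}

// Illustrative values only: the real per-form-factor USD bounds are not
// given in the commit message.
const PRICE_BOUNDS: Record<string, Bounds> = {
  SFP: { min: 2, max: 500 },
  "SFP+": { min: 5, max: 2000 },
  QSFP28: { min: 20, max: 10000 },
};

function isPriceAnomalous(formFactor: string, priceUsd: number): boolean {
  const bounds = PRICE_BOUNDS[formFactor];
  // Assumption: with no bounds configured there is nothing to check against.
  if (!bounds) return false;
  return priceUsd < bounds.min || priceUsd > bounds.max;
}
```

A check like this would have flagged cases such as a four-figure price parsed as a single-digit amount.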
Rene Fichtmueller
240e7f46f2 feat(scraper): add SOCKS5 proxy rotation for fs-com, atgbics, gbics scrapers
Routes requests through CT130/131/132 proxy pool (192.168.178.77/76/74:1080)
when PROXY_URLS env var is set. Uses ProxyConfiguration from crawlee for
PlaywrightCrawler scrapers and socks-proxy-agent for fetch-based scrapers.
2026-04-08 08:17:49 +02:00
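A minimal sketch of the rotation logic, independent of Crawlee or socks-proxy-agent. It assumes `PROXY_URLS` is a comma-separated list and uses simple round-robin; the commit does not state the separator or the rotation strategy, and the helper names are hypothetical.

```typescript
// Splits the env value into clean URLs; an unset variable yields no proxies.
function parseProxyUrls(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Returns a function that hands out proxies in round-robin order,
// or undefined when no proxies are configured (direct connection).
function makeProxyRotator(urls: string[]): () => string | undefined {
  let next = 0;
  return () => (urls.length === 0 ? undefined : urls[next++ % urls.length]);
}
```

Returning `undefined` for an empty list mirrors the described behaviour of only routing through the pool when `PROXY_URLS` is set.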
Rene Fichtmueller
51af249361 Merge remote-tracking branch 'github/main'
# Conflicts:
#	packages/api/src/llm/fo-blog-pipeline.ts
#	packages/api/src/routes/blog.ts
#	packages/scraper/src/scheduler.ts
#	packages/scraper/src/scrapers/fs-com.ts
#	packages/scraper/src/scrapers/gbics.ts
2026-04-06 18:03:36 +02:00
Rene Fichtmueller
bf06993b63 ui: show creation time (HH:MM) alongside date in blog list 2026-04-06 17:07:05 +02:00
Rene Fichtmueller
3928755c60 fix: correct verified badge, comparable pricing, and clickable product images
- Reset details_verified=false for 298 products where reach_label is empty (DB migration)
- Runtime check in dashboard: dVer requires non-empty reach_label regardless of DB flag
- comparable price query: treat reach_meters=0 same as NULL so 800G OSFP products
  find FS.com equivalent prices (was blocked by reach_meters=0 != NULL short-circuit)
- Product image area now fully clickable with vendor link overlay when product_page_url exists
- Clear wrong image for O.Czz8HG.z.R (was showing unrelated OSFP product image)
2026-04-06 10:24:39 +02:00
Rene Fichtmueller
4acf293690 fix(llm): checkHealth uses key presence check, not live API call
Live Anthropic API call during health check causes 429 when the pipeline
is actively running, blocking all subsequent regenerate requests.
2026-04-06 04:07:21 +02:00
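A key-presence health check of the kind described could be as small as the sketch below. The environment variable name `ANTHROPIC_API_KEY` and the shape of the returned object are assumptions; the commit only says that the check looks for the key instead of calling the live API.

```typescript
interface ProviderHealth {
  available: boolean;
  reason: string;
}

// Reports availability from configuration alone, so the health endpoint
// never spends API quota or trips rate limits while the pipeline is running.
function checkAnthropicHealth(
  env: Record<string, string | undefined>,
): ProviderHealth {
  const key = env.ANTHROPIC_API_KEY?.trim();
  return key
    ? { available: true, reason: "API key configured" }
    : { available: false, reason: "ANTHROPIC_API_KEY not set" };
}
```

The trade-off is that a configured but invalid key still reports as available; the first real request would surface that error instead.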