Rene Fichtmueller
|
ba998f4c01
|
fix: vendor_compat 0%→100%, price denorm, wiitek disabled, price-denorm scheduler
- Migration 094: images for 12 Cisco 8K MPA + A9K-8HG-FLEX + ASR-9000V models
- Migration 095: price denorm refresh (EUR 679→1376, USD 166→835 with 180d window)
- Migration 096: bulk vendor_compat by form_factor — all 9013 transceivers now
have OEM compatibility patterns (was 0/9013 because all slugs are scraped-*)
- wiitek.ts: disable dead scraper (wiitek.com unreachable since 2026-04, EAI_AGAIN)
- scheduler.ts: add compute:price-denorm job (daily 05:30 UTC) to keep
street_price_usd/price_verified_eur fresh without manual migration runs
- seed-from-npm.ts: ON CONFLICT now also updates vendor_compat (was only updated_at)
|
2026-04-25 08:55:21 +02:00 |
|
Rene Fichtmueller
|
e9fb50a248
|
feat: TIP Phase 0+1 — monorepo, DB schema, API, scraper engine
Phase 0 - Foundation:
- Restructure into npm workspace monorepo (packages/core, api, scraper)
- PostgreSQL 17 + TimescaleDB schema (15 tables incl. hypertables)
- Docker Compose for local dev (PostgreSQL on 5433 + Qdrant)
- Express 5 API on port 3200 with 6 routes
- Seed script to migrate 159 transceivers + 42 standards from npm package
- Erik server setup script + PM2 ecosystem config
Phase 1 - Scraper Engine:
- Crawlee + Playwright framework with pg-boss scheduler
- FS.com scraper (PlaywrightCrawler, anti-bot workaround)
- Optcore.net scraper (WP REST API enumeration + PlaywrightCrawler)
- Uses /wp-json/wp/v2/product to get 2000+ product URLs
- Playwright renders individual product pages for price extraction
- Cisco TMG Matrix scraper (compatibility data)
- News RSS aggregator (optics.org, SPIE, Network World, Nature Photonics)
- Keyword relevance scoring for transceiver/fiber topics
- xml2js with malformed XML sanitization
- SHA-256 content hashing for change detection (skip unchanged records)
- pg-boss v10 with explicit queue creation before scheduling
|
2026-03-27 16:27:31 +13:00 |
|