Rene Fichtmueller
|
cf75eee8ad
|
feat: linecard system support, Cisco 8000 accuracy, price anomaly detection
API/finder:
- Add modular chassis support: fetch sibling linecards when is_linecard=true
- Include chassis linecards in the response when system_type=modular
- Extend switch response: system_type, is_linecard, chassis_model, slot_type,
  flexbox_compat_mode, flexbox_notes, description, switching_capacity_tbps,
  total_ports, category, lifecycle_status, features, use_cases, linecards[]
API/transceivers:
- Filter price_observations with COALESCE(is_anomalous, false) = false
  (direct prices + comparable market prices)
Scraper/db:
- Add PRICE_BOUNDS map (per form-factor min/max USD sanity bounds)
- Add isPriceAnomalous() — marks DB price_observations as is_anomalous=true
- Add competitor_verified flag: set true when valid competitor price stored
- upsertPriceObservation: skip prices outside sanity bounds, set competitor_verified
Scraper/hash:
- contentHash() now accepts Record<string,unknown> | string (union type)
  to support both structured objects and legacy string callers
Scrapers (skylane, tscom, wiitek):
- Fix contentHash() call signature: pass objects not JSON.stringify strings
- Fix wiitek: remove invalid 'name' param, fix t.id → transceiverId
Migrations:
- Add is_anomalous, competitor_verified, competitor_verified_at,
  image_primary columns
- Recreate sync_fully_verified trigger to include competitor_verified
- Add is_linecard, chassis_model, system_type, slot_type,
  flexbox_compat_mode, flexbox_notes to switches table
|
2026-04-09 09:06:22 +02:00 |
|
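The Scraper/db change above (PRICE_BOUNDS plus isPriceAnomalous()) can be pictured roughly as follows. This is only a sketch: the commit does not show the actual bounds, form-factor keys, or function signature, so the numbers, the DEFAULT_BOUNDS fallback, and the (formFactor, priceUsd) parameters are all assumptions made for illustration.

```typescript
// Sketch only: bounds and signature are hypothetical, not taken from the repo.
type Bounds = { min: number; max: number };

// Assumed per-form-factor sanity bounds in USD (illustrative values).
const PRICE_BOUNDS: Record<string, Bounds> = {
  SFP: { min: 1, max: 500 },
  QSFP28: { min: 10, max: 5000 },
};

// Assumed fallback for form factors without an explicit entry.
const DEFAULT_BOUNDS: Bounds = { min: 0.5, max: 50000 };

// Returns true when the price is not a finite number or falls outside
// the sanity bounds for the given form factor.
function isPriceAnomalous(formFactor: string, priceUsd: number): boolean {
  if (!Number.isFinite(priceUsd)) return true;
  const bounds = PRICE_BOUNDS[formFactor] ?? DEFAULT_BOUNDS;
  return priceUsd < bounds.min || priceUsd > bounds.max;
}

// Usage: keep only observations that pass the sanity check.
const observations = [
  { formFactor: "SFP", priceUsd: 19.9 },
  { formFactor: "SFP", priceUsd: 0.01 },
];
const accepted = observations.filter(
  (o) => !isPriceAnomalous(o.formFactor, o.priceUsd),
);
```

Treating non-finite values as anomalous is a design choice of this sketch, so a failed price parse cannot slip through as a valid observation.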
Rene Fichtmueller
|
e9fb50a248
|
feat: TIP Phase 0+1 — monorepo, DB schema, API, scraper engine
Phase 0 - Foundation:
- Restructure into npm workspace monorepo (packages/core, api, scraper)
- PostgreSQL 17 + TimescaleDB schema (15 tables incl. hypertables)
- Docker Compose for local dev (PostgreSQL on 5433 + Qdrant)
- Express 5 API on port 3200 with 6 routes
- Seed script to migrate 159 transceivers + 42 standards from npm package
- Erik server setup script + PM2 ecosystem config
Phase 1 - Scraper Engine:
- Crawlee + Playwright framework with pg-boss scheduler
- FS.com scraper (PlaywrightCrawler, anti-bot workaround)
- Optcore.net scraper (WP REST API enumeration + PlaywrightCrawler)
  - Uses /wp-json/wp/v2/product to get 2000+ product URLs
  - Playwright renders individual product pages for price extraction
- Cisco TMG Matrix scraper (compatibility data)
- News RSS aggregator (optics.org, SPIE, Network World, Nature Photonics)
  - Keyword relevance scoring for transceiver/fiber topics
  - xml2js with malformed XML sanitization
- SHA-256 content hashing for change detection (skip unchanged records)
- pg-boss v10 with explicit queue creation before scheduling
|
2026-03-27 16:27:31 +13:00 |
|
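The two commits above mention SHA-256 content hashing for change detection and a contentHash() that accepts Record<string,unknown> | string. A minimal sketch of how such a helper might look is given below. The key-sorting canonicalize() step and the hasChanged() helper are assumptions added for illustration; the commits do not say how objects are serialized or how stored hashes are compared.

```typescript
import { createHash } from "node:crypto";

// Assumption: object keys are sorted recursively so that key order
// does not affect the hash. The repo may serialize differently.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(obj).sort()) {
      sorted[key] = canonicalize(obj[key]);
    }
    return sorted;
  }
  return value;
}

// Accepts a structured object or a legacy pre-serialized string
// and returns the SHA-256 digest as a hex string.
function contentHash(input: Record<string, unknown> | string): string {
  const text =
    typeof input === "string" ? input : JSON.stringify(canonicalize(input));
  return createHash("sha256").update(text).digest("hex");
}

// Hypothetical helper: true when the record differs from what was
// stored last time (or nothing was stored yet).
function hasChanged(
  previousHash: string | undefined,
  record: Record<string, unknown>,
): boolean {
  return previousHash !== contentHash(record);
}
```

A scraper could store the hash next to each record and skip the write whenever hasChanged() returns false.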