Rene Fichtmueller
cf0e471fa4
feat: close TIP research resolution states
2026-05-10 10:13:09 +02:00
Rene Fichtmueller
d691745c7b
feat: clean TIP cable rows from active base
2026-05-10 09:41:59 +02:00
Rene Fichtmueller
2be61f2441
feat: close TIP retail price research states
2026-05-10 01:42:24 +02:00
Rene Fichtmueller
b58f7cee41
feat: resolve OEM price status and part details
2026-05-10 01:16:49 +02:00
Rene Fichtmueller
adb2661fac
feat: add targeted product page asset verifier
2026-05-10 00:31:33 +02:00
Rene Fichtmueller
635a102932
feat: close open competitor research states
2026-05-10 00:03:42 +02:00
Rene Fichtmueller
fb9db56617
fix: quarantine fs numeric sku aliases
2026-05-09 23:35:01 +02:00
Rene Fichtmueller
79a57a5ac6
feat: add no-valid competitor resolver
2026-05-09 23:16:04 +02:00
Rene Fichtmueller
1af4f090f7
fix: harden TIP verification cleanup
2026-05-09 22:16:29 +02:00
Rene Fichtmueller
a43e572946
fix: advance TIP product verification robots
2026-05-09 20:19:19 +02:00
Rene Fichtmueller
ec40a96ae0
feat: add vendor detail verifiers
2026-05-09 18:22:09 +02:00
Rene Fichtmueller
60531b6250
feat: add crawlee python worker integration
2026-05-09 14:06:34 +02:00
Rene Fichtmueller
a1a525b332
chore: sync API routes, dashboard hot-topics, MCP server, scraper package, scripts
2026-05-06 23:39:04 +02:00
Rene Fichtmueller
240e7f46f2
feat(scraper): add SOCKS5 proxy rotation for fs-com, atgbics, gbics scrapers
...
Routes requests through CT130/131/132 proxy pool (192.168.178.77/76/74:1080)
when PROXY_URLS env var is set. Uses ProxyConfiguration from crawlee for
PlaywrightCrawler scrapers and socks-proxy-agent for fetch-based scrapers.
2026-04-08 08:17:49 +02:00
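The proxy rotation described in the commit above could be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper names `parseProxyUrls` and `createProxyRotator` are hypothetical, and both the comma-separated format of `PROXY_URLS` and the `socks5://` scheme are assumptions. The commit only states that rotation is enabled when the variable is set.

```typescript
// Hypothetical helper: split a PROXY_URLS value into individual proxy URLs.
// Assumption: entries are comma-separated; the commit does not specify the format.
function parseProxyUrls(raw: string | undefined): string[] {
  if (!raw) return []; // variable unset or empty: no proxies configured
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Hypothetical helper: returns a function that hands out proxies in
// round-robin order, or undefined when the pool is empty (direct connection).
function createProxyRotator(proxies: string[]): () => string | undefined {
  let index = 0;
  return () => {
    if (proxies.length === 0) return undefined;
    const proxy = proxies[index % proxies.length];
    index += 1;
    return proxy;
  };
}

// Example pool built from the addresses named in the commit message
// (the socks5:// scheme is assumed).
const pool = parseProxyUrls(
  "socks5://192.168.178.77:1080,socks5://192.168.178.76:1080,socks5://192.168.178.74:1080",
);
const nextProxy = createProxyRotator(pool);
```

In a real scraper the resulting list would presumably be handed to Crawlee's `ProxyConfiguration` for the Playwright-based crawlers, and to `socks-proxy-agent` for the fetch-based ones, as the commit describes. The rotator here only illustrates the selection logic.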
Rene Fichtmueller
e9fb50a248
feat: TIP Phase 0+1 — monorepo, DB schema, API, scraper engine
...
Phase 0 - Foundation:
- Restructure into npm workspace monorepo (packages/core, api, scraper)
- PostgreSQL 17 + TimescaleDB schema (15 tables incl. hypertables)
- Docker Compose for local dev (PostgreSQL on 5433 + Qdrant)
- Express 5 API on port 3200 with 6 routes
- Seed script to migrate 159 transceivers + 42 standards from npm package
- Erik server setup script + PM2 ecosystem config
Phase 1 - Scraper Engine:
- Crawlee + Playwright framework with pg-boss scheduler
- FS.com scraper (PlaywrightCrawler, anti-bot workaround)
- Optcore.net scraper (WP REST API enumeration + PlaywrightCrawler)
  - Uses /wp-json/wp/v2/product to get 2000+ product URLs
  - Playwright renders individual product pages for price extraction
- Cisco TMG Matrix scraper (compatibility data)
- News RSS aggregator (optics.org, SPIE, Network World, Nature Photonics)
  - Keyword relevance scoring for transceiver/fiber topics
  - xml2js with malformed XML sanitization
- SHA-256 content hashing for change detection (skip unchanged records)
- pg-boss v10 with explicit queue creation before scheduling
2026-03-27 16:27:31 +13:00
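The SHA-256 change detection mentioned in the Phase 1 list could look roughly like the sketch below. It assumes scraped records are plain JSON-like objects and that the previous hash is stored alongside each record. The function names `stableStringify`, `contentHash` and `hasChanged` are hypothetical, and the key-sorted serialization is an illustrative choice rather than something the commit specifies.

```typescript
import { createHash } from "node:crypto";

// Serialize a JSON-like value with object keys sorted, so that two records
// with the same content but different key order produce the same string.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    const body = keys.map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${body.join(",")}}`;
  }
  // JSON.stringify yields undefined for undefined input; fall back to "null".
  return JSON.stringify(value) ?? "null";
}

// Hex-encoded SHA-256 digest of the record's stable serialization.
function contentHash(record: unknown): string {
  return createHash("sha256").update(stableStringify(record)).digest("hex");
}

// True when the record differs from what was stored last time
// (or when no previous hash exists), i.e. when a write is needed.
function hasChanged(previousHash: string | undefined, record: unknown): boolean {
  return previousHash !== contentHash(record);
}
```

A scraper run would compute `contentHash` for each scraped record, compare it with the stored value via `hasChanged`, and skip the database write when nothing changed.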