chore: changelog — Crawlee queue wipe, ATGBICS fix, Optcore skip
This commit is contained in:
parent
6b39bb0930
commit
e1e390231f
@ -12,6 +12,8 @@ Types: FEAT · FIX · UI · DATA · AI · INFRA
|
|||||||
{"d":"2026-04-18","t":"FIX","m":"Crawlee withIsolatedStorage global env-var race condition eliminated: scheduler.ts removed all withIsolatedStorage() wrappers (were mutating process.env.CRAWLEE_STORAGE_DIR globally, causing concurrent scrapers to pick up wrong storage dirs). All Crawlee scrapers now use makeCrawleeConfig(name) instance-level Configuration. fs-com.ts migrated to fs-phase1/fs-phase2 storage names with rmSync cleanup before each phase. switch-assets-crawler.ts and switch-assets-playwright.ts now pass makeCrawleeConfig. Fixed: ATGBICS, community-issues, market-intel, FS.com."}
|
{"d":"2026-04-18","t":"FIX","m":"Crawlee withIsolatedStorage global env-var race condition eliminated: scheduler.ts removed all withIsolatedStorage() wrappers (were mutating process.env.CRAWLEE_STORAGE_DIR globally, causing concurrent scrapers to pick up wrong storage dirs). All Crawlee scrapers now use makeCrawleeConfig(name) instance-level Configuration. fs-com.ts migrated to fs-phase1/fs-phase2 storage names with rmSync cleanup before each phase. switch-assets-crawler.ts and switch-assets-playwright.ts now pass makeCrawleeConfig. Fixed: ATGBICS, community-issues, market-intel, FS.com."}
|
||||||
{"d":"2026-04-18","t":"FIX","m":"FS.com Mac launchd scraper (org.tip.fs-scraper): was failing with exit code 126 (Operation not permitted) because macOS TCC blocks launchd agents from accessing ~/Desktop/Claude Code/ path. Fixed: script moved to ~/.tip/run-fs-scraper-mac.sh, plist WorkingDirectory changed to ~/.tip. FS.com is now scraping from residential Mac IP again."}
|
{"d":"2026-04-18","t":"FIX","m":"FS.com Mac launchd scraper (org.tip.fs-scraper): was failing with exit code 126 (Operation not permitted) because macOS TCC blocks launchd agents from accessing ~/Desktop/Claude Code/ path. Fixed: script moved to ~/.tip/run-fs-scraper-mac.sh, plist WorkingDirectory changed to ~/.tip. FS.com is now scraping from residential Mac IP again."}
|
||||||
{"d":"2026-04-18","t":"FIX","m":"NADDOD stockLevel 'unknown' -> 'on_request': invalid value for price_observations_stock_level_check constraint — was causing all NADDOD price insertions to fail."}
|
{"d":"2026-04-18","t":"FIX","m":"NADDOD stockLevel 'unknown' -> 'on_request': invalid value for price_observations_stock_level_check constraint — was causing all NADDOD price insertions to fail."}
|
||||||
|
{"d":"2026-04-18","t":"FIX","m":"Crawlee makeCrawleeConfig: clear request_queues/default before each run — Crawlee FileSystemStorage marks URLs as HANDLED (state=4, orderNo=null) after processing. With purgeOnStart=false these entries persisted, so next crawler.run(startUrls) deduplicated all startUrls → requestsTotal=0 → immediate finish with 0 scraped pages. Fix: rmSync(request_queues/default) at start of makeCrawleeConfig(). Safe: session pool lives in key_value_stores/, not request_queues/. ATGBICS confirmed fixed: now scrapes 6 categories, 78 products, 33 unique with prices."}
|
||||||
|
{"d":"2026-04-18","t":"FIX","m":"Optcore scraper: add SKIP_OPTCORE_SCRAPER guard — optcore.net Cloudflare WAF blocks Erik IP (82.165.222.127). WP REST API returns 403/HTML block page, catch handler returns 0 URLs → 0 products every run. Set SKIP_OPTCORE_SCRAPER=true in ecosystem.config.js. Pattern mirrors SKIP_FS_SCRAPER. Residential IP (Mac launchd) required for Optcore."}
|
||||||
{"d":"2026-04-18","t":"FIX","m":"10Gtek scraper: was finding 152 products but 0 prices — 10gtek.com main site only shows technical spec tables (no prices). Rewrote scraper to target sfpcables.com (10Gtek's own retail store, same company) which exposes Magento product listings with Model: <part> + US$X.XX prices. Added Magento loop-detection via seen-part dedup (stops pagination when all products on a page were already seen). XFP title-after-pipe fallback for part number extraction. Removed QSFP-DD (not on sfpcables.com). Result: 50 products, 49 prices on first live run. Health monitor CRITICAL alert resolved."}
|
{"d":"2026-04-18","t":"FIX","m":"10Gtek scraper: was finding 152 products but 0 prices — 10gtek.com main site only shows technical spec tables (no prices). Rewrote scraper to target sfpcables.com (10Gtek's own retail store, same company) which exposes Magento product listings with Model: <part> + US$X.XX prices. Added Magento loop-detection via seen-part dedup (stops pagination when all products on a page were already seen). XFP title-after-pipe fallback for part number extraction. Removed QSFP-DD (not on sfpcables.com). Result: 50 products, 49 prices on first live run. Health monitor CRITICAL alert resolved."}
|
||||||
{"d":"2026-04-18","t":"FEAT","m":"Price Comparison Dashboard: public /api/price-comparison (summary, list top-50 SKUs by vendor coverage, per-SKU detail). Express Router, no auth required. New '💲 Price Comparison' dashboard tab with stat cards, form-factor breakdown table, top-50 SKU table (clickable rows), and SKU detail lookup with per-vendor prices + stock + spread %."}
|
{"d":"2026-04-18","t":"FEAT","m":"Price Comparison Dashboard: public /api/price-comparison (summary, list top-50 SKUs by vendor coverage, per-SKU detail). Express Router, no auth required. New '💲 Price Comparison' dashboard tab with stat cards, form-factor breakdown table, top-50 SKU table (clickable rows), and SKU detail lookup with per-vendor prices + stock + spread %."}
|
||||||
{"d":"2026-04-18","t":"DATA","m":"Eoptolink OEM catalog scraper: harvests 93 product-solution pages from eoptolink.com, extracts part numbers (EOLO-*/EOLQ-* format), seeds transceivers table as manufacturer=Eoptolink entries with form_factor/speed/fiber/category. No prices (B2B OEM). Scheduled every 4h (40 */4 * * *)."}
|
{"d":"2026-04-18","t":"DATA","m":"Eoptolink OEM catalog scraper: harvests 93 product-solution pages from eoptolink.com, extracts part numbers (EOLO-*/EOLQ-* format), seeds transceivers table as manufacturer=Eoptolink entries with form_factor/speed/fiber/category. No prices (B2B OEM). Scheduled every 4h (40 */4 * * *)."}
|
||||||
|
|||||||
Loading…
x
Reference in New Issue
Block a user