Rene Fichtmueller
2be61f2441
feat: close TIP retail price research states
2026-05-10 01:42:24 +02:00
Rene Fichtmueller
c2421c03a3
fix: harden shopfiber24 reach parsing
2026-05-09 17:24:06 +02:00
Rene Fichtmueller
a8529d166b
fix: resolve TS build errors — export backfillImages, add writeRobotExperience
...
- backfill-images.ts: rename main() → export backfillImages() to match index.ts import
- training-data-writer.ts: add writeRobotExperience export; remove hardcoded Gitea token
- fiber24.ts/fibermall.ts: scraper improvements from previous sessions
- image-downloader.ts/spec-updater.ts: utility updates
- robots/: add verification robots module
2026-05-06 23:39:00 +02:00
Rene Fichtmueller
785a6731ab
fix: fiber24 stockLevel on_request (was unknown — violated DB constraint)
2026-04-18 22:26:45 +02:00
Rene Fichtmueller
d4ad9f4641
fix: ShopFiber24 sitemap-based scraping + Fibermall image extraction
...
ShopFiber24 (fiber24.ts):
- Complete rewrite: was using JS-rendered catalog (all prices = 0)
- New strategy: fetch sitemap_0.xml.gz → 310 product DE-URLs
- Each product page has Schema.org microdata: itemprop=price, sku, image
- Extracts: price (minPrice), SKU, image_url, name, specs
- Rate: 1 req/1.5s, no Playwright needed
FiberMall (fibermall.ts):
- Add imageUrl to Product interface
- Extract first fibermall.com/photo/*.jpg from product listing card
- Write image_url to transceivers table (has_image=true) on upsert
- SKU variants share parent product image
- 304 FiberMall transceivers will get images on next scraper run
2026-04-18 22:20:57 +02:00
Rene Fichtmueller
2e852e0a2f
fix(scrapers): replace bot User-Agents with Chrome UA + disable dead domain
...
- 16 commercial scrapers: replace TIP-Bot/1.0 with Chrome/120 UA
(GBICS confirmed returning 0 bytes for bot UA, Chrome UA returns 200KB)
- gbics.ts: fix User-Agent (was returning empty HTML, now returns products)
- optictransceiver.ts: disable — domain repurposed as plant shop (2026-04-06)
Alocasia Regal Shield is not a transceiver.
2026-04-06 02:17:50 +02:00
Rene Fichtmueller
072978f1a4
feat: 24/7 scraping fleet — 8 new vendors + continuous schedule + Pi setup
...
New scrapers (8):
- BlueOptics (EUR, every 4h)
- ShopFiber24 (EUR, every 4h)
- T&S Communication (USD, every 4h)
- SmartOptics (catalog, every 8h)
- HUBER+SUHNER (catalog, every 8h)
- Skylane Optics (USD, every 4h)
- AscentOptics (USD, every 4h)
- GAO Tek (USD, every 4h)
Scheduler: nightly window → 24/7 continuous (42 jobs total)
- Playwright scrapers: every 8h (FS.com, 10Gtek, ATGBICS, ProLabs)
- Fetch/Cheerio: every 4h (11 lightweight vendors)
- Flexoptix catalog: every 2h (primary price source)
- eBay enrichment: every 6h
- Compatibility matrices: every 12h
- Compute jobs: every 4h
Pi fleet: scripts/pi-scraper-setup.sh for one-command Pi node setup
2026-04-02 01:09:05 +02:00