Rene Fichtmueller
ec40a96ae0
feat: add vendor detail verifiers
2026-05-09 18:22:09 +02:00
Rene Fichtmueller
91a1c2282a
fix: harden atgbics evidence parsing
2026-05-09 17:30:08 +02:00
Rene Fichtmueller
c2421c03a3
fix: harden shopfiber24 reach parsing
2026-05-09 17:24:06 +02:00
Rene Fichtmueller
bb9c495497
fix: verify qsfptek cable details
2026-05-09 17:03:35 +02:00
Rene Fichtmueller
fc18b00157
fix: verify copper cable semantics
2026-05-09 16:55:50 +02:00
Rene Fichtmueller
c25300199a
fix: harden atgbics wavelength semantics
2026-05-09 16:41:18 +02:00
Rene Fichtmueller
b26696f0d1
fix: improve vendor verification and fscom 1.6t variants
2026-05-09 15:56:08 +02:00
Rene Fichtmueller
60531b6250
feat: add crawlee python worker integration
2026-05-09 14:06:34 +02:00
Rene Fichtmueller
3d79f6b8e0
fix: add fscom url discovery mode
2026-05-09 14:00:30 +02:00
Rene Fichtmueller
f64dbf7b6b
fix: add fscom targeted detail verification mode
2026-05-09 11:15:36 +02:00
Rene Fichtmueller
549b4430df
fix: enrich flexoptix detail verification
2026-05-09 09:36:28 +02:00
Rene Fichtmueller
5522bb2152
fix: refresh price verification timestamps
2026-05-09 08:13:39 +02:00
Rene Fichtmueller
43b7250180
fix: automate equivalence research review queue
2026-05-09 07:48:11 +02:00
Rene Fichtmueller
ef225c7dc5
fix: revalidate flexoptix fs prices and images
2026-05-09 05:13:37 +02:00
Rene Fichtmueller
57e20efe49
fix: NADDOD price extraction — read from LD+JSON offers.price
...
NADDOD uses LD+JSON for pricing (Astro/Shopify structure):
{"offers":{"price":"731.00","priceCurrency":"USD",...}}
Old regex (/US$\s*.../) never matched, so the 132 existing price observations
were lucky text matches rather than systematic extraction. Now: parse all
ld+json blocks first, then fall back to the regex.
Also broaden sitemap URL regex to capture new-style URLs without .html:
/products/nvidia-networking/102612 (was being missed)
2026-05-06 23:55:55 +02:00
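The LD+JSON-first strategy described in 57e20efe49 can be sketched roughly as follows. This is a minimal illustration, not the scraper's actual code: the function names, the `PriceHit` type, the default currency, and the shape of the fallback regex are all assumptions; only the `offers.price` / `offers.priceCurrency` layout comes from the commit message.

```typescript
// Hypothetical result type; `source` records which path produced the price.
interface PriceHit {
  price: number;
  currency: string;
  source: "ld+json" | "regex";
}

// Collect the raw contents of every <script type="application/ld+json"> block.
function extractLdJsonBlocks(html: string): string[] {
  const re = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const blocks: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    blocks.push(m[1]);
  }
  return blocks;
}

function extractPrice(html: string): PriceHit | null {
  // 1) Structured data first: look for offers.price in each LD+JSON block.
  for (const raw of extractLdJsonBlocks(html)) {
    try {
      const data = JSON.parse(raw);
      const nodes = Array.isArray(data) ? data : [data];
      for (const node of nodes) {
        const offers = Array.isArray(node?.offers) ? node.offers[0] : node?.offers;
        const price = Number.parseFloat(String(offers?.price ?? ""));
        if (Number.isFinite(price)) {
          // Defaulting to USD when priceCurrency is absent is an assumption.
          return { price, currency: String(offers?.priceCurrency ?? "USD"), source: "ld+json" };
        }
      }
    } catch {
      // Malformed JSON in one block: ignore it and try the next block.
    }
  }
  // 2) Regex fallback on visible text (assumed pattern, with the "$" escaped).
  const m = /US\$\s*([\d,]+(?:\.\d{1,2})?)/.exec(html);
  if (m) {
    const price = Number.parseFloat(m[1].replace(/,/g, ""));
    if (Number.isFinite(price)) return { price, currency: "USD", source: "regex" };
  }
  return null;
}
```

Recording the `source` makes it possible to tell systematic extractions apart from text matches later on.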
Rene Fichtmueller
1a7c928120
fix: FS.COM price extraction — use .no_tax/.price CSS selectors
...
FS.com changed their HTML structure; compound class names are gone.
Current layout (verified 2026-05-06):
<div class="no_tax">5,10 € ohne MwSt.</div> ← B2B net price (preferred)
<div class="price">6,07 €</div> ← gross fallback
<div class="standard_price">6,07 €</div> ← gross fallback
Old selectors ([class*='price-value'] etc.) matched nothing → all prices
stored as €? null. New .no_tax first gives us the correct net/B2B price.
2026-05-06 23:45:30 +02:00
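The selector priority from 1a7c928120 (net price first, gross prices as fallback) might look like the sketch below. It uses plain regexes on exact class names so that it stays self-contained; a real scraper would more likely use a DOM or CSS-selector library. `extractFsPrice`, `parseEuroAmount`, and `textOfClass` are hypothetical names, and the number parsing assumes the German-style formatting shown in the commit ("5,10 €").

```typescript
// Priority order taken from the commit message: net price first, then gross.
const SELECTOR_ORDER = ["no_tax", "price", "standard_price"] as const;

// Parse "5,10 €" or "1.234,56 €" (decimal comma, dot as thousands separator).
function parseEuroAmount(text: string): number | null {
  const m = /(\d[\d.]*)(?:,(\d{1,2}))?\s*€/.exec(text);
  if (!m) return null;
  const whole = m[1].replace(/\./g, "");
  const value = Number.parseFloat(`${whole}.${m[2] ?? "0"}`);
  return Number.isFinite(value) ? value : null;
}

// Return the text of the first element whose class attribute is exactly `cls`.
function textOfClass(html: string, cls: string): string | null {
  const re = new RegExp(`<[^>]*\\bclass=["']${cls}["'][^>]*>([^<]*)<`, "i");
  const m = re.exec(html);
  return m ? m[1].trim() : null;
}

function extractFsPrice(html: string): { amount: number; net: boolean } | null {
  for (const cls of SELECTOR_ORDER) {
    const text = textOfClass(html, cls);
    if (text === null) continue;
    const amount = parseEuroAmount(text);
    // Only the .no_tax element carries the B2B net price.
    if (amount !== null) return { amount, net: cls === "no_tax" };
  }
  return null;
}
```

Returning `null` instead of a placeholder value keeps a missing price distinguishable from a parsed one.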
Rene Fichtmueller
a1a525b332
chore: sync API routes, dashboard hot-topics, MCP server, scraper package, scripts
2026-05-06 23:39:04 +02:00
Rene Fichtmueller
a8529d166b
fix: resolve TS build errors — export backfillImages, add writeRobotExperience
...
- backfill-images.ts: rename main() → export backfillImages() to match index.ts import
- training-data-writer.ts: add writeRobotExperience export; remove hardcoded Gitea token
- fiber24.ts/fibermall.ts: scraper improvements from previous sessions
- image-downloader.ts/spec-updater.ts: utility updates
- robots/: add verification robots module
2026-05-06 23:39:00 +02:00
Rene Fichtmueller
5a77fce9f3
feat: NADDOD cursor rotation — covers all 7300+ URLs across ~13 runs (26h)
...
Previously always sliced first 600 URLs from sitemap, missing 6700+ products.
Now stores offset in naddod-cursor.json, advances by 600 per run with wrap-around.
Full sitemap coverage in ~13 runs (26h). Also adds TIP_STORAGE_DIR env support.
2026-05-06 23:26:58 +02:00
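A rough sketch of the offset rotation in 5a77fce9f3, under stated assumptions: `takeWindow` and `CursorWindow` are hypothetical names, the offset is treated as a non-negative integer, and persisting it to naddod-cursor.json is left out. Only the batch size of 600 and the wrap-around behaviour come from the commit message.

```typescript
interface CursorWindow {
  batch: string[];    // URLs to scrape in this run
  nextOffset: number; // value to persist for the next run
}

function takeWindow(urls: string[], offset: number, batchSize = 600): CursorWindow {
  if (urls.length === 0) return { batch: [], nextOffset: 0 };
  // Guard against a stored offset that is past the end of a shrunken sitemap.
  const start = offset % urls.length;
  const end = start + batchSize;
  const batch = urls.slice(start, end);
  // Wrap around to the beginning once the end of the list is reached.
  const nextOffset = end >= urls.length ? 0 : end;
  return { batch, nextOffset };
}
```

With 7300 URLs and a batch of 600, this reaches the end of the list on the 13th run, and that last run covers the remaining 100 URLs.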
Rene Fichtmueller
efb0c24a19
feat: rewrite ATGBICS scraper to use Shopify products.json API
...
Static HTML collection pages return wrong results (all redirect to the same 9 products).
Switch to /collections/{handle}/products.json?limit=250&page=N API which is:
- Reliable JSON (no HTML parsing)
- Correct per-collection product lists
- Clean pagination (stop at < limit results)
- Covers 11 key transceiver collections (1G, 10G, 25G, 40G, 100G, 400G)
2026-05-06 23:17:46 +02:00
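The pagination rule from efb0c24a19 (stop when a page returns fewer than `limit` products) can be sketched as below. The fetcher is injected so the loop is testable without network access; `fetchCollection`, `pageUrl`, `PageFetcher`, and the `maxPages` safety cap are assumptions for illustration, and only the URL pattern is taken from the commit message.

```typescript
interface ShopifyProduct {
  handle: string;
  title: string;
}

// Injected fetcher: given a URL, resolve to the parsed products.json payload.
type PageFetcher = (url: string) => Promise<{ products: ShopifyProduct[] }>;

function pageUrl(baseUrl: string, handle: string, limit: number, page: number): string {
  return `${baseUrl}/collections/${handle}/products.json?limit=${limit}&page=${page}`;
}

async function fetchCollection(
  baseUrl: string,
  handle: string,
  fetchPage: PageFetcher,
  limit = 250,
  maxPages = 100, // hypothetical safety cap against endless loops
): Promise<ShopifyProduct[]> {
  const all: ShopifyProduct[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const { products } = await fetchPage(pageUrl(baseUrl, handle, limit, page));
    all.push(...products);
    // A short page means this was the last page of the collection.
    if (products.length < limit) break;
  }
  return all;
}
```

In production the fetcher would wrap an HTTP client; in tests it can return canned pages.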
Rene Fichtmueller
5c882c3a46
fix: refresh stale price observations after 7 days + fix ATGBICS pagination wrap-around
...
- upsertPriceObservation: insert new observation if last one is >7 days old,
even when price (content_hash) hasn't changed — keeps timeseries data fresh
- ATGBICS: detect Shopify catalog wrap-around by tracking per-category seen URLs;
stop pagination when all products on a page were already seen in a prior page
- ATGBICS: improve hasNextPage to match &page=N anchored in href params
2026-05-06 23:11:15 +02:00
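The two rules in 5c882c3a46 can be expressed as small pure functions. This is a sketch under assumptions: `shouldInsertObservation`, `isWrapAroundPage`, and the `Observation` shape are hypothetical, and the real upsertPriceObservation presumably does this inside a database query.

```typescript
const MAX_AGE_MS = 7 * 24 * 60 * 60 * 1000; // 7 days, per the commit message

interface Observation {
  contentHash: string;
  observedAt: Date;
}

// Insert when there is no prior observation, when the content changed,
// or when the last observation is older than 7 days.
function shouldInsertObservation(last: Observation | null, newHash: string, now: Date): boolean {
  if (last === null) return true;
  if (last.contentHash !== newHash) return true;
  return now.getTime() - last.observedAt.getTime() > MAX_AGE_MS;
}

// Returns true when every URL on the page was already seen on an earlier page
// of the same category, which signals that the catalog has wrapped around.
// The page's URLs are added to `seen` either way.
function isWrapAroundPage(pageUrls: string[], seen: Set<string>): boolean {
  const wrapped = pageUrls.length > 0 && pageUrls.every((u) => seen.has(u));
  for (const u of pageUrls) seen.add(u);
  return wrapped;
}
```

A caller would keep one `seen` set per category and stop paginating as soon as `isWrapAroundPage` returns true.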
Rene Fichtmueller
199f36be48
fix(scraper): auto-create pg-boss queues before scheduling + worker/schedule order
...
- scheduler: patch boss.schedule() to call createQueue() first (idempotent),
fixing FK constraint errors after DB reset — no need to touch 277 call sites
- index: registerWorkers() before registerSchedules() since boss.work() must
register handlers before schedules fire
- dashboard: fix switchBlogLlm() to use api() helper (adds Bearer auth token)
instead of raw fetch() which was returning 401 Unauthorized
2026-04-29 16:14:25 +02:00
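The scheduler patch in 199f36be48 wraps schedule() once instead of editing every call site. The sketch below shows the idea against a minimal stand-in interface: `BossLike` and `patchScheduleToCreateQueue` are hypothetical names, and the interface only mirrors the two pg-boss methods named in the commit rather than the library's full typings.

```typescript
// Minimal stand-in for the parts of the queue client used here (assumption).
interface BossLike {
  createQueue(name: string): Promise<void>;
  schedule(name: string, cron: string, data?: object, options?: object): Promise<void>;
}

// Replace boss.schedule with a wrapper that creates the queue first.
// This relies on createQueue being idempotent, as the commit message states.
function patchScheduleToCreateQueue(boss: BossLike): void {
  const originalSchedule = boss.schedule.bind(boss);
  boss.schedule = async (name, cron, data, options) => {
    await boss.createQueue(name);
    return originalSchedule(name, cron, data, options);
  };
}
```

Existing calls such as `boss.schedule("scrape:catalog:juniper-oem", "15 4 * * *")` then pick up the queue creation without being modified.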
Rene Fichtmueller
39a63e0401
fix(scheduler): vendor discovery crawlers daily 24/7 (not weekly)
2026-04-28 23:59:00 +02:00
Rene Fichtmueller
297dc46f2b
feat(crawler-llm): intelligent vendor discovery pipeline + TIPLLM training data
...
- spec-validator.ts: physical plausibility checks (form factor↔speed matrix,
wavelength↔fiber consistency, IEEE standard cross-check, reach limits).
Outputs tier (high/medium/low/rejected) + confidence_delta for LLM scores.
- training-data-writer.ts: converts validated crawler extractions to SFT JSONL
training pairs (spec_qa / crawl_reasoning / validation / discovery types).
Auto-commits and pushes to Gitea tip-training-data repo in batches of 50.
- vendor-discovery-crawler.ts: PlaywrightCrawler pipeline — catalog URL →
LLM extraction (scrapeWithLLM) → spec validation → DB persist +
Gitea SFT training pairs. 8 vendor configs registered
(Cisco/Juniper/Arista/FS.com/Flexoptix/Nokia/Huawei/II-VI).
- scheduler.ts: 8 weekly discover:vendor:* jobs added (Sun 20:00–Mon 10:00 UTC).
Total registered jobs: 102.
- Gitea repo created: gitea.context-x.org/rene/tip-training-data
2026-04-28 23:46:34 +02:00
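To illustrate the kind of plausibility check spec-validator.ts is described as performing in 297dc46f2b, here is a deliberately small sketch. Everything in it is an assumption: the `Spec` and `Verdict` shapes, the per-form-factor speed ceilings (simplified nominal Ethernet rates), the tier rules, and the confidence deltas are illustrative placeholders, not the project's actual matrix or scoring.

```typescript
type Tier = "high" | "medium" | "low" | "rejected";

interface Spec {
  formFactor: string;
  speedGbps: number;
  wavelengthNm?: number;
  fiberType?: "MMF" | "SMF";
}

interface Verdict {
  tier: Tier;
  confidenceDelta: number;
  issues: string[];
}

// Illustrative, simplified ceilings per form factor (assumption).
const MAX_SPEED_GBPS: Record<string, number> = {
  "SFP": 1, "SFP+": 10, "SFP28": 25, "QSFP+": 40,
  "QSFP28": 100, "QSFP-DD": 400, "OSFP": 800,
};

function validateSpec(spec: Spec): Verdict {
  const issues: string[] = [];
  const max: number | undefined = MAX_SPEED_GBPS[spec.formFactor];
  const overSpeed = max !== undefined && spec.speedGbps > max;

  if (max === undefined) issues.push(`unknown form factor ${spec.formFactor}`);
  if (overSpeed) issues.push(`${spec.speedGbps}G exceeds ${spec.formFactor} ceiling of ${max}G`);
  // 850 nm optics are normally paired with multimode fiber.
  if (spec.wavelengthNm === 850 && spec.fiberType === "SMF") {
    issues.push("850 nm with single-mode fiber is implausible");
  }

  // Hypothetical scoring: physically impossible speed is a hard reject.
  if (overSpeed) return { tier: "rejected", confidenceDelta: -1, issues };
  if (issues.length === 0) return { tier: "high", confidenceDelta: 0.1, issues };
  return { tier: issues.length === 1 ? "medium" : "low", confidenceDelta: -0.1 * issues.length, issues };
}
```

The returned `confidenceDelta` would then be added to the LLM's own score before a record is persisted.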
Rene Fichtmueller
2466cc5d82
feat(scraper): batch 37 OEM seeds — Extreme (Legacy), Nortel, 3Com, Avaya
...
Added 4 legacy OEM transceiver catalog seed scrapers (72 PIDs total):
- extreme-legacy-oem.ts: 18 PIDs — Summit/BlackDiamond 10052H/10318/10325 family, Legacy
- nortel-legacy-oem.ts: 18 PIDs — Passport/BayStack AA1419xxx + XFP, incl. GBIC, Legacy
- 3com-legacy-oem.ts: 18 PIDs — Switch 5500/7750 3C17770/3CSFP9x + XFP/GBIC, Legacy
- avaya-legacy-oem.ts: 18 PIDs — ERS/VSP AA1419xxx + 700480xxx QSFP28, Legacy
Scheduler: wired at 05:30/05:45/06:00/06:15 UTC. All 72 PIDs seeded clean.
2026-04-28 23:31:13 +02:00
Rene Fichtmueller
e684d3d1c3
feat(scraper): batch 36 OEM seeds — EnGenius, Palo Alto Networks, Brocade, Foundry Networks
...
Added 4 new OEM transceiver catalog seed scrapers (72 PIDs total):
- engenius-oem.ts: 18 PIDs — ECS switch series 1G–100G SFP/SFP+/SFP28/QSFP28 + DAC/AOC
- paloalto-networks-oem.ts: 18 PIDs — PA-3200/5200/7000/5450 NGFW SFP/SFP+/SFP28/QSFP28 + DAC
- brocade-legacy-oem.ts: 18 PIDs — ICX/FCX/VDX/MLX E1MG/10G-SFPP family, market_status=Legacy
- foundry-networks-oem.ts: 18 PIDs — FastIron/NetIron FDR- series incl. XFP, market_status=Legacy
Scheduler: wired at 04:30/04:45/05:00/05:15 UTC. All 72 PIDs seeded clean.
2026-04-28 23:28:10 +02:00
Rene Fichtmueller
e9b8cb95db
feat(scraper): batch 35 OEM seeds — Sierra Wireless, Senao, EMCORE, Reflex Photonics
...
Added 4 new OEM transceiver catalog seed scrapers (75 PIDs total):
- sierra-wireless-oem.ts: 18 PIDs — RV55/RV50X/LX60 SFP/SFP+/QSFP+ incl. Industrial -40~85°C
- senao-oem.ts: 20 PIDs — EnGenius ECS switches 1G–100G SFP/SFP+/SFP28/QSFP28 + DAC
- emcore-oem.ts: 20 PIDs — ORION coherent ZR/ZR+/CFP2-DCO 400G + MIL-grade avionics
- reflex-photonics-oem.ts: 17 PIDs — LightABLE MIL-STD-810H + RAD-HARD space-grade
Scheduler: wired at 03:30/03:45/04:00/04:15 UTC. All 75 PIDs seeded to TIP DB.
2026-04-28 23:24:53 +02:00
Rene Fichtmueller
32d3ded169
feat: add Finisar, Acacia, Inphi OEM scrapers (batch 34)
...
- finisar-oem: 17 PIDs (FTLX/FTLC historical BoM series, 1G-100G, widely referenced)
- acacia-oem: 14 PIDs (AC400/AC1200 coherent CFP2-DCO/QSFP-DD/OSFP up to 1.2T)
- inphi-oem: 13 PIDs (ColorZ/COLORZ-II DWDM QSFP28/QSFP-DD + 800G OSFP)
- scheduler: wired all 3 at 02:45/03:00/03:15 UTC
2026-04-28 23:14:06 +02:00
Rene Fichtmueller
1023b24fd0
feat: add Black Box, Radiflow, DragonWave, Teledyne LeCroy OEM scrapers (batch 33)
...
- black-box-oem: 19 PIDs (enterprise LAN SFP/SFP+/SFP28/QSFP28 + BiDi + DAC)
- radiflow-oem: 17 PIDs (OT/ICS security, 100M-100G incl. substation BiDi, category=Industrial)
- dragonwave-oem: 17 PIDs (microwave backhaul fiber uplinks 100M-100G, market_status=Legacy)
- teledyne-lecroy-oem: 18 PIDs (T&M oscilloscopes/analyzers SFP+-QSFP-DD up to 400G ZR)
- scheduler: wired all 4 at 01:45/02:00/02:15/02:30 UTC
2026-04-28 23:07:26 +02:00
Rene Fichtmueller
7f59f445b6
feat: add Cambium Networks, Tektronix, Clearfield, Lanner OEM scrapers (batch 32)
...
- cambium-networks-oem: 18 PIDs (cnMatrix/PTP820 1G-100G + BiDi + DAC)
- tektronix-oem: 19 PIDs (T&M SFP/SFP+/SFP28/QSFP28/QSFP-DD up to 400G ZR coherent)
- clearfield-oem: 16 PIDs (FTTP/FTTx GPON/XGS-PON OLT+ONT + 1G-100G backhaul, heavy Telecom)
- lanner-oem: 20 PIDs (NFVI/uCPE 1G-100G + BiDi + DAC stack)
- scheduler: wired all 4 at 00:45/01:00/01:15/01:30 UTC
2026-04-28 23:04:08 +02:00
Rene Fichtmueller
22788db26b
feat: add Rohde & Schwarz, L3Harris, Zhone OEM scrapers (batch 31)
...
- rohde-schwarz-oem: 19 PIDs (T&M optical modules, SFP/SFP+/SFP28/QSFP28/QSFP-DD up to 400G ZR coherent)
- l3harris-oem: 18 PIDs (MIL-grade ruggedized SFP/SFP+/SFP28/QSFP+/QSFP28, category=Industrial)
- zhone-oem: 18 PIDs (GPON/XGS-PON/EPON OLT+ONT plus 1G-100G uplinks, heavy Telecom set)
- scheduler: wired all 3 at 00:00/00:15/00:30 UTC with workers
2026-04-28 22:57:23 +02:00
Rene Fichtmueller
ab6888fec8
feat: add OEM seed scrapers batch 29-30 (8 vendors, 147 PIDs)
...
Adds scrapers for:
- AudioCodes (12 PIDs) — SBC/media gateway transceivers
- Anritsu (19 PIDs) — T&M platform optical modules
- NETSCOUT (19 PIDs) — nGenius probe + InfiniStream optics
- Curtiss-Wright (19 PIDs) — MIL-grade ruggedized transceivers
- ECI Telecom (18 PIDs) — DWDM/OTN/SONET carrier platform
- UTStarcom (17 PIDs) — GPON/XGS-PON/EPON broadband access
- Turbolink (23 PIDs) — Taiwanese OEM transceiver manufacturer
- Chelsio (20 PIDs) — iWARP RDMA NIC optical modules
Scheduler: 8 new cron slots 22:00-23:45 UTC daily.
DB: 12,937 → 13,084 transceivers, 181 → 189 vendors.
2026-04-27 00:44:18 +02:00
Rene Fichtmueller
d7144731e0
feat(scraper): add 100+ OEM seed scrapers + tip-llm-guided inference layer
...
New OEM transceiver seed scrapers (94 cron-scheduled, 24/7):
- Media/Broadcast: Evertz, Grass Valley, Haivision, Viasat
- Asian Optical: FiberHome, Oplink, Accelink, Hisense Broadband
- Optical Mfrs: Lumentum, II-VI/Coherent, Source Photonics, O-Net,
InnoLight, AOI, Sumitomo Electric, NeoPhotonics
- Industrial: GE Grid, Schweitzer, Moxa Industrial, Cisco IE,
Phoenix Contact, Beckhoff, Omron, ABB, Siemens, Schneider, Rockwell, Belden
- Enterprise/DC: Arista, Pica8, Pluribus, DriveNets, Cisco (Meraki/Catalyst/Nexus/ASR)
- Cloud: AWS, Azure, Google Cloud, Meta
- Storage: NetApp, Pure Storage, HPE Storage, IBM Storage, Dell Storage, Hitachi Vantara
- 5G/RAN: Samsung Networks, Nokia AirScale, Ericsson RAN, Mavenir
- Security: Check Point, Barracuda, Fortinet, Palo Alto
- Telecom Optical: ADVA, PacketLight, FiberHome, Accelink, Hisense
API: tip-llm-guided inference layer (strict schema + repair-retry + safe fallback)
- POST /api/tip-llm/infer|research-plan|extract|finding|health
- Hard JSON schema enforcement, create_finding=false on empty evidence
- Confidence gate (>= 0.4), validation with consistency check
Build: added incremental=true to scraper tsconfig (OOM prevention)
Scheduler: 87 → 94 registered workers
2026-04-27 00:00:14 +02:00
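The inference-layer guarantees listed in d7144731e0 (strict schema, repair-retry, safe fallback, confidence gate at 0.4, no finding on empty evidence) could be combined roughly as follows. This is a sketch only: the `LlmResult` and `GatedResult` shapes, the `reason` strings, and `inferWithRetry` are hypothetical, and the model call is modelled as a synchronous callback for brevity where a real implementation would be asynchronous.

```typescript
interface LlmResult {
  confidence: number;
  evidence: string[];
}

interface GatedResult {
  create_finding: boolean;
  confidence: number;
  reason: "ok" | "empty_evidence" | "low_confidence" | "schema_invalid";
}

const MIN_CONFIDENCE = 0.4; // threshold from the commit message

// Strict parse: anything that does not match the expected shape yields null.
function parseStrict(text: string): LlmResult | null {
  try {
    const v = JSON.parse(text);
    if (typeof v !== "object" || v === null) return null;
    if (typeof v.confidence !== "number" || v.confidence < 0 || v.confidence > 1) return null;
    if (!Array.isArray(v.evidence) || !v.evidence.every((e: unknown) => typeof e === "string")) return null;
    return { confidence: v.confidence, evidence: v.evidence };
  } catch {
    return null;
  }
}

function inferWithRetry(generate: (attempt: number) => string, maxAttempts = 2): GatedResult {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const parsed = parseStrict(generate(attempt));
    if (parsed === null) continue; // repair-retry: ask the model again
    if (parsed.evidence.length === 0) {
      return { create_finding: false, confidence: parsed.confidence, reason: "empty_evidence" };
    }
    if (parsed.confidence < MIN_CONFIDENCE) {
      return { create_finding: false, confidence: parsed.confidence, reason: "low_confidence" };
    }
    return { create_finding: true, confidence: parsed.confidence, reason: "ok" };
  }
  // Safe fallback: never create a finding from output that failed validation.
  return { create_finding: false, confidence: 0, reason: "schema_invalid" };
}
```

The `attempt` argument lets the caller send a repair prompt on the second try instead of repeating the original one.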
Rene Fichtmueller
4479429b29
feat: Brocade/RUCKUS ICX OEM seed (E1MG/E10G/E40G/E100G series, 29 PIDs)
2026-04-26 20:15:54 +02:00
Rene Fichtmueller
ad295a2b4b
feat: NVIDIA/Mellanox OEM seed (Spectrum + LinkX portfolio)
...
36 PIDs: MFM/MMA/MCP LinkX series covering SFP/SFP+/SFP28/QSFP+/
QSFP28/QSFP56/QSFP-DD/OSFP + LinkX DAC 25G-400G. Includes
SN2000-SN5600 Spectrum switch transceivers. Scheduler: 05:45 daily.
2026-04-26 19:20:12 +02:00
Rene Fichtmueller
b3a2eff776
feat: Dell EMC + Extreme Networks OEM transceiver seeds
...
Dell EMC (34 PIDs): PowerSwitch OS10 + legacy Force10 naming,
1G-400G QSFP-DD + DAC. Maps to Dell Technologies vendor.
Extreme Networks (33 PIDs): Summit/ExtremeSwitching 10xxH part
numbers, 1G-400G QSFP-DD + DAC. Scheduler: 05:15 + 05:30 daily.
2026-04-26 19:18:31 +02:00
Rene Fichtmueller
1c1b8e1e9d
feat: Huawei OEM transceiver seed + scheduler
...
Add 43 Huawei OEM PIDs covering CloudEngine/NetEngine platform:
SFP-GE/SFP+-10G/SFP28-25G/QSFP+-40G/QSFP28-100G/QSFP56-200G/
QSFP-DD-400G/OSFP-800G + DAC. Includes BOM alias codes.
Scheduler: daily 05:00.
2026-04-26 19:16:16 +02:00
Rene Fichtmueller
b4c8b9b625
feat: Nokia/Alcatel-Lucent OEM seed + scheduler
...
Add 41 Nokia OEM transceiver PIDs: SFP-1G/SFP-10G/SFP-25G/QSFP+-40G/
QSFP28-100G/QSFPDD-200G+400G + DWDM ZR/ZR+ + DAC. Includes legacy
3HExxxxxxxx alternate part numbers in notes field.
Scheduler: daily 04:45.
2026-04-26 19:14:27 +02:00
Rene Fichtmueller
51cee266f5
feat: HPE/Aruba OEM seed + Cisco TMG upsert fix
...
Add 43 HPE/Aruba OEM transceiver PIDs (J/JL/JH/R series — 1G through
400G QSFP-DD + DAC/AOC). Scheduler: daily 04:30.
Cisco TMG scraper: fixed market_status/temp_range constraint violations,
switched to always-upsert pattern. Result: 423 switches, 22476 Cisco
OEM transceivers, 22476 compat entries written to DB.
Update CHANGELOG_PENDING with all session data changes.
2026-04-26 19:12:27 +02:00
Rene Fichtmueller
c9a50ad551
feat: Juniper OEM seed scraper + BlueOptics HTTP/1.1 fix
...
Add 59 Juniper OEM transceiver PIDs (SFP/SFP+/SFP28/QSFP+/QSFP28/
QSFP56/QSFP-DD/OSFP + DAC/AOC) to seed the transceivers table.
Register scrape:catalog:juniper-oem in scheduler (daily 04:15).
Fix BlueOptics scraper: force HTTP/1.1 via Node.js https.get() to
bypass server bug where HTTP/2 returns empty response body. Also
update catalog path from /transceivers/ to /Transceivers_1.
2026-04-26 19:08:09 +02:00
Rene Fichtmueller
cc85d3d0f8
feat: Cisco OEM + Arista OEM transceiver catalog scrapers
...
- cisco-tmg.ts: upsert Cisco OEM transceivers from TMG API instead of
SELECT-only. Parsers for formFactor/speed/reach/fiberType/tempRange.
Fixes market_status ('EOL') + temp_range ('COM'/'IND') check constraints.
- arista-oem.ts: seed scraper for 69 Arista OEM PIDs (1G→800G,
SFP/SFP28/QSFP+/QSFP28/QSFP-DD/OSFP/QSFP-DD800) with full specs.
- scheduler.ts: daily arista-oem seed at 04:00 UTC
2026-04-26 19:00:21 +02:00
Rene Fichtmueller
ba998f4c01
fix: vendor_compat 0%→100%, price denorm, wiitek disabled, price-denorm scheduler
...
- Migration 094: images for 12 Cisco 8K MPA + A9K-8HG-FLEX + ASR-9000V models
- Migration 095: price denorm refresh (EUR 679→1376, USD 166→835 with 180d window)
- Migration 096: bulk vendor_compat by form_factor — all 9013 transceivers now
have OEM compatibility patterns (was 0/9013 because all slugs are scraped-*)
- wiitek.ts: disable dead scraper (wiitek.com unreachable since 2026-04, EAI_AGAIN)
- scheduler.ts: add compute:price-denorm job (daily 05:30 UTC) to keep
street_price_usd/price_verified_eur fresh without manual migration runs
- seed-from-npm.ts: ON CONFLICT now also updates vendor_compat (was only updated_at)
2026-04-25 08:55:21 +02:00
Rene Fichtmueller
bbc6f560dd
fix: add image filter patterns and direct URL migrations for 6 vendors
...
- switch-image-playwright.ts + switch-image-fetcher.ts: add filter patterns
for /webimage-404/ (Netgear 404 hero), /Brand/ + /cybersecurity.png/
(Moxa brand marketing images, not product photos)
- sql/047: Moxa 4/4 models — CDN getattachment paths (hotlink-protected,
Referer: moxa.com required; R2 proxy needed for production display)
- sql/048: UfiSpace 6/6 models — ufispace.com/image/<hash>/ direct PNGs;
Brocade G720+G730 — broadcom.com og:image; ICX 7850-48FS — CommScope/Ruckus
vistancenetworks.com ImageServer (rand param is cache-bust only, not auth)
- sql/049: NVIDIA SN-series 6/6 — docscontent.nvidia.com (SN2201/3700/4700)
and S3 direct (SN5400/5600); SN3750-SX via uvation reseller CDN
2026-04-21 07:57:55 +02:00
Rene Fichtmueller
b65e4452db
fix: add error-graphic, icon-library, illustration filters to GENERIC_IMAGE_PATTERNS
...
- /404[-_]error/i, /error[-_]graphic/i — Broadcom 404-ERROR-GRAPHIC.png
- /\/icon[-_]library\//i — D-Link navigation/icon-library path images
- /[-_]illustration[._]/i — Arista Cloud-Legacy_Illustration and similar diagrams
- Nokia banner, Huawei marketing, banners/ path patterns (Playwright scraper)
- Cookie consent patterns synced to switch-image-fetcher.ts (was only in Playwright)
2026-04-21 07:38:01 +02:00
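The four regexes quoted in b65e4452db can be exercised with a filter like the one below. The patterns are copied from the commit message; the array here contains only those four (the real GENERIC_IMAGE_PATTERNS list is longer), and the body of `isGenericImage` is an assumed implementation.

```typescript
// Subset of the filter list: only the patterns named in this commit.
const GENERIC_IMAGE_PATTERNS: RegExp[] = [
  /404[-_]error/i,          // e.g. Broadcom 404-ERROR-GRAPHIC.png
  /error[-_]graphic/i,
  /\/icon[-_]library\//i,   // D-Link navigation/icon-library paths
  /[-_]illustration[._]/i,  // e.g. Arista Cloud-Legacy_Illustration diagrams
];

// True when the URL looks like a placeholder, icon, or diagram rather than
// a product photo. The patterns have no "g" flag, so test() is stateless.
function isGenericImage(url: string): boolean {
  return GENERIC_IMAGE_PATTERNS.some((re) => re.test(url));
}
```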
Rene Fichtmueller
f4afe14af4
feat: add 12 new vendor URL builders to Playwright image scraper
...
- Nokia, Huawei, Ciena, Moxa, D-Link, Alcatel-Lucent Enterprise,
Asterfusion, Brocade: passthrough builders (use stored product_page_url)
- NVIDIA Networking: SN-series URL builder (sn5600 → /ethernet-switching/sn5600/)
- Netgear: lowercase model slug builder for /business/wired/switches/fully-managed/
- UfiSpace: hardcoded sitemap-verified URL map (all 6 S9xxx models)
- QCT: hardcoded URL map for T3048-LY8 and T7032-IX1
- Add Nokia banner / Huawei marketing image patterns to GENERIC_IMAGE_PATTERNS
2026-04-21 07:24:11 +02:00
Rene Fichtmueller
8f36eff956
fix(scraper): filter OneTrust/cookie-consent images + skip in img fallback
...
cdn.cookielaw.org logos appear as the largest DOM image on Dell/Extreme
product pages when the cookie consent overlay is present. Added to both
GENERIC_IMAGE_PATTERNS (isGenericImage filter) and img fallback skipPattern
so the next-largest actual product image can be found.
2026-04-21 06:45:41 +02:00
Rene Fichtmueller
d67fbe31da
fix(scraper): fall through to img fallback when og:image is generic/logo
...
Previously: if og:image existed (even as a Dell logo URL), page.evaluate() returned
early and the img fallback was never tried. Now: meta tags are extracted first, then
isGenericImage() is checked in Node.js, and the img fallback runs if meta image is null
or generic. This allows vendors like Dell (og:image = logo) to still get product images
via the DOM fallback.
2026-04-21 06:36:12 +02:00
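The control-flow change in d67fbe31da amounts to deciding on the Node.js side whether the meta image is usable before falling back. A minimal sketch, assuming a hypothetical `chooseProductImage` helper and a lazily evaluated fallback; the generic-image check is passed in as a predicate.

```typescript
function chooseProductImage(
  metaImage: string | null,
  fallbackImage: () => string | null, // evaluated only when needed
  isGeneric: (url: string) => boolean,
): string | null {
  // Use og:image (or a similar meta tag) only when it is not a logo/banner.
  if (metaImage !== null && !isGeneric(metaImage)) return metaImage;
  // Otherwise fall through to the DOM <img> fallback.
  const candidate = fallbackImage();
  return candidate !== null && !isGeneric(candidate) ? candidate : null;
}
```

With this ordering, a page whose og:image is a vendor logo still yields a product image when the DOM contains one.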
Rene Fichtmueller
09d3a60b7c
fix(scraper): fix Edgecore/Extreme URL builders, broaden img fallback, fix ENOENT
...
- buildEdgecoreUrl: /product/<slug>/ (WooCommerce, no .html) with EDGECORE_SLUG_MAP
for AS7712-32X→as7712-32x-ec, Minipack2→minipack-as8000-open-modular-platform
- buildFortinetUrl: returns null (all pages redirect to generic, no usable og:image)
- buildExtremeUrl: direct product URL (extremenetworks.com/product/<slug>)
- img fallback: remove strict 'product/switch/router/hardware' path requirement;
now takes largest image >=200x150px excluding flags/icons/spinners — isGenericImage()
filters hero/banner/logo afterward
- ENOENT fix: unique per-run Crawlee storage dir (timestamp suffix) prevents
stale request-queue file contamination between back-to-back vendor runs
2026-04-21 06:33:32 +02:00
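The broadened img fallback from 09d3a60b7c (largest image of at least 200x150 px, excluding flags, icons, and spinners) might be sketched like this. `ImgCandidate`, `pickLargestImage`, and the exact skip regex are assumptions; in the scraper this logic would run over elements collected from the page.

```typescript
interface ImgCandidate {
  src: string;
  width: number;
  height: number;
}

// Hypothetical skip pattern based on the exclusions named in the commit.
const SKIP_PATTERN = /flag|icon|spinner/i;

function pickLargestImage(candidates: ImgCandidate[]): ImgCandidate | null {
  let best: ImgCandidate | null = null;
  for (const c of candidates) {
    if (c.width < 200 || c.height < 150) continue; // minimum size threshold
    if (SKIP_PATTERN.test(c.src)) continue;        // flags, icons, spinners
    if (best === null || c.width * c.height > best.width * best.height) best = c;
  }
  return best;
}
```

The result would still pass through isGenericImage() afterwards, as the commit notes, to weed out hero, banner, and logo images.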
Rene Fichtmueller
87b9416592
fix(scraper): fix Arista series-level URL builder + bypass Crawlee URL deduplication
...
- buildAristaUrl() now extracts series prefix (7060X5-32QS → 7060x5-series)
instead of individual model URLs that lack og:image
- Strip trailing sub-variant 'A' so R3A → R3 series page
- Add uniqueKey: row.id to each request — prevents Crawlee from deduplicating
models that share the same series URL (e.g. 7060x5-series)
- For Arista: always prefer fresh builder URL over stored product_page_url
so stale individual-model URLs don't override correct series pages
2026-04-21 06:22:41 +02:00
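The series-prefix rule in 87b9416592 can be illustrated with a small slug helper. This is a sketch, not the real buildAristaUrl(): `aristaSeriesSlug` and `buildSeriesRequest` are hypothetical names, the base URL is passed in rather than guessed, and the trailing-'A' rule is implemented as "strip an 'A' that directly follows a digit", which is one possible reading of the commit message.

```typescript
// "7060X5-32QS" -> "7060x5-series"; a trailing sub-variant "A" is dropped,
// so a prefix ending in "R3A" maps to the "r3" series page.
function aristaSeriesSlug(model: string): string {
  const prefix = model.trim().split("-", 1)[0] ?? "";
  const series = prefix.replace(/(\d)A$/i, "$1");
  return `${series.toLowerCase()}-series`;
}

interface SwitchRow {
  id: string;
  model: string;
}

// Several models share one series URL, so each request gets the row id as
// its uniqueKey to keep the crawler from deduplicating them.
function buildSeriesRequest(baseUrl: string, row: SwitchRow): { url: string; uniqueKey: string } {
  return { url: `${baseUrl}/${aristaSeriesSlug(row.model)}`, uniqueKey: row.id };
}
```

Two rows for different 7060X5 models then produce the same `url` but distinct `uniqueKey` values, so both are processed.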
Rene Fichtmueller
18a9e1346e
feat: Playwright image scraper for bot-blocked vendors (Arista/Dell/Edgecore/Fortinet/Extreme)
2026-04-21 06:16:05 +02:00