FiberMall:
- Use the correct /store-XXXXX-name.htm category URLs (the old /c/xxx/ paths returned HTTP 404)
- Parser: split on new_proList_mainListLi and read the price from data-price on the
currency_price span; fixes the 0.00 false match from SKU variant items
- Also scrape SKU brand variant links from .sku_item divs
- Result: 3,410 prices now in DB (was 0)
Flexoptix:
- Fix extractPrice regex for EUR thousand-separator format
(2,921.60 EUR was parsed as 2 EUR)
- Add OSFP224 / 1.6T search queries (4 new, form factor was missing)
- Fix O.138HG2.C.05 stale price 3009.60→2921.60 EUR
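The thousand-separator fix can be sketched roughly as below. This is an illustration only: extractEurPrice is a hypothetical name, not the scraper's actual extractPrice, and it assumes English-style formatting ("," groups thousands, "." marks decimals).

```typescript
// Hypothetical helper; the real extractPrice in the Flexoptix scraper may differ.
// Assumes "," is a thousands separator and "." is the decimal point.
function extractEurPrice(text: string): number | null {
  // Either grouped digits ("2,921.60") or plain digits ("3009.60"), followed by EUR.
  const match = text.match(/(\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+(?:\.\d+)?)\s*EUR/);
  const raw = match ? match[1] : undefined;
  if (!raw) return null;
  // Strip the grouping commas before converting, so "2,921.60" is not cut off at "2".
  const value = Number(raw.replace(/,/g, ""));
  return Number.isFinite(value) ? value : null;
}
```

Stripping the commas before conversion is the key step: a parser that stops at the first non-digit reads "2,921.60" as 2.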
Schema: add competitor_verified and competitor_verified_at columns
via ALTER TABLE (they were referenced in code but missing from the DB)
CHANGELOG: added 6 entries for 2026-04-12
- New scrapers: fibermall.ts (WooCommerce), vcelink.ts (Shopify), opticsbay.ts (WooCommerce)
- QSFPTEK rewritten to use /mall/commodity/list API (old OpenCart /c/*.html paths now return 404)
- New: attribute-based filtering by data rate (1G/10G/25G/40G/100G/200G/400G/800G)
- Scrapes HTML fragments, extracts US$ prices and product URLs
- scheduler.ts: +3 queues/schedules/workers (fibermall, vcelink, opticsbay) → 61 total workers
- index-pi.ts: Pi fleet picks up all 3 new scrapers
Crawlee's SessionPool throws 'Could not find SDK_SESSION_POOL_STATE.json'
when initializing against a freshly-created isolated storage dir.
Setting CRAWLEE_PURGE_ON_START=1 tells Crawlee to start fresh instead
of trying to load non-existent session state — fixes FS.com and ATGBICS
crashes at the start of every 2h cycle after the dirs were cleaned up.
ProLabs uses a B2B quote model: prices require a reseller account and are
not shown publicly (schema.org markup always shows price=0.00). Fighting
CloudFront WAF with Firefox automation is pointless.
New approach:
- Sitemap-driven: downloads all 14 sitemaps to collect product URLs
- Fetch-based: curl-compatible HTTP requests bypass CloudFront TLS detection
- Catalog-only: writes part numbers + specs to transceivers table
- Rate-limited: 300ms between requests (~3 req/sec)
- No proxy needed: Pi nodes no longer consumed for ProLabs
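A minimal sketch of the sitemap-driven, rate-limited loop described above. All names here (extractSitemapUrls, crawlSitemapUrls, fetchPage) are hypothetical; the page fetcher is injected so the sketch makes no claim about which HTTP client the scraper uses.

```typescript
// Collect every <loc> entry from one sitemap XML document.
function extractSitemapUrls(xml: string): string[] {
  const found: string[] = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let m: RegExpExecArray | null;
  while ((m = locPattern.exec(xml)) !== null) {
    const url = m[1];
    if (url) found.push(url);
  }
  return found;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Fetch pages one at a time, pausing delayMs between requests
// (300 ms is the figure from the commit message, roughly 3 requests per second).
async function crawlSitemapUrls(
  urls: string[],
  fetchPage: (url: string) => Promise<string>, // assumption: caller supplies the fetcher
  delayMs: number = 300,
): Promise<Map<string, string>> {
  const pages = new Map<string, string>();
  for (let i = 0; i < urls.length; i++) {
    if (i > 0) await sleep(delayMs); // pause between requests, not before the first
    const url = urls[i];
    if (url === undefined) continue;
    pages.set(url, await fetchPage(url));
  }
  return pages;
}
```

Fetching sequentially keeps the request rate bounded by the delay alone, with no need for a concurrency limiter.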
Replace hard-coded purple/green colors with theme CSS variables.
Dark code blocks (#1e1e1e bg), orange accent for active borders/badges,
dark green for status text, amber for warnings — all readable on white.
Remove boss.work() registrations for lightweight fetch/cheerio scrapers
from Erik's scheduler. Pis are now the SOLE consumers of these queues:
fluxlight, gbics, optcore, champion-one, sfpcables, blueoptics, fiber24,
tscom, skylane, ascentoptics, gaotek, smartoptics, hubersuhner, news,
market-intel.
Shows the available models (fo-blog-v3-qwen7b / claude-sonnet-4-6 / qwen2.5:14b),
live status from /api/blog/llm/status, ratings, and config instructions,
and highlights which model is currently active.
Routes requests through CT130/131/132 proxy pool (192.168.178.77/76/74:1080)
when PROXY_URLS env var is set. Uses ProxyConfiguration from crawlee for
PlaywrightCrawler scrapers and socks-proxy-agent for fetch-based scrapers.
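A sketch of how the proxy pool might be read and rotated. The comma-separated format of PROXY_URLS, the socks5:// scheme, and the round-robin policy are assumptions, and both function names are hypothetical; the env value is passed in as an argument so the sketch stays independent of the runtime.

```typescript
// Assumption: PROXY_URLS is a comma-separated list such as
// "socks5://192.168.178.77:1080,socks5://192.168.178.76:1080".
function parseProxyUrls(raw: string | undefined): string[] {
  if (!raw) return []; // unset or empty: no proxying
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Returns a picker that cycles through the pool (round-robin is an assumed policy).
// With an empty pool it yields undefined, meaning "connect directly".
function makeProxyPicker(proxies: string[]): () => string | undefined {
  let next = 0;
  return () => {
    if (proxies.length === 0) return undefined;
    const proxy = proxies[next % proxies.length];
    next += 1;
    return proxy;
  };
}
```

The picked URL would then be handed to whichever proxy mechanism the scraper type uses.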
- Reset details_verified=false for 298 products where reach_label is empty (DB migration)
- Runtime check in dashboard: dVer requires non-empty reach_label regardless of DB flag
- comparable price query: treat reach_meters=0 the same as NULL so 800G OSFP products
find FS.com equivalent prices (previously the reach_meters=0 vs NULL comparison short-circuited the match)
- Product image area now fully clickable with vendor link overlay when product_page_url exists
- Clear wrong image for O.Czz8HG.z.R (was showing unrelated OSFP product image)
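The two reach-related rules above can be sketched as follows. The column names come from the commit message; the function names are hypothetical, and treating an unknown reach as "does not block a match" is an assumption about how the comparable price query handles NULL.

```typescript
interface ReachRow {
  reach_label: string | null;
  reach_meters: number | null;
  details_verified: boolean;
}

// reach_meters=0 carries no information, so fold it into NULL.
function normalizeReach(meters: number | null | undefined): number | null {
  return meters == null || meters === 0 ? null : meters;
}

// Assumption: an unknown reach on either side does not block the comparison.
function reachComparable(a: number | null, b: number | null): boolean {
  const ra = normalizeReach(a);
  const rb = normalizeReach(b);
  return ra === null || rb === null || ra === rb;
}

// Runtime check: the DB flag alone is not enough, reach_label must be non-empty.
function isDetailsVerified(row: ReachRow): boolean {
  return row.details_verified && (row.reach_label ?? "").trim().length > 0;
}
```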
Completes training data coverage for all 9 blog types:
market_alert(2), comparison(1), technology_deep_dive(4), tutorial(3),
hype_cycle(1), buying_guide(1), migration_guide(1), new_product(1),
competitor_analysis(1) — 15 gold-standard articles total
- blog-012: technology_deep_dive — coherent vs direct-detect decision framework
- blog-013: market_alert — transceiver price cycle, when to buy
Training set now covers: market_alert(2), comparison(1), technology_deep_dive(4),
tutorial(3), hype_cycle(1), buying_guide(1), migration_guide(1) — 13 total
Adding diverse topic coverage:
- blog-008: buying_guide — OEM vs compatible real cost numbers
- blog-009: migration_guide — 100G→400G what actually breaks
- blog-010: technology_deep_dive — QSFP-DD vs OSFP form factor reality
- blog-011: tutorial — transceiver procurement checklist
All follow FO rules: no markdown headers in body, no bullet lists,
one thesis, engineer voice, ~1000 words. Total training set: 11 articles.
The generateClaude() function was recursively calling itself inside
enqueueClaude(), creating a circular Promise dependency that permanently
deadlocked the claudeQueue. Any 429 rate-limit response would poison
the queue, blocking all future Claude API calls until server restart.
Fixes:
- Move retries into claudeApiCall(), which is called from enqueueClaude();
retries no longer re-enter the queue, so there is no circular dependency
- Max 3 retries with increasing backoff (10s/30s/60s)
- Add resetClaudeQueue() exported function
- Add 15-minute auto-reset stall detection to enqueueClaude
- Expose resetClaudeQueue in POST /api/blog/llm/reset-queue endpoint
- Fix merge conflict markers in index.ts (duplicate scraperRouter import)
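A rough sketch of the pattern described in the fixes above: retries happen inside the queued job instead of re-entering the queue. The names callWithRetry and enqueue are stand-ins for claudeApiCall() and enqueueClaude(), the 429 detection via a status field is an assumption, and the stall detection and reset endpoint are omitted.

```typescript
type ApiCall<T> = () => Promise<T>;
type Sleep = (ms: number) => Promise<void>;

// Backoff schedule from the commit message: at most 3 retries.
const BACKOFF_MS: number[] = [10_000, 30_000, 60_000];

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Assumption: rate-limit errors carry a numeric `status` of 429.
function isRateLimit(err: unknown): boolean {
  return typeof err === "object" && err !== null &&
    (err as { status?: unknown }).status === 429;
}

// Retries in place; it never touches the queue, so it cannot wait on itself.
async function callWithRetry<T>(
  call: ApiCall<T>,
  sleep: Sleep = defaultSleep,
  delays: number[] = BACKOFF_MS,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const delay = delays[attempt];
      if (!isRateLimit(err) || delay === undefined) throw err; // out of retries
      await sleep(delay);
    }
  }
}

// Serial queue: each job starts only after the previous one settles.
let queueTail: Promise<unknown> = Promise.resolve();

function enqueue<T>(call: ApiCall<T>): Promise<T> {
  const run = queueTail.then(() => callWithRetry(call));
  queueTail = run.catch(() => undefined); // a failed job must not poison the queue
  return run;
}
```

Because the tail swallows rejections, one exhausted retry budget fails only its own caller and later jobs still run.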
026: Remove invalid price observations (sub-manufacturing-cost), disable
optictransceiver.com (domain repurposed as plant shop), fix verification
function to accept low/medium/high data_confidence values
027: Clean up FS.COM USD→EUR converted prices, force re-scrape with
new de.fs.com EUR-primary scraper
EUR prices scraped verbatim from de.fs.com — no conversion needed.
USD is now derived from EUR downstream (EUR→USD); EUR is no longer derived from USD.
Fixes price discrepancy: TIP showed USD 999×0.92=EUR 866 vs real €948 on de.fs.com.
Root cause of fake prices (e.g. 1.30 for 800G OSFP):
- parsePrice accepted any bare number without currency symbol
- Could misread stock counts, page numbers, or CSS values as prices
- Also picked the first number, not the main price
Fix:
- Require explicit currency symbol or decimal format (1234.56)
- Use the LARGEST number found in the price string
- Returns price=0 (rejected) when no valid price pattern found
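The three rules in the fix can be sketched as below. This is not the actual parsePrice implementation: the name parsePriceStrict, the regex, and the list of accepted currency markers are assumptions, as is reading "decimal format" to mean exactly two decimal places.

```typescript
// Accept a number only if it has a currency marker or two decimal places,
// then return the largest accepted number, or 0 when nothing qualifies.
function parsePriceStrict(text: string): number {
  // Groups: 1 = currency prefix, 2 = integer part, 3 = ".dd" decimals, 4 = currency suffix.
  const pattern =
    /(US\$|\$|€|EUR|USD)?\s*(\d{1,3}(?:,\d{3})+|\d+)(\.\d{2}(?!\d))?\s*(€|EUR|USD)?/g;
  let best = 0;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    const hasCurrency = Boolean(m[1] || m[4]);
    const hasDecimals = Boolean(m[3]);
    // Bare integers (stock counts, page numbers, CSS values) are rejected.
    if (!hasCurrency && !hasDecimals) continue;
    const value = Number((m[2] ?? "").replace(/,/g, "") + (m[3] ?? ""));
    if (Number.isFinite(value) && value > best) best = value;
  }
  return best; // 0 means "rejected"
}
```

Note that a lone decimal such as 1.30 still passes these rules on its own, so a separate plausibility check is still needed for such values.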
- blog/generate now uses caller title when provided; falls back to template
- Migration 027: hard price floor by speed class in verification function
(no medians, no estimates — only real prices above minimum thresholds)
- Deleted 474 obviously wrong price observations (shipping costs scraped as prices)
- Price column now shows price_verified_eur (in EUR, dimmed) when street_price_usd is null
Fixes: FS.COM products showing dash while being marked fully verified
- Badge logic now requires visible price AND image_verified AND details_verified
No more badge when price displays as dash — all requirements must be visually present
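The badge rule can be sketched as a pure function over the row. Column names are taken from the messages above; the function names, the display formatting, and the dash placeholder are illustrative assumptions.

```typescript
interface BadgeRow {
  street_price_usd: number | null;
  price_verified_eur: number | null;
  image_verified: boolean;
  details_verified: boolean;
}

const NO_PRICE = "-"; // placeholder shown when no price is available (assumed)

// USD street price first, then the verified EUR price, else the dash.
function displayPrice(row: BadgeRow): string {
  if (row.street_price_usd != null) return `$${row.street_price_usd.toFixed(2)}`;
  if (row.price_verified_eur != null) return `€${row.price_verified_eur.toFixed(2)}`;
  return NO_PRICE;
}

// The badge requires everything the user can see: a price, an image, and details.
function showVerifiedBadge(row: BadgeRow): boolean {
  return displayPrice(row) !== NO_PRICE && row.image_verified && row.details_verified;
}
```

Deriving the badge from the displayed price, not from a separate DB flag, keeps the badge and the visible price from disagreeing.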
Tier-1 Anthropic API has 40K TPM — with ~20K tokens per pipeline step,
concurrent calls immediately hit the limit. enqueueClaude() serializes
all generateClaude() calls so only one runs at a time, eliminating
the flood of 429-retry-429-retry loops.