transceiver-db/sync/history/2026-05-09-near-complete-detail-queue-closure.md
2026-05-09 18:25:56 +02:00

# Near-Complete Detail Queue Closure
Date: 2026-05-09

Scope: TIP transceiver detail verification for rows already backed by price, image, and competitor evidence
## Goal
Close the remaining near-complete rows without manual approval and without launching heavy crawler/browser workloads on Erik.
## Implemented
- Added `packages/scraper/src/scrapers/atgbics-detail-pages.ts`
  - lightweight Shopify `product.js` fetcher
  - no browser, no Playwright
  - strict parser for form factor, speed, reach, media, wavelength, connector, and product class
- Added `packages/scraper/src/scrapers/shopfiber24-fibermall-detail-pages.ts`
  - lightweight static HTML fetcher
  - FiberMall uses Schema.org Product JSON-LD
  - ShopFiber24 uses static title/meta/description evidence
- Added package scripts:
  - `scrape:atgbics:details`
  - `scrape:vendors:details`
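
As an illustration of the browser-free approach for FiberMall, the following is a minimal sketch of pulling a Schema.org Product node out of static HTML. It is not the code in `shopfiber24-fibermall-detail-pages.ts`; the names `ProductLd`, `isRecord`, `isProductNode`, and `extractProductJsonLd` are hypothetical, and it assumes the page embeds the Product in a `<script type="application/ld+json">` block.

```typescript
// Hypothetical shape: only the Schema.org Product fields this sketch reads.
interface ProductLd {
  name?: string;
  sku?: string;
  mpn?: string;
  description?: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// A node counts as a Product when its @type is "Product" or an array containing it.
function isProductNode(value: unknown): value is ProductLd {
  if (!isRecord(value)) return false;
  const type = value["@type"];
  return type === "Product" || (Array.isArray(type) && type.includes("Product"));
}

// Scan every JSON-LD script body in the HTML and return the first Product node.
function extractProductJsonLd(html: string): ProductLd | null {
  const scriptRe =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  let match: RegExpExecArray | null;
  while ((match = scriptRe.exec(html)) !== null) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(match[1] ?? "");
    } catch (err) {
      continue; // malformed block: skip it rather than guess at its content
    }
    // JSON-LD may be a single node, an array of nodes, or an object with @graph.
    const graph = isRecord(parsed) ? parsed["@graph"] : undefined;
    let candidates: unknown[];
    if (Array.isArray(parsed)) {
      candidates = parsed;
    } else if (Array.isArray(graph)) {
      candidates = graph;
    } else {
      candidates = [parsed];
    }
    const product = candidates.find(isProductNode);
    if (product !== undefined) return product;
  }
  return null;
}
```

Returning `null` when no Product node is found keeps the caller strict: a page without structured evidence is skipped instead of being filled in from guesses.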
## Results
- ATGBICS:
  - first product.js run: fetched `107`, updated `97`, skipped `10`, promoted `97`
  - parser patch: `Max Distance_N/A` no longer blocks title/body distance evidence
  - final product.js run: fetched `10`, updated `10`, skipped `0`, promoted `10`
  - concurrent price-verification exposed another AOC batch; follow-up run fetched `23`, updated `23`, skipped `0`, promoted `23`
  - near-complete missing details: `0`
- FiberMall + ShopFiber24:
  - first detail run: fetched `116`, updated `112`, skipped `4`, promoted `112`
  - final semantic closure: fetched `4`, updated `4`, skipped `0`, promoted `4`
  - FiberMall near-complete missing details: `0`
  - ShopFiber24 near-complete missing details: `0`
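
The parser patch for `Max Distance_N/A` can be sketched as a fallback chain. This is an illustration under assumptions, not the patched parser itself: it assumes the value arrives as a `Max Distance_<value>` tag string, and the names `EMPTY_VALUES`, `findDistance`, and `resolveReach` are hypothetical.

```typescript
// Placeholder values that carry no information (assumed list).
const EMPTY_VALUES = new Set(["", "n/a", "na", "-"]);
const TAG_PREFIX = "Max Distance_";

// Find a distance such as "10km", "300 m" or "2.5 km" in free text.
// "1310nm" does not match, because the unit must directly follow the number.
function findDistance(text: string): string | null {
  const m = /(\d+(?:\.\d+)?)\s*(km|m)\b/i.exec(text);
  if (m === null) return null;
  return `${m[1] ?? ""}${(m[2] ?? "").toLowerCase()}`;
}

// Prefer the structured tag; when it is a placeholder, fall through to
// title and body evidence instead of treating the row as unresolvable.
function resolveReach(tags: string[], title: string, body: string): string | null {
  const tag = tags.find((t) => t.startsWith(TAG_PREFIX));
  const tagValue = tag === undefined ? "" : tag.slice(TAG_PREFIX.length).trim();
  const sources = EMPTY_VALUES.has(tagValue.toLowerCase())
    ? [title, body]
    : [tagValue, title, body];
  for (const source of sources) {
    const distance = findDistance(source);
    if (distance !== null) return distance;
  }
  return null; // no evidence anywhere: leave the row for a later pass
}
```

With this ordering, a row tagged `Max Distance_N/A` whose title states `10km` resolves to `10km`, while a row with no distance in any source still returns `null` and stays skipped.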
## Truth Rules
- Do not turn a variable AOC/DAC or category page into a fake fixed-distance transceiver.
- Use `Variant` reach for source-backed product families.
- Classify switches, media converters, muxes, and adapters as their actual product class.
- Classify 100G DWDM DCO as `Coherent DWDM` with line-system-dependent reach when no normal reach is stated.
- FiberMall source titles can repair brand-only part numbers when the source page provides a concrete MPN/product code.
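
A few of these rules can be expressed as an ordered classifier. The sketch below is illustrative only: apart from `Coherent DWDM` and `Variant`, which come from the rules above, every class label, the `line-system dependent` reach string, and the names `Classification`, `NON_TRANSCEIVER`, and `classify` are assumptions, and real titles would need stricter matching than these keyword patterns.

```typescript
interface Classification {
  productClass: string;
  reach: string | null;
}

// Non-transceiver products keep their own class (labels are hypothetical).
const NON_TRANSCEIVER: Array<[RegExp, string]> = [
  [/\bmedia converter\b/i, "Media Converter"],
  [/\bswitch\b/i, "Switch"],
  [/\b(mux|demux)\b/i, "Mux"],
  [/\badapter\b/i, "Adapter"],
];

function classify(title: string, statedReach: string | null): Classification {
  // Rule: switches, media converters, muxes, and adapters are never transceivers.
  for (const [pattern, productClass] of NON_TRANSCEIVER) {
    if (pattern.test(title)) return { productClass, reach: null };
  }
  // Rule: DWDM DCO is Coherent DWDM; reach depends on the line system
  // unless the source states a normal reach.
  if (/\bDWDM\b/i.test(title) && /\bDCO\b/i.test(title)) {
    return {
      productClass: "Coherent DWDM",
      reach: statedReach ?? "line-system dependent",
    };
  }
  // Rule: a variable AOC/DAC family gets `Variant`, not an invented fixed distance.
  const cable = /\b(AOC|DAC)\b/i.exec(title);
  if (cable !== null) {
    return {
      productClass: (cable[1] ?? "").toUpperCase(),
      reach: statedReach ?? "Variant",
    };
  }
  // Default: a transceiver whose reach is only what the source stated.
  return { productClass: "Transceiver", reach: statedReach };
}
```

Checking the non-transceiver classes first means a media converter with a stated distance is never promoted as a fixed-distance transceiver.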
## Final Live State
- `price_verified=11582`
- `details_verified=12276`
- `fully_verified=11001`
- near-complete queue:
  - `price_verified=true`
  - `image_verified=true`
  - `competitor_verified=true`
  - `details_verified=false`
  - result: `0`
- Public health:
  - status: `healthy`
  - load status: `ok`
  - memory used: `12%`
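
The near-complete queue definition above amounts to a four-flag predicate. A minimal in-memory sketch follows; the flag names come from this note, while the `VerificationFlags` interface and both function names are hypothetical, and the real check presumably runs as a query against the database.

```typescript
// The four verification flags that define the near-complete queue.
interface VerificationFlags {
  price_verified: boolean;
  image_verified: boolean;
  competitor_verified: boolean;
  details_verified: boolean;
}

// Near-complete: everything verified except details.
function isNearComplete(row: VerificationFlags): boolean {
  return (
    row.price_verified &&
    row.image_verified &&
    row.competitor_verified &&
    !row.details_verified
  );
}

// Queue size; the closure described here is done when this reaches 0.
function nearCompleteCount(rows: VerificationFlags[]): number {
  return rows.filter(isNearComplete).length;
}
```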
## Safety
- No external AI was used.
- No browser crawler was started.
- Erik SSH flapped several times; work paused between retries instead of hammering the host.
- All crawler/parser learnings were mirrored into the TIPLLM training pool.
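
The pause-between-retries behavior noted above can be sketched as follows. The note only says that work paused between retries; the doubling-with-cap schedule, the parameter values, and the names `retryDelaysMs` and `withPausedRetries` are assumptions chosen for illustration.

```typescript
// Delay before each retry: doubles from baseMs and never exceeds capMs.
function retryDelaysMs(retries: number, baseMs: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < retries; i++) {
    delays.push(Math.min(capMs, baseMs * Math.pow(2, i)));
  }
  return delays;
}

// Run a task, sleeping for the next scheduled delay after each failure.
// Makes one initial attempt plus one attempt per entry in `delays`.
async function withPausedRetries<T>(
  task: () => Promise<T>,
  delays: number[],
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= delays.length; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      const delay = delays[attempt];
      if (delay === undefined) break; // schedule exhausted: give up
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

For example, `retryDelaysMs(4, 5000, 30000)` yields pauses of 5 s, 10 s, 20 s, and 30 s, so a flapping host sees progressively fewer connection attempts rather than a tight retry loop.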