transceiver-db/sync/history/2026-05-09-tip-fs-sku-alias-quarantine.md
2026-05-09 23:35:01 +02:00


# 2026-05-09 — TIP FS.com Numeric SKU Alias Quarantine
## Problem
FS.com pages expose two identifiers:
- marketplace SKU, for example `FS-380881`
- real optical product P/N, for example `OSFP-DR8-1.6T-FL`
Older scraper passes created active duplicate rows for both. This polluted equivalence research because numeric SKU rows looked like separate transceivers.
The user-reported 1.6T case confirmed the issue:
- `FS-380881` is the numeric SKU alias for `OSFP-DR8-1.6T-FL` (500m)
- `FS-380883` is the numeric SKU alias for `OSFP-2FR4-1.6T-FL` (2km)
- both real product P/N rows must remain in TIP
- the numeric aliases must not be treated as independent products
## Change
Added `packages/scraper/src/utils/quarantine-fs-sku-aliases.ts`.
Dry-run mode:
```bash
pnpm -C packages/scraper run verify:fs:sku-aliases
```
Apply mode:
```bash
FS_SKU_ALIAS_APPLY=1 pnpm -C packages/scraper run verify:fs:sku-aliases
```
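The dry-run/apply split above could be gated roughly as follows. This is a minimal sketch, not the contents of `quarantine-fs-sku-aliases.ts`: the `run` helper and its return shape are hypothetical, and it assumes that only the exact value `1` enables apply mode.

```typescript
// Environment passed in explicitly so the sketch stays testable.
type Env = Record<string, string | undefined>;

// Assumption: only FS_SKU_ALIAS_APPLY=1 switches from dry-run to apply.
function isApplyMode(env: Env): boolean {
  return env["FS_SKU_ALIAS_APPLY"] === "1";
}

// Hypothetical runner: always counts candidates, mutates only in apply mode.
function run<T>(
  candidates: T[],
  env: Env,
  apply: (candidate: T) => void,
): { mode: "dry-run" | "apply"; found: number; applied: number } {
  const mode = isApplyMode(env) ? "apply" : "dry-run";
  let applied = 0;
  if (mode === "apply") {
    for (const candidate of candidates) {
      apply(candidate);
      applied++;
    }
  }
  return { mode, found: candidates.length, applied };
}
```

In a Node script, `process.env` would be passed as `env`. Keeping the default path read-only means a missing or mistyped variable cannot trigger writes.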
## Safety Gates
A row is quarantined only when:
- vendor is `FS.COM`
- part number matches `^FS-[0-9]+$`
- the same normalized FS product URL has a canonical non-numeric product P/N row
- the canonical row already has price, image, and details verified
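The four gates above can be sketched as a single predicate. The row shape, field names, and `findCanonicalFor` are hypothetical, since the TIP schema is not shown here, and the URL is assumed to be normalized already.

```typescript
// Hypothetical row shape; the real TIP schema may differ.
interface ProductRow {
  vendor: string;
  partNumber: string;
  normalizedUrl: string; // assumed to be normalized upstream
  priceVerified: boolean;
  imageVerified: boolean;
  detailsVerified: boolean;
}

// Gate 2: numeric marketplace SKU pattern, taken from the list above.
const NUMERIC_SKU = /^FS-[0-9]+$/;

function isFullyVerified(row: ProductRow): boolean {
  return row.priceVerified && row.imageVerified && row.detailsVerified;
}

// Returns the canonical row that justifies quarantining `row`,
// or undefined when any gate fails.
function findCanonicalFor(
  row: ProductRow,
  all: ProductRow[],
): ProductRow | undefined {
  if (row.vendor !== "FS.COM") return undefined; // gate 1
  if (!NUMERIC_SKU.test(row.partNumber)) return undefined; // gate 2
  for (const other of all) {
    if (
      other.normalizedUrl === row.normalizedUrl && // gate 3: same product URL
      !NUMERIC_SKU.test(other.partNumber) && // gate 3: non-numeric P/N
      isFullyVerified(other) // gate 4
    ) {
      return other;
    }
  }
  return undefined;
}
```

Returning the canonical row rather than a boolean lets the caller record which product P/N the alias points to.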
For each quarantined row, the script sets `category='NonTransceiver'`, clears the row's verification flags, and writes an `artifact_quarantine` evidence record.
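The quarantine update itself might look like the sketch below. Only `category='NonTransceiver'` and the `artifact_quarantine` record type come from this note; the field names, the evidence shape, and the `quarantine` function are assumptions for illustration.

```typescript
// Hypothetical alias row; field names are assumptions.
interface AliasRow {
  partNumber: string;
  category: string;
  priceVerified: boolean;
  imageVerified: boolean;
  detailsVerified: boolean;
}

// Hypothetical evidence record; only the kind is taken from this note.
interface EvidenceRecord {
  kind: "artifact_quarantine";
  partNumber: string;
  canonicalPartNumber: string;
}

// Pure function: returns the updated row and its evidence record
// without mutating the input, so a dry run can reuse it safely.
function quarantine(
  alias: AliasRow,
  canonicalPartNumber: string,
): { row: AliasRow; evidence: EvidenceRecord } {
  return {
    row: {
      partNumber: alias.partNumber,
      category: "NonTransceiver",
      priceVerified: false,
      imageVerified: false,
      detailsVerified: false,
    },
    evidence: {
      kind: "artifact_quarantine",
      partNumber: alias.partNumber,
      canonicalPartNumber: canonicalPartNumber,
    },
  };
}
```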
## Live Result
On Erik:
- dry-run found `109` candidates
- apply quarantined `109`
- evidence ledger wrote `109` `artifact_quarantine` records
- active numeric-SKU duplicates with canonical product row after run: `0`
Post-run reconcile and matcher completed.
Live health after cleanup:
- active products: `17305`
- price verified: `11414`
- image verified: `12016`
- details verified: `16705`
- fully verified: `10448`
- competitor status:
  - `matched=10775`
  - `no_valid_match=73`
  - `ambiguous=192`
  - `needs_research=6265`
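As a consistency check, the four competitor status counts sum to the active product count, which suggests (though this note does not state it) that every active product carries exactly one competitor status. A small sketch of that check, using only the figures above:

```typescript
// Figures copied from the live health list above.
const activeProducts = 17305;
const statusCounts = [
  10775, // matched
  73, // no_valid_match
  192, // ambiguous
  6265, // needs_research
];

// Sum the per-status counts and compare against the active total.
const statusTotal = statusCounts.reduce((sum, n) => sum + n, 0);
const statusesCoverActive = statusTotal === activeProducts;
```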
Fully populated products still needing competitor research:
- `Flexoptix=359`
- `FS.COM=4`
- `ATGBICS=2`
FS.com and Flexoptix no-valid-match dry-runs now both return `0`, so the remaining rows need true candidate research or normalization, not blind no-match closure.