# 2026-05-09 — TIP FS.com Numeric SKU Alias Quarantine

## Problem

FS.com pages expose two identifiers:

- marketplace SKU, for example `FS-380881`
- real optical product P/N, for example `OSFP-DR8-1.6T-FL`

Older scraper passes created an active row for each identifier, so one product appeared twice. This polluted equivalence research because the numeric SKU rows looked like separate transceivers.

The user-reported 1.6T case confirmed the issue:

- `FS-380881` is the numeric SKU alias for `OSFP-DR8-1.6T-FL` (500 m)
- `FS-380883` is the numeric SKU alias for `OSFP-2FR4-1.6T-FL` (2 km)
- both real product P/N rows must remain in TIP
- the numeric aliases must not be treated as independent products

## Change

Added `packages/scraper/src/utils/quarantine-fs-sku-aliases.ts`.

Script:

```bash
pnpm -C packages/scraper run verify:fs:sku-aliases
```

Apply mode:

```bash
FS_SKU_ALIAS_APPLY=1 pnpm -C packages/scraper run verify:fs:sku-aliases
```

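
How the script reads the apply switch is not shown in this note. A minimal sketch, assuming only the exact value `1` opts in to writes and everything else stays a dry run; `resolveMode` and `Mode` are hypothetical names, not taken from the actual script:

```typescript
// Hypothetical helper: choose between dry-run and apply from an env-like map.
// Assumption: only the exact string "1" enables writes; any other value
// (including an unset variable) leaves the run read-only.
type Mode = "dry-run" | "apply";

function resolveMode(env: Record<string, string | undefined>): Mode {
  return env["FS_SKU_ALIAS_APPLY"] === "1" ? "apply" : "dry-run";
}
```

Defaulting to dry-run means a forgotten or mistyped flag can only under-apply, never quarantine rows by accident.
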
## Safety Gates

A row is quarantined only when all of the following hold:

- vendor is `FS.COM`
- part number matches `^FS-[0-9]+$`
- the same normalized FS product URL has a canonical non-numeric product P/N row
- the canonical row already has price, image, and details verified

The robot sets the numeric alias row to `category='NonTransceiver'`, clears its verification flags, and writes `artifact_quarantine` evidence.

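
The gates above can be read as a single predicate over rows. The following is a sketch only: the `ProductRow` shape, its field names, the `findCanonicalFor` helper, and the sample URL are all assumptions for illustration, not the real TIP schema or the script's code.

```typescript
// Hypothetical row shape; field names are assumptions, not the TIP schema.
interface ProductRow {
  vendor: string;
  partNumber: string;
  normalizedUrl: string;
  priceVerified: boolean;
  imageVerified: boolean;
  detailsVerified: boolean;
}

// Pattern quoted from the safety gates: "FS-" followed by digits only.
const NUMERIC_SKU = /^FS-[0-9]+$/;

// Returns the canonical row that justifies quarantining `row`,
// or undefined when any gate fails.
function findCanonicalFor(row: ProductRow, all: ProductRow[]): ProductRow | undefined {
  if (row.vendor !== "FS.COM") return undefined;
  if (!NUMERIC_SKU.test(row.partNumber)) return undefined;
  return all.find(
    (other) =>
      other.vendor === "FS.COM" &&
      other.normalizedUrl === row.normalizedUrl &&
      !NUMERIC_SKU.test(other.partNumber) &&
      other.priceVerified &&
      other.imageVerified &&
      other.detailsVerified
  );
}

// Sample data: part numbers come from this note, the URL is a placeholder.
const sampleUrl = "https://example.invalid/products/380881";
const canonicalRow: ProductRow = {
  vendor: "FS.COM",
  partNumber: "OSFP-DR8-1.6T-FL",
  normalizedUrl: sampleUrl,
  priceVerified: true,
  imageVerified: true,
  detailsVerified: true,
};
const aliasRow: ProductRow = { ...canonicalRow, partNumber: "FS-380881" };
const unverifiedCanonical: ProductRow = { ...canonicalRow, detailsVerified: false };

const hit = findCanonicalFor(aliasRow, [aliasRow, canonicalRow]);
const miss = findCanonicalFor(aliasRow, [aliasRow, unverifiedCanonical]);
```

In this sketch `hit` is the canonical row and `miss` is `undefined`: an alias is left alone while its canonical row is not yet fully verified, so quarantine never removes the only usable row for a product.
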
## Live Result

On Erik:

- dry-run found `109` candidates
- apply quarantined `109`
- evidence ledger wrote `109` `artifact_quarantine` records
- active numeric-SKU duplicates with canonical product row after run: `0`

Post-run reconcile and matcher completed.

Live health after cleanup:

- active products: `17305`
- price verified: `11414`
- image verified: `12016`
- details verified: `16705`
- fully verified: `10448`
- competitor status:
  - `matched=10775`
  - `no_valid_match=73`
  - `ambiguous=192`
  - `needs_research=6265`

Fully populated products still needing competitor research:

- `Flexoptix=359`
- `FS.COM=4`
- `ATGBICS=2`

FS.com and Flexoptix no-valid-match dry-runs now both return `0`, so the remaining rows need true candidate research or normalization, not blind no-match closure.
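
As a quick consistency check on the live health figures, the four competitor status counts add up to the active product count. That suggests every active product carries exactly one status, though this is an inference from the numbers, not something the tooling states. A small sketch of the arithmetic, with values copied from this note and variable names chosen for illustration:

```typescript
// Figures copied from the live health list; names are illustrative only.
const activeProducts = 17305;
const competitorStatus: Array<[string, number]> = [
  ["matched", 10775],
  ["no_valid_match", 73],
  ["ambiguous", 192],
  ["needs_research", 6265],
];

// Sum the per-status counts and compare against the active total.
const statusTotal = competitorStatus.reduce((sum, entry) => sum + entry[1], 0);
const statusesCoverActive = statusTotal === activeProducts;
```
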