transceiver-db/sync/history/2026-05-09-tip-fs-sku-alias-quarantine.md
2026-05-09 23:35:01 +02:00

2026-05-09 — TIP FS.com Numeric SKU Alias Quarantine

Problem

FS.com pages expose two identifiers:

  • marketplace SKU, for example FS-380881
  • real optical product P/N, for example OSFP-DR8-1.6T-FL

Older scraper passes created a separate active row for each identifier, so one product appeared twice. This polluted equivalence research because the numeric SKU rows looked like distinct transceivers.

The user-reported 1.6T case confirmed the issue:

  • FS-380881 is the numeric SKU alias for OSFP-DR8-1.6T-FL (500m)
  • FS-380883 is the numeric SKU alias for OSFP-2FR4-1.6T-FL (2km)
  • both real product P/N rows must remain in TIP
  • the numeric aliases must not be treated as independent products

Change

Added packages/scraper/src/utils/quarantine-fs-sku-aliases.ts.

Dry-run mode (the default):

pnpm -C packages/scraper run verify:fs:sku-aliases

Apply mode:

FS_SKU_ALIAS_APPLY=1 pnpm -C packages/scraper run verify:fs:sku-aliases
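
The split between dry-run and apply mode could be gated roughly as follows. This is a minimal sketch, not the script's actual code: the helper name isApplyMode and the rule that only the exact value "1" enables writes are assumptions. The environment is passed in as a plain object so the check can be exercised without touching the real process environment.

```typescript
// Hypothetical helper; the real flag handling in
// quarantine-fs-sku-aliases.ts is not shown in this note.
type Env = Record<string, string | undefined>;

function isApplyMode(env: Env): boolean {
  // Assumption: only the exact value "1" enables writes.
  // A missing or different value keeps the run in dry-run mode.
  return env["FS_SKU_ALIAS_APPLY"] === "1";
}

// Example: decide the mode once, before any row is touched.
const mode = isApplyMode({ FS_SKU_ALIAS_APPLY: "1" }) ? "apply" : "dry-run";
```

Defaulting to dry-run means a forgotten or mistyped flag can only produce a report, never a write.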

Safety Gates

A row is quarantined only when:

  • vendor is FS.COM
  • part number matches ^FS-[0-9]+$
  • the same normalized FS product URL has a canonical non-numeric product P/N row
  • the canonical row already has price, image, and details verified

For each row that passes all four gates, the robot sets category='NonTransceiver' on the numeric alias row, clears its verification flags, and writes an artifact_quarantine evidence record.
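
The gates and the quarantine action above can be sketched as follows. The ProductRow shape, its field names, and the function names are assumptions made for illustration, not the TIP schema or the script's real API. The URL in the example is a placeholder, not an actual FS.com address.

```typescript
// Hypothetical row shape; field names are assumptions.
interface ProductRow {
  vendor: string;
  partNumber: string;
  normalizedUrl: string; // assumed to be normalized before comparison
  category: string;
  priceVerified: boolean;
  imageVerified: boolean;
  detailsVerified: boolean;
}

// Gate 2: numeric marketplace SKU pattern, taken from the list above.
const NUMERIC_SKU = /^FS-[0-9]+$/;

function isFullyVerified(row: ProductRow): boolean {
  return row.priceVerified && row.imageVerified && row.detailsVerified;
}

// Returns the canonical row that justifies quarantining `row`,
// or undefined when any gate fails.
function findCanonical(row: ProductRow, all: ProductRow[]): ProductRow | undefined {
  if (row.vendor !== "FS.COM") return undefined;             // gate 1
  if (!NUMERIC_SKU.test(row.partNumber)) return undefined;   // gate 2
  return all.find(
    (other) =>
      other.normalizedUrl === row.normalizedUrl &&           // gate 3: same URL
      !NUMERIC_SKU.test(other.partNumber) &&                 // gate 3: real P/N
      isFullyVerified(other),                                // gate 4
  );
}

// Returns a quarantined copy; the input row is left unchanged.
function quarantine(row: ProductRow): ProductRow {
  return {
    ...row,
    category: "NonTransceiver",
    priceVerified: false,
    imageVerified: false,
    detailsVerified: false,
  };
}

// Example using the 1.6T pair from this note (placeholder URL).
const canonical: ProductRow = {
  vendor: "FS.COM",
  partNumber: "OSFP-DR8-1.6T-FL",
  normalizedUrl: "https://example.invalid/fs/380881",
  category: "Transceiver",
  priceVerified: true,
  imageVerified: true,
  detailsVerified: true,
};
const alias: ProductRow = { ...canonical, partNumber: "FS-380881" };

const match = findCanonical(alias, [alias, canonical]);
const quarantined = match ? quarantine(alias) : alias;
```

Because the canonical row must already be fully verified, an alias is never quarantined while it is the only trustworthy record for that product page.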

Live Result

On Erik:

  • dry-run found 109 candidates
  • apply quarantined 109
  • evidence ledger wrote 109 artifact_quarantine records
  • active numeric-SKU duplicates that still have a canonical product row after the run: 0

Post-run reconcile and matcher completed.

Live health after cleanup:

  • active products: 17305
  • price verified: 11414
  • image verified: 12016
  • details verified: 16705
  • fully verified: 10448
  • competitor status:
    • matched=10775
    • no_valid_match=73
    • ambiguous=192
    • needs_research=6265
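
As a quick consistency check on the figures above, the four competitor status counts can be summed and compared with the active product count. The sketch assumes each active product carries exactly one competitor status, which this note does not state explicitly. The variable names are illustrative.

```typescript
// Figures copied from the live health list above.
const activeProducts = 17305;
const matched = 10775;
const noValidMatch = 73;
const ambiguous = 192;
const needsResearch = 6265;

// 10775 + 73 + 192 + 6265 = 17305
const statusTotal = matched + noValidMatch + ambiguous + needsResearch;

// Zero here means the four buckets account for every active product.
const unaccounted = activeProducts - statusTotal;
```

With these numbers the total equals the active product count, so no active row appears to be missing a competitor status.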

Fully populated products still needing competitor research:

  • Flexoptix=359
  • FS.COM=4
  • ATGBICS=2

FS.com and Flexoptix no-valid-match dry-runs now both return 0, so the remaining rows need true candidate research or normalization, not blind no-match closure.