165 Commits

Author SHA1 Message Date
Rene Fichtmueller
8c6df1028d merge: reconcile Erik-deployed fixes with Gitea feature branch (origin/main)
Merges 28 Gitea commits (Transceiver Academy, dashboard features A-P, equivalence
matchers OPN/spec, Abverkauf velocity engine, Flexoptix detail sync, SQL 111-118,
MCP equivalences tool, training data) with 22 Erik-local commits (price chart,
fmtSpd, supply-squeeze per-SKU fix, research-robot actions, NADDOD warehouse,
FS competitor stock, reorder bloat fix, flexoptix empty-token fix, vendor
reliability, blog WireGuard failover).

Only 2 files changed on both sides; ort auto-merged both cleanly. Manually
resolved a resulting duplicate GET /api/hype-cycle/market-signals: kept the
Erik handler (technologies+marketSignalScore+drivers+globalContext, matches the
merged dashboard), removed the Gitea Multi-source variant (preserved in history).

All 3 packages build with 0 TS errors. Safety: branch erik-pre-reconcile-20260607,
bundle _RECON_full-backup-20260607.bundle (also on Fearghas), bulk-price WIP in
stash@{0}, untracked SQL backed up to _RECON_untracked_sql_20260607/.
2026-06-07 05:01:32 +00:00
Rene Fichtmueller
bb4046bb1e fix(reorder-signals): delete-before-insert prevents unbounded table growth
reorder_signals grew to 4.49M rows / 1.19GB — the compute job INSERTed a fresh
row per transceiver every 4h run but never deleted old ones (24h TTL filtered
them at read time via DISTINCT ON + expires_at, but they were never purged).
4.37M rows were already expired dead weight.

Fix: DELETE existing rows for a transceiver before inserting its new signal, so
the table holds exactly one (latest) row per transceiver. Cleaned up to 18,175
rows / 4.5MB (99.6% reclaimed, VACUUM FULL). Backup: reorder_signals_keep_bak_20260606.
Verified: re-running compute:reorder-signals keeps count stable at 18,175.
2026-06-06 21:39:57 +00:00
Rene Fichtmueller
2432c0ddfc fix(flexoptix-sync): empty FLEXOPTIX_API_TOKEN string short-circuited bearer login
Root cause of the persistent sync:flexoptix-catalog HTTP 401: line 397 used
'?? null' which only coerces null/undefined. With FLEXOPTIX_API_TOKEN='' (empty
string set in .env), token stayed '' and line 485's 'token ?? getBearerToken()'
returned '' instead of performing the username/password login — sending an empty
'Bearer ' header that the products endpoint rejected with 401.

Fix: '|| null' coerces empty string to null so the bearer-login fallback fires.
Verified: sync now completes (username/password -> customer token -> products 200,
3 products/price/stock writes on limit=50). Credentials were correct all along.
2026-06-06 17:06:22 +00:00
Rene Fichtmueller
03fdfa7d51 feat(naddod-scraper): extract per-warehouse stock breakdown + write warehouse_global_qty
Adds parseWarehouseStock() to decode the HTML-entity-encoded warehouse_stock JSON
(us/nl/sg/cn per-region array). When the static page has warehouse data, writes:
  warehouse_de_qty   ← nl (EU-closest warehouse)
  warehouse_global_qty ← sum(us+nl+sg+cn), or falls back to quantity_available
  stock_confidence   ← 3 (L3) when warehouse breakdown available, else 2

Note: per-warehouse quantities require JS execution to populate (API-loaded);
static HTML has [0,0] placeholders. The fallback ensures NADDOD global totals
appear in the competitor-by-tech dashboard comparison.
2026-06-06 16:34:02 +00:00
Rene Fichtmueller
b5925bc264 Merge remote-tracking branch 'origin/erik-live-2026-06-04' into reconcile-2026-06-04
# Conflicts:
#	CHANGELOG_PENDING.md
#	packages/api/src/index.ts
#	packages/api/src/llm/client.ts
#	packages/api/src/routes/blog.ts
#	packages/api/src/routes/bulk-price.ts
#	packages/api/src/routes/kb.ts
#	packages/api/src/routes/price-matrix.ts
#	packages/api/src/routes/procurement.ts
#	packages/api/src/routes/stock.ts
#	packages/api/src/routes/vendor-reliability.ts
#	packages/api/src/routes/vendors.ts
#	packages/dashboard/index.html
2026-06-04 13:56:42 +00:00
Rene Fichtmueller
172ec324f2 snapshot: Erik live-deployed state 2026-06-04 (capture for source reconciliation) 2026-06-04 12:46:49 +00:00
Rene Fichtmueller
0d7a92e749 feat: Abverkauf velocity engine — sql/118 + analyzer + API endpoints
- sql/118-stock-velocity.sql: new stock_velocity (UPSERT per tx×vendor)
  and stock_velocity_events tables with TimescaleDB-compatible indexes
- stock-velocity-analyzer.ts: computes sell-through from stock_observations
  time-series; detects sold/zulauf/data_gap events, trims top-10% outliers,
  predicts stockout date, assigns high/medium/low/insufficient confidence
- scheduler.ts: analyze:stock:velocity job at 04:30/12:30/20:30 UTC
- stock.ts: GET /api/stock/velocity (paginated, filterable by vendor/confidence/
  stockout_days) + GET /api/stock/velocity/:id (per-product with event history)
- First run: 208 products, 979 sell events, 2811 Zulauf events written
2026-05-14 00:24:58 +02:00
Rene Fichtmueller
637839e965 feat: add stock observations to ATGBICS + Optcore; delete demo data
- DELETE 2133 rows from reorder_signals WHERE is_demo_data = true
- atgbics.ts: add upsertStockObservation (confidence=1, binary available
  boolean from Shopify API; quantityAvailable 1/0 for in/out stock)
- optcore.ts: add upsertStockObservation (confidence=1, WooCommerce text
  stock level parsed via parseStockLevel; quantityAvailable 1/0)
- Both scrapers already run every 2h on Erik scheduler
- FS.com: already captures full warehouse breakdown (DE+Global+backorder)
  3x/day from Mac (02:00/10:00/18:00) at confidence=3 — no change needed
- QSFPTEK: already captures real quantities at confidence=2 — no change
- sfpcables/prolabs/wiitek: no meaningful stock signal, not modified
2026-05-14 00:08:57 +02:00
Rene Fichtmueller
db6b97186a feat: OPN+spec equivalence matchers, 400G pricing, TIP_LLM training data
- Add OPN-based equivalence matcher robot (7,245 manufacturer-confirmed matches, confidence=1.0)
- Add spec-based equivalence matcher robot (683 matches, confidence=0.85)
  - Matches by form_factor + speed_gbps + reach_tier + wavelength ±10nm
  - Safety cap: skip FX products matching >30 competitors (too generic)
  - Daily schedule: 04:30 UTC via pg-boss
- SQL migrations 116 (OPN) + 117 (spec) with tip_extract_wavelength_nm() + tip_reach_tier() helpers
- Fix tenGtek.ts: add 3 missing 400G categories (QSFP-DD, QSFP112) — closes pricing gap
- Generate tip-llm-pricing-v1.jsonl: 80 DB-grounded QA pairs (pricing, equivalences, 400G)
- Rebuild TIP_LLM training pool: 11,999 pairs (+127 vs prev), deployed to Erik
- FX product equivalence coverage: 88.1% (959/1089)
2026-05-13 21:33:19 +02:00
Rene Fichtmueller
2f85571784 feat: Flexoptix full product detail sync (sql/115 + detail-enricher robot)
Pulls complete per-SKU specifications and compatibility data from the
Flexoptix API (specifications=1&compatibilities=1) and writes structured
fields to the transceivers table for datasheet generation.

SQL migration 115:
- Adds fx_specifications JSONB (raw spec blob for datasheet gen)
- Adds fx_compatibilities JSONB (full OEM compatibility matrix)
- Adds compliance_code, laser_type, receiver_type, supported_protocols[]
- Adds extinction_ratio_db, cdr_support, inbuilt_fec, detail_synced_at
- GIN index on fx_compatibilities for vendor/OPN queries

flexoptix-detail-enricher.ts:
- Per-SKU API calls with rate-limiting (600ms/call, 100 SKUs/run)
- Parses all spec labels → structured fields (power, budget, tx/rx dBm,
  modulation, wavelengths, temp range, DOM, laser type, receiver type)
- Strips :Sx variant suffixes before API queries (self-configure SKUs)
- COALESCE writes — never overwrites existing data, only fills gaps
- Tracks detail_synced_at, retries stale entries after 7 days

flexoptix-api-sync.ts:
- Also stores image_url and product_page_url during bulk sync

scheduler.ts:
- Registers enrich:flexoptix-details daily at 03:00 UTC

Results after initial run:
- 791/968 FX products (81.7%) fully enriched
- 26.0 avg compatibility entries per product (OEM vendor + OPN)
- 25.7 avg spec fields per product
- DFB(483), EML(148), FP(72), VCSEL(44) laser type distribution
2026-05-13 18:49:28 +02:00
Rene Fichtmueller
d1bde66e39 feat: deterministic equivalence matcher + full wavelength/connector enrichment
Replace confidence-based matcher with deterministic 6-field exact match:
- form_factor (exact), speed_gbps (±0.1G), fiber_type (exact),
  reach (±10%), wavelength_tx (±5nm), connector_type (exact)
- Complete products → confidence=1.0, never creates pending records
- Incomplete products → enhanced confidence ≥0.85, still auto_approved
- PENDING CREATED: 0 (by design, permanent)

Migrations:
- sql/113: Connector type inference from IEEE lookup + form-factor rules
  (970→479 missing connector for FX products)
- sql/114: Extend IEEE lookup with 400G/800G/1.6T OSFP/QSFP-DD standards,
  wavelength fallback (SMF→1310nm, MMF→850nm), clear pending queue to 0

Enrichment results (before→after):
- FX fully complete: 50 → 555 / 1,089 (+505)
- Total fully complete: ~3,600 → 15,431 / 18,133 (+11,800)
- FX coverage: 54.7% → 55.8% (608/1,089 matched)
- Deterministic matches: 0 → 44,596 (confidence=1.0)
- Wavelength-mismatched records rejected: 521
- Pending queue: 42 → 0 (permanent)

New match stats:
- 55,743 new deterministic auto_approved matches
- 521 legacy wavelength-mismatch records rejected
- Total active: 53,447 auto_approved + 1,987 approved
2026-05-13 17:59:08 +02:00
Rene Fichtmueller
9979b79434 feat: wavelength/connector enrichment schema + enricher robot
- sql/110: add wavelength_tx_nm, wavelength_rx_nm, connector_type,
  data_completeness, enrichment_needed columns + trigger
- sql/111: IEEE/MSA standards wavelength lookup table (SFP→OSFP)
- sql/112: migrate existing wavelengths TEXT → integer columns
- robots/wavelength-enricher.ts: fills missing wavelengths from IEEE
  lookup (deterministic) then product-name regex, runs every 4h
- scheduler: register enrich:wavelength job (4h schedule)

Fixes over-broad matching where 1G SFPs match 500+ competitors
due to missing wavelength discrimination.
2026-05-13 17:35:42 +02:00
Rene Fichtmueller
1edd6c20a8 fix: use COUNT(*) instead of COUNT(DISTINCT po.id) in catalog-reconcile
price_observations table has no id column — replace with COUNT(*)
to avoid SQL error 42703.
2026-05-13 16:59:49 +02:00
Rene Fichtmueller
98b241f462 feat: implement Flexoptix reference matching overhaul
- sql/108: normalize form_factor across all vendors (SFP-Plus → SFP+, etc.)
  and round speed_gbps for consistent matching
- sql/109: document 30→90 day matcher window change
- robots/catalog-reconcile.ts: new bulk-reconcile robot — matches all
  Flexoptix products against all competitors without 30-day time limit
- scheduler.ts: register catalog:reconcile job (monthly + on-demand),
  fix nightly matcher 30→90 day window, UPPER() form_factor matching,
  ROUND() speed_gbps matching

Fixes: ATGBICS/NADDOD/10Gtek/ShopFiber24 had 0 Flexoptix equivalences
due to stale price_observations being filtered out. Expected coverage
improvement: 22% → 45-60% after first reconcile run.
2026-05-13 16:55:45 +02:00
Rene Fichtmueller
a20094755d feat(scraper): Flexoptix REST API sync robot + scheduler integration
Replaces the GraphQL/search-based Flexoptix scraper with a proper
Magento 2 REST API integration that delivers authoritative SKUs,
prices, stock levels and compatibility data.

New files:
- packages/scraper/src/robots/flexoptix-api-sync.ts
  Self-contained robot: auth → paginated fetch → normalize → DB write.
  Reads FLEXOPTIX_API_BASE_URL / _USERNAME / _PASSWORD from env.
  Returns { fetched, normalized, skipped, priceWrites, stockWrites }.
  No file intermediary — in-memory pipeline.

- scripts/import-flexoptix-catalog.ts
  One-shot CLI importer for the Pulso-generated JSONL (Codex handover).

- docs/FLEXOPTIX_CATALOG_IMPORT.md
  Runbook for manual import + per-SKU specifications enrichment.

Scheduler changes:
- Added sync:flexoptix-catalog queue + work() handler
- Scheduled every 2h at 0 */2 * * * (same cadence as legacy job)
- scrape:pricing:flexoptix kept as legacy GraphQL fallback

Also includes Codex-generated additions from this sprint:
- audiocodes-oem scraper, seed-batch35/36/37, db.ts improvements,
  sql/102 verification reconcile, README + package.json updates
2026-05-13 16:36:33 +02:00
Rene Fichtmueller
5eb1b07183 fix: close stale TIP manual review queue 2026-05-10 10:23:07 +02:00
Rene Fichtmueller
cf0e471fa4 feat: close TIP research resolution states 2026-05-10 10:13:09 +02:00
Rene Fichtmueller
0edc6e3f3a feat: Pi scraper fleet — fetch-only index-pi.ts + FS.COM/NADDOD via SOCKS5
- index-pi.ts: removed Playwright scrapers (FS.COM, eBay enricher, switch assets)
  added NADDOD (fetch-based, benefits from residential IP)
  now 32 fetch-only queues safe for ARM/Pi without Chromium
- index-fs-only.ts: new dedicated FS.COM + NADDOD worker for Erik
  routes through Pi SOCKS5 via PROXY_URLS=socks5://10.10.0.6:1080
  Crawlee ProxyConfiguration automatically applies to Playwright crawler
- pi-scraper-setup.sh: removed inline index-pi.ts override (repo version now authoritative)
- CODEX-TASK-pi-scraper-deploy.md: full 9-step Codex spec for Pi fleet setup
  covers WireGuard keypair, Erik peer config, setup script, ecosystem.config.js
- CODEX-TASK-zero-manual-review.md: deterministic equivalence matcher spec
2026-05-10 09:53:55 +02:00
Rene Fichtmueller
7e36236d2b fix: quarantine GAO catalog artifacts 2026-05-10 09:48:43 +02:00
Rene Fichtmueller
d691745c7b feat: clean TIP cable rows from active base 2026-05-10 09:41:59 +02:00
Rene Fichtmueller
2be61f2441 feat: close TIP retail price research states 2026-05-10 01:42:24 +02:00
Rene Fichtmueller
b58f7cee41 feat: resolve OEM price status and part details 2026-05-10 01:16:49 +02:00
Rene Fichtmueller
adb2661fac feat: add targeted product page asset verifier 2026-05-10 00:31:33 +02:00
Rene Fichtmueller
0d4bcb6924 fix: preserve explicit competitor states in reconcile 2026-05-10 00:17:26 +02:00
Rene Fichtmueller
635a102932 feat: close open competitor research states 2026-05-10 00:03:42 +02:00
Rene Fichtmueller
fb9db56617 fix: quarantine fs numeric sku aliases 2026-05-09 23:35:01 +02:00
Rene Fichtmueller
79a57a5ac6 feat: add no-valid competitor resolver 2026-05-09 23:16:04 +02:00
Rene Fichtmueller
650de6ba9a feat: add verification evidence state model 2026-05-09 23:06:21 +02:00
Rene Fichtmueller
1af4f090f7 fix: harden TIP verification cleanup 2026-05-09 22:16:29 +02:00
Rene Fichtmueller
a43e572946 fix: advance TIP product verification robots 2026-05-09 20:19:19 +02:00
Rene Fichtmueller
ec40a96ae0 feat: add vendor detail verifiers 2026-05-09 18:22:09 +02:00
Rene Fichtmueller
91a1c2282a fix: harden atgbics evidence parsing 2026-05-09 17:30:08 +02:00
Rene Fichtmueller
c2421c03a3 fix: harden shopfiber24 reach parsing 2026-05-09 17:24:06 +02:00
Rene Fichtmueller
bb9c495497 fix: verify qsfptek cable details 2026-05-09 17:03:35 +02:00
Rene Fichtmueller
fc18b00157 fix: verify copper cable semantics 2026-05-09 16:55:50 +02:00
Rene Fichtmueller
c25300199a fix: harden atgbics wavelength semantics 2026-05-09 16:41:18 +02:00
Rene Fichtmueller
b26696f0d1 fix: improve vendor verification and fscom 1.6t variants 2026-05-09 15:56:08 +02:00
Rene Fichtmueller
60531b6250 feat: add crawlee python worker integration 2026-05-09 14:06:34 +02:00
Rene Fichtmueller
3d79f6b8e0 fix: add fscom url discovery mode 2026-05-09 14:00:30 +02:00
Rene Fichtmueller
f64dbf7b6b fix: add fscom targeted detail verification mode 2026-05-09 11:15:36 +02:00
Rene Fichtmueller
549b4430df fix: enrich flexoptix detail verification 2026-05-09 09:36:28 +02:00
Rene Fichtmueller
5522bb2152 fix: refresh price verification timestamps 2026-05-09 08:13:39 +02:00
Rene Fichtmueller
43b7250180 fix: automate equivalence research review queue 2026-05-09 07:48:11 +02:00
Rene Fichtmueller
ef225c7dc5 fix: revalidate flexoptix fs prices and images 2026-05-09 05:13:37 +02:00
Rene Fichtmueller
57e20efe49 fix: NADDOD price extraction — read from LD+JSON offers.price
NADDOD uses LD+JSON for pricing (Astro/Shopify structure):
  {"offers":{"price":"731.00","priceCurrency":"USD",...}}

Old regex (/US$\s*.../) never matched → all 132 price obs were lucky
text matches, not systematic. Now: parse all ld+json blocks first,
fall back to regex.

Also broaden sitemap URL regex to capture new-style URLs without .html:
  /products/nvidia-networking/102612 (was being missed)
2026-05-06 23:55:55 +02:00
Rene Fichtmueller
1a7c928120 fix: FS.COM price extraction — use .no_tax/.price CSS selectors
FS.com changed their HTML structure; compound class names are gone.
Current layout (verified 2026-05-06):
  <div class="no_tax">5,10 € ohne MwSt.</div>  ← B2B net price (preferred)
  <div class="price">6,07 €</div>               ← gross fallback
  <div class="standard_price">6,07 €</div>      ← gross fallback

Old selectors ([class*='price-value'] etc.) matched nothing → all prices
stored as €? null. New .no_tax first gives us the correct net/B2B price.
2026-05-06 23:45:30 +02:00
Rene Fichtmueller
a1a525b332 chore: sync API routes, dashboard hot-topics, MCP server, scraper package, scripts 2026-05-06 23:39:04 +02:00
Rene Fichtmueller
a8529d166b fix: resolve TS build errors — export backfillImages, add writeRobotExperience
- backfill-images.ts: rename main() → export backfillImages() to match index.ts import
- training-data-writer.ts: add writeRobotExperience export; remove hardcoded Gitea token
- fiber24.ts/fibermall.ts: scraper improvements from previous sessions
- image-downloader.ts/spec-updater.ts: utility updates
- robots/: add verification robots module
2026-05-06 23:39:00 +02:00
Rene Fichtmueller
5a77fce9f3 feat: NADDOD cursor rotation — covers all 7300+ URLs across 12 runs (24h)
Previously always sliced first 600 URLs from sitemap, missing 6700+ products.
Now stores offset in naddod-cursor.json, advances by 600 per run with wrap-around.
Full sitemap coverage in ~13 runs (26h). Also adds TIP_STORAGE_DIR env support.
2026-05-06 23:26:58 +02:00
Rene Fichtmueller
efb0c24a19 feat: rewrite ATGBICS scraper to use Shopify products.json API
Static HTML collection pages return wrong results (all redirect to same 9 products).
Switch to /collections/{handle}/products.json?limit=250&page=N API which is:
- Reliable JSON (no HTML parsing)
- Correct per-collection product lists
- Clean pagination (stop at < limit results)
- Covers 11 key transceiver collections (1G, 10G, 25G, 40G, 100G, 400G)
2026-05-06 23:17:46 +02:00