426 Commits

Author SHA1 Message Date
Rene Fichtmueller
9b8b03e783 feat: Flexoptix section, speed formatting + stock level (Lagerbestand) display
Speed display: fix raw Gbps decimals → formatted labels
- 1600.00G → 1.6T (≥1000 Gbps converted to T)
- 400G → 400G (clean integer, no trailing .00)
- Helper function fmtSpeed() added in dashboard JS
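
A minimal sketch of what such a formatter could look like, assuming the input arrives either as a number or as a decimal string such as "1600.00". The body is an illustration of the two rules listed above, not the dashboard's actual fmtSpeed(); the empty-string fallback for unparseable input is an assumption.

```typescript
// Hypothetical re-creation of the formatting rules; the real helper may differ.
function fmtSpeed(gbps: number | string): string {
  const value = typeof gbps === "string" ? parseFloat(gbps) : gbps;
  if (!Number.isFinite(value)) return ""; // assumed fallback for bad input
  if (value >= 1000) {
    // Convert to terabits; Number() drops trailing zeros ("1.60" -> 1.6).
    return `${Number((value / 1000).toFixed(2))}T`;
  }
  // Below 1000 Gbps: keep the G unit and drop a trailing ".00".
  return `${Number(value.toFixed(2))}G`;
}
```

With this sketch, fmtSpeed("1600.00") yields "1.6T" and fmtSpeed(400) yields "400G".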

Stock level (Lagerbestand): add stock availability per transceiver
- getFlexoptixSuggestions() extended with LEFT JOIN LATERAL on
  stock_observations (latest row per transceiver)
- Returns warehouse_de_qty, warehouse_global_qty, backorder_qty,
  backorder_estimated_date
- Dashboard renders color-coded badges per row:
    green  = DE warehouse (DE-Lager) quantity
    blue   = global warehouse (Global-Lager) quantity
    yellow = incoming stock (Zulauf), with estimated delivery date if available
- Badges hidden when all quantities are null/zero (graceful fallback)
2026-05-14 00:52:21 +02:00
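
The badge rules described in 9b8b03e783 can be sketched as a small pure function. The column names come from the commit message; the StockRow and Badge types, the function name, and the label wording are assumptions made for illustration.

```typescript
// Column names follow the commit message; everything else is hypothetical.
interface StockRow {
  warehouse_de_qty: number | null;
  warehouse_global_qty: number | null;
  backorder_qty: number | null;
  backorder_estimated_date: string | null;
}

interface Badge {
  color: "green" | "blue" | "yellow";
  label: string;
}

function stockBadges(row: StockRow): Badge[] {
  const badges: Badge[] = [];
  const de = row.warehouse_de_qty ?? 0;
  const global = row.warehouse_global_qty ?? 0;
  const backorder = row.backorder_qty ?? 0;
  if (de > 0) badges.push({ color: "green", label: `DE: ${de}` });
  if (global > 0) badges.push({ color: "blue", label: `Global: ${global}` });
  if (backorder > 0) {
    // Append the estimated delivery date only when one is known.
    const eta = row.backorder_estimated_date ? ` (ETA ${row.backorder_estimated_date})` : "";
    badges.push({ color: "yellow", label: `Incoming: ${backorder}${eta}` });
  }
  // An empty array means every quantity was null or zero: render nothing.
  return badges;
}
```

Returning an empty array for the all-null/zero case keeps the "hidden badges" fallback in the data layer, so the renderer needs no special case.
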
Rene Fichtmueller
de179c4c7c fix: remove DEMO labels from real stock data; fix switch Flexoptix suggestions; enrich Hot Topics LLM context
Stock dashboard (index.html):
- Replace all [DEMO]/demo badges on warehouse data with "FS.com" source labels
  (data was always real scraper data, never demo in the DB)
- Update subtitle: "Scraper-Lagermengen: DEMO DATA" → "Wettbewerber-Marktdaten"
  (i.e. "scraper stock quantities: DEMO DATA" → "competitor market data")
- "Recently Restocked" badge: DEMO DATA → SCRAPER DATA

Switch detail (queries.ts):
- Fix getFlexoptixSuggestions: wavelength_nm → wavelength_tx_nm,
  price_verified_usd → street_price_usd (column mismatch with live schema)
- DS5000 and other OSFP switches now show all 62 Flexoptix OSFP transceivers
  with direct shop links in the detail modal

Hot Topics (hot-topics.ts):
- NOG Talks + News Article clusters now fetch summary/mentioned_vendors/
  mentioned_products/mentioned_standards from news_articles table
- description field builds bullet-point list per article with summaries,
  key vendors/standards (vs. 3 bare titles joined with "|" before)
- buildTopicBriefing() rewritten as structured LLM document with sections:
  Market Signals (bullets), Recommended Angle, Market Context (buy signal,
  technologies, impact horizon), Writing Instructions (600-900 words,
  actionable, opinionated, no generic summaries)
2026-05-14 00:33:45 +02:00
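
As an illustration of the structured briefing described in de179c4c7c, the sketch below assembles the four named sections into one text document. The section names and writing instructions are taken from the commit message; the TopicInput shape, the function name, and the heading syntax are assumptions, not the real buildTopicBriefing().

```typescript
// Hypothetical input shape; only the section names come from the commit message.
interface TopicInput {
  title: string;
  signals: string[];
  recommendedAngle: string;
  buySignal: string;
  technologies: string[];
  impactHorizon: string;
}

function buildTopicBriefingSketch(t: TopicInput): string {
  return [
    `# ${t.title}`,
    "",
    "## Market Signals",
    ...t.signals.map((s) => `- ${s}`),
    "",
    "## Recommended Angle",
    t.recommendedAngle,
    "",
    "## Market Context",
    `- Buy signal: ${t.buySignal}`,
    `- Technologies: ${t.technologies.join(", ")}`,
    `- Impact horizon: ${t.impactHorizon}`,
    "",
    "## Writing Instructions",
    "- 600-900 words",
    "- Actionable and opinionated",
    "- No generic summaries",
  ].join("\n");
}
```
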
Rene Fichtmueller
0d7a92e749 feat: sell-through (Abverkauf) velocity engine, sql/118 + analyzer + API endpoints
- sql/118-stock-velocity.sql: new stock_velocity (UPSERT per tx×vendor)
  and stock_velocity_events tables with TimescaleDB-compatible indexes
- stock-velocity-analyzer.ts: computes sell-through from stock_observations
  time-series; detects sold/zulauf/data_gap events, trims top-10% outliers,
  predicts stockout date, assigns high/medium/low/insufficient confidence
- scheduler.ts: analyze:stock:velocity job at 04:30/12:30/20:30 UTC
- stock.ts: GET /api/stock/velocity (paginated, filterable by vendor/confidence/
  stockout_days) + GET /api/stock/velocity/:id (per-product with event history)
- First run: 208 products, 979 sell events, 2811 incoming-stock (zulauf) events written
2026-05-14 00:24:58 +02:00
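
The analysis steps named in 0d7a92e749 (event detection, top-10% outlier trimming, stockout prediction) might look roughly like this. The event type names come from the commit message; the 3-day gap threshold, the function names, and the types are assumptions, and confidence assignment is left out because its rules are not stated.

```typescript
// Sketch only: thresholds and names are assumptions, not the analyzer's real values.
interface Observation { at: Date; qty: number }
type EventType = "sold" | "zulauf" | "data_gap";
interface VelocityEvent { type: EventType; at: Date; units: number }

const DAY_MS = 86400000;

// Compare consecutive observations: a drop is a sale, a rise is incoming stock.
function detectEvents(obs: Observation[], maxGapDays = 3): VelocityEvent[] {
  const events: VelocityEvent[] = [];
  for (let i = 1; i < obs.length; i++) {
    const prev = obs[i - 1];
    const curr = obs[i];
    const gapDays = (curr.at.getTime() - prev.at.getTime()) / DAY_MS;
    if (gapDays > maxGapDays) {
      // Too long between samples to attribute the delta to sales.
      events.push({ type: "data_gap", at: curr.at, units: 0 });
      continue;
    }
    const delta = curr.qty - prev.qty;
    if (delta < 0) events.push({ type: "sold", at: curr.at, units: -delta });
    else if (delta > 0) events.push({ type: "zulauf", at: curr.at, units: delta });
  }
  return events;
}

// Drop the largest 10% of sale events before averaging units per day.
function trimmedDailyRate(events: VelocityEvent[], spanDays: number): number {
  const sold = events
    .filter((e) => e.type === "sold")
    .map((e) => e.units)
    .sort((a, b) => a - b);
  const dropCount = Math.floor(sold.length * 0.1);
  const kept = sold.slice(0, sold.length - dropCount);
  const total = kept.reduce((sum, u) => sum + u, 0);
  return spanDays > 0 ? total / spanDays : 0;
}

// Linear projection: remaining quantity divided by daily sell-through.
function predictStockout(currentQty: number, dailyRate: number, now: Date): Date | null {
  if (dailyRate <= 0) return null;
  return new Date(now.getTime() + (currentQty / dailyRate) * DAY_MS);
}
```
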
Rene Fichtmueller
637839e965 feat: add stock observations to ATGBICS + Optcore; delete demo data
- DELETE 2133 rows from reorder_signals WHERE is_demo_data = true
- atgbics.ts: add upsertStockObservation (confidence=1, binary available
  boolean from Shopify API; quantityAvailable 1/0 for in/out stock)
- optcore.ts: add upsertStockObservation (confidence=1, WooCommerce text
  stock level parsed via parseStockLevel; quantityAvailable 1/0)
- Both scrapers already run every 2h on Erik scheduler
- FS.com: already captures full warehouse breakdown (DE+Global+backorder)
  3x/day from Mac (02:00/10:00/18:00) at confidence=3 — no change needed
- QSFPTEK: already captures real quantities at confidence=2 — no change
- sfpcables/prolabs/wiitek: no meaningful stock signal, not modified
2026-05-14 00:08:57 +02:00
Rene Fichtmueller
db6b97186a feat: OPN+spec equivalence matchers, 400G pricing, TIP_LLM training data
- Add OPN-based equivalence matcher robot (7,245 manufacturer-confirmed matches, confidence=1.0)
- Add spec-based equivalence matcher robot (683 matches, confidence=0.85)
  - Matches by form_factor + speed_gbps + reach_tier + wavelength ±10nm
  - Safety cap: skip FX products matching >30 competitors (too generic)
  - Daily schedule: 04:30 UTC via pg-boss
- SQL migrations 116 (OPN) + 117 (spec) with tip_extract_wavelength_nm() + tip_reach_tier() helpers
- Fix tenGtek.ts: add 3 missing 400G categories (QSFP-DD, QSFP112) — closes pricing gap
- Generate tip-llm-pricing-v1.jsonl: 80 DB-grounded QA pairs (pricing, equivalences, 400G)
- Rebuild TIP_LLM training pool: 11,999 pairs (+127 vs prev), deployed to Erik
- FX product equivalence coverage: 88.1% (959/1089)
2026-05-13 21:33:19 +02:00
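
A rough sketch of the spec-based matching rule from db6b97186a, including the safety cap. The four match fields, the ±10nm tolerance, and the >30 cap come from the commit message; the SpecProduct type and the function name are hypothetical, and reachTier is assumed to be precomputed (the tiers produced by tip_reach_tier() are not described).

```typescript
// Hypothetical types; reachTier is assumed to be computed elsewhere.
interface SpecProduct {
  id: string;
  formFactor: string;
  speedGbps: number;
  reachTier: string;
  wavelengthNm: number;
}

const MAX_COMPETITORS = 30;
const WAVELENGTH_TOLERANCE_NM = 10;

function specMatches(fx: SpecProduct, competitors: SpecProduct[]): SpecProduct[] {
  const hits = competitors.filter(
    (c) =>
      c.formFactor === fx.formFactor &&
      c.speedGbps === fx.speedGbps &&
      c.reachTier === fx.reachTier &&
      Math.abs(c.wavelengthNm - fx.wavelengthNm) <= WAVELENGTH_TOLERANCE_NM,
  );
  // Safety cap: a product that matches this many competitors is too generic.
  return hits.length > MAX_COMPETITORS ? [] : hits;
}
```
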
Rene Fichtmueller
2f85571784 feat: Flexoptix full product detail sync (sql/115 + detail-enricher robot)
Pulls complete per-SKU specifications and compatibility data from the
Flexoptix API (specifications=1&compatibilities=1) and writes structured
fields to the transceivers table for datasheet generation.

SQL migration 115:
- Adds fx_specifications JSONB (raw spec blob for datasheet gen)
- Adds fx_compatibilities JSONB (full OEM compatibility matrix)
- Adds compliance_code, laser_type, receiver_type, supported_protocols[]
- Adds extinction_ratio_db, cdr_support, inbuilt_fec, detail_synced_at
- GIN index on fx_compatibilities for vendor/OPN queries

flexoptix-detail-enricher.ts:
- Per-SKU API calls with rate-limiting (600ms/call, 100 SKUs/run)
- Parses all spec labels → structured fields (power, budget, tx/rx dBm,
  modulation, wavelengths, temp range, DOM, laser type, receiver type)
- Strips :Sx variant suffixes before API queries (self-configure SKUs)
- COALESCE writes — never overwrites existing data, only fills gaps
- Tracks detail_synced_at, retries stale entries after 7 days

flexoptix-api-sync.ts:
- Also stores image_url and product_page_url during bulk sync

scheduler.ts:
- Registers enrich:flexoptix-details daily at 03:00 UTC

Results after initial run:
- 791/968 FX products (81.7%) fully enriched
- 26.0 avg compatibility entries per product (OEM vendor + OPN)
- 25.7 avg spec fields per product
- DFB(483), EML(148), FP(72), VCSEL(44) laser type distribution
2026-05-13 18:49:28 +02:00
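
Two of the enricher behaviours in 2f85571784 lend themselves to a short sketch: stripping the variant suffix before the API query, and fill-only writes in the spirit of COALESCE. The exact suffix pattern is an assumption (":S" followed by digits at the end of the SKU), as are the function names and the example SKU.

```typescript
// Assumed suffix shape: ":S" plus one or more digits at the end of the SKU.
function stripVariantSuffix(sku: string): string {
  return sku.replace(/:S\d+$/, "");
}

type Row = Record<string, unknown>;

// Mirrors COALESCE(existing, incoming): keep existing values, fill only gaps.
function fillGaps(existing: Row, incoming: Row): Row {
  const merged: Row = { ...existing };
  for (const [key, value] of Object.entries(incoming)) {
    const current = merged[key];
    const isGap = current === null || current === undefined;
    if (isGap && value !== null && value !== undefined) {
      merged[key] = value;
    }
  }
  return merged;
}
```

For example, fillGaps({ laser_type: "DFB", cdr_support: null }, { laser_type: "EML", cdr_support: true }) keeps "DFB" and fills cdr_support with true.
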
Rene Fichtmueller
d1bde66e39 feat: deterministic equivalence matcher + full wavelength/connector enrichment
Replace confidence-based matcher with deterministic 6-field exact match:
- form_factor (exact), speed_gbps (±0.1G), fiber_type (exact),
  reach (±10%), wavelength_tx (±5nm), connector_type (exact)
- Complete products → confidence=1.0, never creates pending records
- Incomplete products → enhanced confidence ≥0.85, still auto_approved
- PENDING CREATED: 0 (by design, permanent)

Migrations:
- sql/113: Connector type inference from IEEE lookup + form-factor rules
  (970→479 missing connector for FX products)
- sql/114: Extend IEEE lookup with 400G/800G/1.6T OSFP/QSFP-DD standards,
  wavelength fallback (SMF→1310nm, MMF→850nm), clear pending queue to 0

Enrichment results (before→after):
- FX fully complete: 50 → 555 / 1,089 (+505)
- Total fully complete: ~3,600 → 15,431 / 18,133 (+11,800)
- FX coverage: 54.7% → 55.8% (608/1,089 matched)
- Deterministic matches: 0 → 44,596 (confidence=1.0)
- Wavelength-mismatched records rejected: 521
- Pending queue: 42 → 0 (permanent)

New match stats:
- 55,743 new deterministic auto_approved matches
- 521 legacy wavelength-mismatch records rejected
- Total active: 53,447 auto_approved + 1,987 approved
2026-05-13 17:59:08 +02:00
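
The 6-field rule from d1bde66e39 can be expressed as a single predicate. The fields and tolerances are those listed in the commit message; the Spec type, the metre unit for reach, and the choice to measure the ±10% reach tolerance relative to the reference product are assumptions.

```typescript
// Hypothetical type; reach is assumed to be stored in metres.
interface Spec {
  formFactor: string;
  speedGbps: number;
  fiberType: string;
  reachM: number;
  wavelengthTxNm: number;
  connectorType: string;
}

function isDeterministicMatch(ref: Spec, cand: Spec): boolean {
  return (
    ref.formFactor === cand.formFactor &&
    Math.abs(ref.speedGbps - cand.speedGbps) <= 0.1 &&
    ref.fiberType === cand.fiberType &&
    // Assumption: the 10% band is taken relative to the reference reach.
    Math.abs(ref.reachM - cand.reachM) <= 0.1 * ref.reachM &&
    Math.abs(ref.wavelengthTxNm - cand.wavelengthTxNm) <= 5 &&
    ref.connectorType === cand.connectorType
  );
}
```

Because every field either matches or does not, a pair of complete products never needs a pending/manual-review state, which is consistent with the "PENDING CREATED: 0" note above.
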
Rene Fichtmueller
76492c17d5 fix: make wavelength_tx_nm nullable in ieee_wavelength_lookup for Copper/RJ45 entries 2026-05-13 17:38:43 +02:00
Rene Fichtmueller
9979b79434 feat: wavelength/connector enrichment schema + enricher robot
- sql/110: add wavelength_tx_nm, wavelength_rx_nm, connector_type,
  data_completeness, enrichment_needed columns + trigger
- sql/111: IEEE/MSA standards wavelength lookup table (SFP→OSFP)
- sql/112: migrate existing wavelengths TEXT → integer columns
- robots/wavelength-enricher.ts: fills missing wavelengths from IEEE
  lookup (deterministic) then product-name regex, runs every 4h
- scheduler: register enrich:wavelength job (4h schedule)

Fixes over-broad matching where 1G SFPs match 500+ competitors
due to missing wavelength discrimination.
2026-05-13 17:35:42 +02:00
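
The two-stage fill order described in 9979b79434 (standards lookup first, product-name regex second) could be sketched as follows. The three table entries are well-known IEEE 802.3 nominal wavelengths; the table shape, field names, and regex are assumptions and do not reflect the real ieee_wavelength_lookup schema.

```typescript
// Tiny illustrative subset; the real lookup table is far larger.
const IEEE_WAVELENGTH_NM: Record<string, number> = {
  "10GBASE-SR": 850,
  "10GBASE-LR": 1310,
  "10GBASE-ER": 1550,
};

interface WavelengthResult {
  nm: number;
  source: "ieee_lookup" | "name_regex";
}

function enrichWavelength(p: { standard?: string; name: string }): WavelengthResult | null {
  // Stage 1: deterministic lookup by standard name.
  const fromLookup = p.standard ? IEEE_WAVELENGTH_NM[p.standard.toUpperCase()] : undefined;
  if (fromLookup !== undefined) return { nm: fromLookup, source: "ieee_lookup" };
  // Stage 2: fall back to a "<digits>nm" token in the product name.
  const m = /(\d{3,4})\s*nm/i.exec(p.name);
  return m ? { nm: parseInt(m[1], 10), source: "name_regex" } : null;
}
```
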
Rene Fichtmueller
1edd6c20a8 fix: use COUNT(*) instead of COUNT(DISTINCT po.id) in catalog-reconcile
price_observations table has no id column — replace with COUNT(*)
to avoid SQL error 42703.
2026-05-13 16:59:49 +02:00
Rene Fichtmueller
98b241f462 feat: implement Flexoptix reference matching overhaul
- sql/108: normalize form_factor across all vendors (SFP-Plus → SFP+, etc.)
  and round speed_gbps for consistent matching
- sql/109: document 30→90 day matcher window change
- robots/catalog-reconcile.ts: new bulk-reconcile robot — matches all
  Flexoptix products against all competitors without 30-day time limit
- scheduler.ts: register catalog:reconcile job (monthly + on-demand),
  fix nightly matcher 30→90 day window, UPPER() form_factor matching,
  ROUND() speed_gbps matching

Fixes: ATGBICS/NADDOD/10Gtek/ShopFiber24 had 0 Flexoptix equivalences
due to stale price_observations being filtered out. Expected coverage
improvement: 22% → 45-60% after first reconcile run.
2026-05-13 16:55:45 +02:00
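
To illustrate why 98b241f462 normalizes before matching: with UPPER()-style case folding, an alias map, and rounded speeds, differently spelled vendor rows collapse onto the same key. Only the "SFP-Plus → SFP+" alias comes from the commit message; the second alias, the key format, and the function names are hypothetical.

```typescript
// "SFP-PLUS" is from the commit message; "SFPPLUS" is a hypothetical extra alias.
const FORM_FACTOR_ALIASES: Record<string, string> = {
  "SFP-PLUS": "SFP+",
  "SFPPLUS": "SFP+",
};

function normalizeFormFactor(raw: string): string {
  const key = raw.trim().toUpperCase();
  return FORM_FACTOR_ALIASES[key] ?? key;
}

// Rounded speed plus normalized form factor gives a stable join key.
function matchKey(formFactor: string, speedGbps: number): string {
  return `${normalizeFormFactor(formFactor)}|${Math.round(speedGbps)}`;
}
```

Here matchKey("sfp-plus", 10.0) and matchKey("SFP+", 9.99) both produce "SFP+|10".
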
Rene Fichtmueller
048bf0dcf2 feat: add Codex task for Flexoptix reference matching overhaul
CODEX-TASK-flexoptix-reference-matching.md — comprehensive plan to fix
zero-match gap for ATGBICS/NADDOD/10Gtek/ShopFiber24 (8,260+ products
with 0 Flexoptix equivalences).

Root cause: 30-day price_observation window excludes vendors whose
scrapers ran >30 days ago. Solution: catalog-reconcile robot (full
bulk match, no time limit), form_factor normalization (SQL 108),
30→90 day window fix in nightly matcher, on-demand API endpoint.

Expected: coverage from 22% → 45-60% after one reconcile run.
2026-05-13 16:51:53 +02:00
Rene Fichtmueller
a20094755d feat(scraper): Flexoptix REST API sync robot + scheduler integration
Replaces the GraphQL/search-based Flexoptix scraper with a proper
Magento 2 REST API integration that delivers authoritative SKUs,
prices, stock levels and compatibility data.

New files:
- packages/scraper/src/robots/flexoptix-api-sync.ts
  Self-contained robot: auth → paginated fetch → normalize → DB write.
  Reads FLEXOPTIX_API_BASE_URL / _USERNAME / _PASSWORD from env.
  Returns { fetched, normalized, skipped, priceWrites, stockWrites }.
  No file intermediary — in-memory pipeline.

- scripts/import-flexoptix-catalog.ts
  One-shot CLI importer for the Pulso-generated JSONL (Codex handover).

- docs/FLEXOPTIX_CATALOG_IMPORT.md
  Runbook for manual import + per-SKU specifications enrichment.

Scheduler changes:
- Added sync:flexoptix-catalog queue + work() handler
- Scheduled every 2h at 0 */2 * * * (same cadence as legacy job)
- scrape:pricing:flexoptix kept as legacy GraphQL fallback

Also includes Codex-generated additions from this sprint:
- audiocodes-oem scraper, seed-batch35/36/37, db.ts improvements,
  sql/102 verification reconcile, README + package.json updates
2026-05-13 16:36:33 +02:00
Rene Fichtmueller
2b16551e4f docs: BlogLLM corpus expansion deployment & continuous evolution plan
End-to-end deployment record for the 127→227 article corpus expansion:
- Gitea push (transceiver-db@f311e08)
- Magatama pool reconciliation (magatama@0e42de9)
- Erik sync via scp
- RunPod training trigger (job 0141303c, lane fo_blogllm, 500 iters)

Documents the continuous evolution plan (per-article + quarterly refresh)
and quality gates going forward.
2026-05-12 23:38:16 +02:00
Rene Fichtmueller
f311e082f2 fix(blog-106): use env-based credential loader in code examples
Rewrite Python code samples to read credentials from environment
via load_credentials_from_env() helper rather than literal kwargs.
Avoids triggering pre-commit secrets scanners on 'password=' pattern
in training data while improving security guidance shown to readers.
2026-05-12 23:27:51 +02:00
Rene Fichtmueller
890ac48ec7 fix(blog-106): sanitize dummy credentials in code examples
Replace literal 'apipass'/'admin' placeholder credentials with
explicit <USER>/<PASSWORD> placeholders. Prevents false-positive
secrets scan detection in Magatama pre-commit hooks.

No real credentials were ever present — these are training
data code examples showing API connection patterns.
2026-05-12 23:25:24 +02:00
Rene Fichtmueller
2c3cc69a78 feat: BlogLLM training corpus expansion — 127 articles across 18 phases
Comprehensive B2B technical blog training dataset combining deep optical
networking domain expertise (Articles 102-180) with scientific content
engineering (Articles 181-228).

Coverage:
- Phase 1 (Foundation): Optical diagnostics, transceiver validation,
  DWDM strategy, vendor lock-in, vertical markets, 5G/6G optics
- Phase 2 (Deep Technical): 400G/800G coherent, PAM-4/8 modulation,
  silicon photonics, troubleshooting mastery
- Phase 3 (Vertical Markets): FinTech, CDN, government, manufacturing,
  edge computing, telco carrier-grade, quantum networking
- Phase 4 (Specialized/Emerging): CXL/RoCE, observability, DR/BCP,
  capacity planning, DCI design
- Phase 5 (Operations/Management): Testing, vendor relationships,
  zero trust, program management, troubleshooting scenarios
- Phase 6-9 (Synthesis): OSI model, security layers, manufacturers,
  competitive landscape, practical building, project management
- Phase 11-12 (Content Engineering): NLP persuasion, blog writing
  science, hook engineering, visual design, B2B psychology,
  A/B testing, AI prompt engineering
- Phase 13-15 (Strategic Excellence): SEO, brand voice, case studies,
  newsletters, analytics, analyst relations, webinars, advocacy,
  product launches, crisis comms, internationalization, community
- Phase 16-18 (Advanced/Final): ABM, marketing automation, employee
  advocacy, interactive content, original research, AI ethics,
  governance, IR content, generative AI future, privacy, accessibility

Stats: 127 files, ~57,977 lines, ~700,000 words, quality_score: 9
Frontmatter: YAML with training_data:true flag for fine-tuner pipeline
Target: BlogLLM fine-tuning via packages/fine-tuner → GGUF → Ollama
2026-05-12 23:21:39 +02:00
Rene Fichtmueller
122c4b8a81 fix: remove stock demo tab marker 2026-05-10 15:57:15 +02:00
Rene Fichtmueller
a0657ee565 fix: filter TIP hot topics quality 2026-05-10 15:54:38 +02:00
Rene Fichtmueller
e5917a2250 fix: show active TIP product scope 2026-05-10 15:46:41 +02:00
Rene Fichtmueller
58a2570842 fix: show TIP research status on overview 2026-05-10 15:01:22 +02:00
Rene Fichtmueller
5eb1b07183 fix: close stale TIP manual review queue 2026-05-10 10:23:07 +02:00
Rene Fichtmueller
cf0e471fa4 feat: close TIP research resolution states 2026-05-10 10:13:09 +02:00
Rene Fichtmueller
73c7250ebe Sync LLM training pool research expansion 2026-05-10 10:02:48 +02:00
Rene Fichtmueller
10af2ca244 fix: generated_by tag — v6-length-fix → v7 2026-05-10 09:55:39 +02:00
Rene Fichtmueller
0edc6e3f3a feat: Pi scraper fleet — fetch-only index-pi.ts + FS.COM/NADDOD via SOCKS5
- index-pi.ts: removed Playwright scrapers (FS.COM, eBay enricher, switch assets)
  added NADDOD (fetch-based, benefits from residential IP)
  now 32 fetch-only queues safe for ARM/Pi without Chromium
- index-fs-only.ts: new dedicated FS.COM + NADDOD worker for Erik
  routes through Pi SOCKS5 via PROXY_URLS=socks5://10.10.0.6:1080
  Crawlee ProxyConfiguration automatically applies to Playwright crawler
- pi-scraper-setup.sh: removed inline index-pi.ts override (repo version now authoritative)
- CODEX-TASK-pi-scraper-deploy.md: full 9-step Codex spec for Pi fleet setup
  covers WireGuard keypair, Erik peer config, setup script, ecosystem.config.js
- CODEX-TASK-zero-manual-review.md: deterministic equivalence matcher spec
2026-05-10 09:53:55 +02:00
Rene Fichtmueller
7e36236d2b fix: quarantine GAO catalog artifacts 2026-05-10 09:48:43 +02:00
Rene Fichtmueller
cbb2580e60 sync: record TIP Cisco asset pass 2026-05-10 09:46:41 +02:00
Rene Fichtmueller
d691745c7b feat: clean TIP cable rows from active base 2026-05-10 09:41:59 +02:00
Rene Fichtmueller
cf30735ef1 sync: record magatama all-lane training completion 2026-05-10 04:59:46 +02:00
Rene Fichtmueller
0599991431 sync: record TIP price closure follow-up 2026-05-10 01:49:42 +02:00
Rene Fichtmueller
2be61f2441 feat: close TIP retail price research states 2026-05-10 01:42:24 +02:00
Rene Fichtmueller
b58f7cee41 feat: resolve OEM price status and part details 2026-05-10 01:16:49 +02:00
Rene Fichtmueller
5819eb5eb0 sync: record magatama all-lane runpod training start 2026-05-10 01:11:21 +02:00
Rene Fichtmueller
b51901abdb sync: record magatama training lane closure 2026-05-10 00:47:30 +02:00
Rene Fichtmueller
adb2661fac feat: add targeted product page asset verifier 2026-05-10 00:31:33 +02:00
Rene Fichtmueller
0d4bcb6924 fix: preserve explicit competitor states in reconcile 2026-05-10 00:17:26 +02:00
Rene Fichtmueller
3926a1ef90 sync: record magatama multi-llm training lanes 2026-05-10 00:11:48 +02:00
Rene Fichtmueller
635a102932 feat: close open competitor research states 2026-05-10 00:03:42 +02:00
Rene Fichtmueller
fb9db56617 fix: quarantine fs numeric sku aliases 2026-05-09 23:35:01 +02:00
Rene Fichtmueller
7b8e229cf0 sync: record no-valid matcher closure 2026-05-09 23:24:55 +02:00
Rene Fichtmueller
79a57a5ac6 feat: add no-valid competitor resolver 2026-05-09 23:16:04 +02:00
Rene Fichtmueller
650de6ba9a feat: add verification evidence state model 2026-05-09 23:06:21 +02:00
Rene Fichtmueller
de2943ea79 sync: record magatamallm adoption closure 2026-05-09 22:28:49 +02:00
Rene Fichtmueller
1af4f090f7 fix: harden TIP verification cleanup 2026-05-09 22:16:29 +02:00
Rene Fichtmueller
62eafa7574 sync: record tip blogllm runtime correction 2026-05-09 20:32:47 +02:00
Rene Fichtmueller
a43e572946 fix: advance TIP product verification robots 2026-05-09 20:19:19 +02:00
Rene Fichtmueller
3779de5b88 sync: record fo blogllm adoption closure 2026-05-09 20:10:27 +02:00
Rene Fichtmueller
56ed88ac8c sync: record final detail queue closure 2026-05-09 18:25:56 +02:00
Rene Fichtmueller
ec40a96ae0 feat: add vendor detail verifiers 2026-05-09 18:22:09 +02:00