transceiver-db

Author	SHA1	Message	Date
Rene Fichtmueller	633478595b	fix: seed script accepts category as fallback for missing type field	2026-04-07 01:24:15 +02:00
Rene Fichtmueller	f5c714d58c	feat: add blog training articles 041-055 for fo-blog-v2 fine-tuning 15 expert articles covering: CPO/silicon photonics 2026, 800G OSFP vs QSFP-DD, 400ZR/OpenZR+/ZR+ comparison, laser safety, OSNR/link budget, counterfeit detection, DOM deep dive, 400G DR4/FR4/LR4, WDM primer, temp grades, spine-leaf strategy, proactive replacement, OEM lock-in, OM3/4/5, lifecycle management.	2026-04-07 01:08:27 +02:00
Rene Fichtmueller	d617524f10	feat(training): add blog-031 through blog-040 — 10 expert articles Topics: CWDM4/PSM4, MSA compliance, DAC/AOC TCO, grey vs DWDM, ESD damage, tunable DWDM, FEC deep-dive, CPO hype cycle, CMIS 4.0, vendor evaluation. Ø 1,180 words each.	2026-04-06 18:15:46 +02:00
Rene Fichtmueller	61685f3959	Merge remote-tracking branch 'github/main' # Conflicts: # packages/api/src/llm/fo-blog-pipeline.ts # packages/api/src/routes/blog.ts # packages/scraper/src/scheduler.ts # packages/scraper/src/scrapers/fs-com.ts # packages/scraper/src/scrapers/gbics.ts	2026-04-06 18:03:36 +02:00
Rene Fichtmueller	e55c0ad55f	feat(training): add blog-016 through blog-030 — 15 expert training articles Adds 15 Sonnet-quality blog articles for fo-blog-v1 fine-tuning: tutorials, comparisons, tech deep-dives covering 400G/800G topics. Also adds seed-blog-training-data.py script for learning_corpus import.	2026-04-06 17:59:14 +02:00
Rene Fichtmueller	9744be5326	ui: show creation time (HH:MM) alongside date in blog list	2026-04-06 17:07:05 +02:00
Rene Fichtmueller	a99b4aab78	fix: correct verified badge, comparable pricing, and clickable product images - Reset details_verified=false for 298 products where reach_label is empty (DB migration) - Runtime check in dashboard: dVer requires non-empty reach_label regardless of DB flag - comparable price query: treat reach_meters=0 same as NULL so 800G OSFP products find FS.com equivalent prices (was blocked by reach_meters=0 != NULL shortcircuit) - Product image area now fully clickable with vendor link overlay when product_page_url exists - Clear wrong image for O.Czz8HG.z.R (was showing unrelated OSFP product image)	2026-04-06 10:24:39 +02:00
Rene Fichtmueller	b9301d890f	feat(training): add blog-014 new_product and blog-015 competitor_analysis Completes training data coverage for all 8 blog types: market_alert(2), comparison(1), technology_deep_dive(4), tutorial(3), hype_cycle(1), buying_guide(1), migration_guide(1), new_product(1), competitor_analysis(1) — 15 gold-standard articles total	2026-04-06 04:16:00 +02:00
Rene Fichtmueller	6e383996e3	fix(llm): checkHealth uses key presence check, not live API call Live Anthropic API call during health check causes 429 when the pipeline is actively running, blocking all subsequent regenerate requests.	2026-04-06 04:07:21 +02:00
Rene Fichtmueller	7533202723	feat: add 2 more gold-standard blog training articles (13 total) - blog-012: technology_deep_dive — coherent vs direct-detect decision framework - blog-013: market_alert — transceiver price cycle, when to buy Training set now covers: market_alert(2), comparison(1), technology_deep_dive(4), tutorial(3), hype_cycle(1), buying_guide(1), migration_guide(1) — 13 total	2026-04-06 03:09:55 +02:00
Rene Fichtmueller	cbf449da76	docs: update training data README to reflect 11 articles	2026-04-06 02:55:34 +02:00
Rene Fichtmueller	070477d67f	feat: add 4 more gold-standard blog training articles for BlogLLM Adding diverse topic coverage: - blog-008: buying_guide — OEM vs compatible real cost numbers - blog-009: migration_guide — 100G→400G what actually breaks - blog-010: technology_deep_dive — QSFP-DD vs OSFP form factor reality - blog-011: tutorial — transceiver procurement checklist All follow FO rules: no markdown headers in body, no bullet lists, one thesis, engineer voice, ~1000 words. Total training set: 11 articles.	2026-04-06 02:55:10 +02:00
Rene Fichtmueller	4989c4affd	fix(blog): fix claudeQueue deadlock from recursive 429 retry The generateClaude() function was recursively calling itself inside enqueueClaude(), creating a circular Promise dependency that permanently deadlocked the claudeQueue. Any 429 rate-limit response would poison the queue, blocking all future Claude API calls until server restart. Fixes: - Split retries into claudeApiCall() which is called from enqueueClaude (not re-entering the queue on retry = no circular dependency) - Max 3 retries with increasing backoff (10s/30s/60s) - Add resetClaudeQueue() exported function - Add 15-minute auto-reset stall detection to enqueueClaude - Expose resetClaudeQueue in POST /api/blog/llm/reset-queue endpoint - Fix merge conflict markers in index.ts (duplicate scraperRouter import)	2026-04-06 02:51:28 +02:00
Rene Fichtmueller	6fb9b6eb4f	feat(sql): migrations 026+027 for price cleanup and FS.COM EUR fix 026: Remove invalid price observations (sub-manufacturing-cost), disable optictransceiver.com (domain repurposed as plant shop), fix verification function to accept low/medium/high data_confidence values 027: Clean up FS.COM USD→EUR converted prices, force re-scrape with new de.fs.com EUR-primary scraper	2026-04-06 02:22:00 +02:00
Rene Fichtmueller	ba24d33858	fix(scrapers): replace bot User-Agents with Chrome UA + disable dead domain - 16 commercial scrapers: replace TIP-Bot/1.0 with Chrome/120 UA (GBICS confirmed returning 0 bytes for bot UA, Chrome UA returns 200KB) - gbics.ts: fix User-Agent (was returning empty HTML, now returns products) - optictransceiver.ts: disable — domain repurposed as plant shop (2026-04-06) Alocasia Regal Shield is not a transceiver.	2026-04-06 02:17:50 +02:00
Rene Fichtmueller	9f0ba2069c	feat: add 7 gold-standard blog training articles for BlogLLM Reference quality articles covering: 400G DR4 pricing, vendor lock-in, silicon photonics, fiber plant readiness, 400ZR reality check, DOM diagnostics, 800G readiness. All follow strict FO Blog Pipeline rules — no markdown headers, no spec dumps, one thesis per article.	2026-04-06 01:58:05 +02:00
Rene Fichtmueller	75a5b7318a	fix(scraper): switch fs-com to de.fs.com for EUR prices as primary source EUR prices scraped verbatim from de.fs.com — no conversion needed. USD derivation (EUR→USD) happens downstream, not EUR←USD. Fixes price discrepancy: TIP showed USD 999×0.92=EUR 866 vs real €948 on de.fs.com.	2026-04-06 01:24:47 +02:00
Rene Fichtmueller	e0db86252b	fix: parsePrice requires currency symbol + uses largest number to avoid misreads Root cause of fake prices (e.g. 1.30 for 800G OSFP): - parsePrice accepted any bare number without currency symbol - Could misread stock counts, page numbers, or CSS values as prices - Also picked the first number, not the main price Fix: - Require explicit currency symbol or decimal format (1234.56) - Use the LARGEST number found in the price string - Returns price=0 (rejected) when no valid price pattern found	2026-04-06 01:19:25 +02:00
Rene Fichtmueller	1ac03bae0a	fix: preserve user-provided title in blog generation + price floor validation - blog/generate now uses caller title when provided; falls back to template - Migration 027: hard price floor by speed class in verification function (no medians, no estimates — only real prices above minimum thresholds) - Deleted 474 obviously wrong price observations (shipping costs scraped as prices)	2026-04-06 01:14:37 +02:00
Rene Fichtmueller	a8faf3798b	fix: show price_verified_eur as fallback price + strict badge logic - Price column now shows price_verified_eur (in EUR, dimmed) when street_price_usd is null Fixes: FS.COM products showing dash while being marked fully verified - Badge logic now requires visible price AND image_verified AND details_verified No more badge when price displays as dash — all requirements must be visually present	2026-04-06 01:04:44 +02:00
Rene Fichtmueller	4e813024f1	fix: serialize Claude API calls via queue to prevent 429 rate-limit spam Tier-1 Anthropic API has 40K TPM — with ~20K tokens per pipeline step, concurrent calls immediately hit the limit. enqueueClaude() serializes all generateClaude() calls so only one runs at a time, eliminating the flood of 429-retry-429-retry loops.	2026-04-06 00:57:03 +02:00
Rene Fichtmueller	b2f3a4c450	feat: add Anthropic Claude provider to blog LLM client - Auto-routes to Claude API when BLOG_LLM_PROVIDER=anthropic + ANTHROPIC_API_KEY set - Fallback to Ollama queue when key not present - Add rate-limit retry (429 → 10s backoff) for Claude API - Add STEP_TECHNICAL_SANITY, STEP_SELF_HEAL, STEP_TITLE_CONTRACT_CHECK prompts - Fix STEP_LINKEDIN_POST angle-specific hooks, remove Gold Reference repetition	2026-04-06 00:21:48 +02:00
Rene Fichtmueller	b4d9bfc9d1	fix(blog): hard story blacklist in STEP4 + LinkedIn — ban 2AM/dirty connector/lab-vs-prod stories 10 specific story patterns banned directly in draft prompt. LinkedIn banned hooks: 'Everything looks fine', 'CRC creeping', 'Same optics same setup'. Title Contract now injected into STEP4 as binding constraint.	2026-04-05 23:55:56 +02:00
Rene Fichtmueller	65159bd57d	feat(blog): Post to Ghost + LinkedIn buttons in dashboard - 'Post on blog.fichtmueller.org' → publishes via Ghost Admin API - 'Post on LinkedIn' → modal with text + copy + open LinkedIn - Ghost integration: TIP Blog Engine (JWT auth, mobiledoc format)	2026-04-05 23:33:58 +02:00
Rene Fichtmueller	6c818c79b5	feat(blog): Title Contract + Technical Sanity Check + Self-Heal + angle-aware LinkedIn generator Pipeline now has 21 steps: - STEP0: Title Contract binds LLM to headline promise - STEP19: Technical Sanity Check (optical engineering accuracy) - STEP20: Self-Heal (auto-fix technical errors preserving tone) - STEP21: Title Contract Verification (final gate check) - LinkedIn generator is now angle-aware (no more default Physical Layer hook)	2026-04-05 23:11:16 +02:00
Rene Fichtmueller	7c8f545c18	fix(blog): anti-repetition engine — 6 angle types, forbidden structures, existing article context injection	2026-04-05 22:47:15 +02:00
Rene Fichtmueller	957632c228	fix(blog): raise word target to 1200-1600, fix power-budget false positive in validateArticle	2026-04-05 20:49:22 +02:00
Rene Fichtmueller	c4b1a20992	fix(scraper): use false instead of null for image_verified on insert	2026-04-05 12:15:10 +02:00
Rene Fichtmueller	d3780ef4fc	feat(dashboard): add data verification status section to overview tab	2026-04-05 12:11:23 +02:00
Rene Fichtmueller	a1223e8967	fix: make /api/hot-topics public — dashboard fetch has no auth token	2026-04-05 12:07:44 +02:00
Rene Fichtmueller	6d7b067ca9	fix: resolve merge conflict in index.ts + add untracked blog-sll, news, sql migration	2026-04-05 11:51:07 +02:00
root	161f045bc7	fix: mount blogSllRouter + scraperRouter — SLL and Crawler Intelligence routes were missing	2026-04-05 09:50:30 +00:00
Rene Fichtmueller	1e754625c5	feat(scraper): all pricing scrapers to 2h 24/7 — full competitor coverage, no gaps	2026-04-05 01:32:08 +02:00
Rene Fichtmueller	6d865cabb9	feat: 4th verification criterion (Competitor) + scraper frequency FS/10Gtek/ProLabs to 2h	2026-04-05 01:28:46 +02:00
Rene Fichtmueller	e496a91dd5	feat(blog): AEM/APM pipeline steps + SLL context builder + LinkedIn v2 prompts	2026-04-05 01:26:09 +02:00
Rene Fichtmueller	15f3ff5bef	fix: include linkedin_post in GET /api/blog response for SLL matching	2026-04-05 01:24:52 +02:00
Rene Fichtmueller	d9f5fc253f	fix(verification): 100% Verified badge was dramatically too generous CORE PROBLEMS FIXED: 1. ATGBICS part_number = URL slug instead of the real OEM number; extractOemPartNumber() removes the -r-compatible-transceiver-* suffix + trailing vendor names (nokia, cisco, juniper, ...) Result: 3he16564aa-nokia-r-compatible-transceiver-... → 3HE16564AA 2. reach_label = '' (empty) was accepted as details_verified; IS NOT NULL allows empty strings → Fix: AND reach_label != '' 3. details_verified = true despite garbled part_number; new criteria: NOT ILIKE '%-compatible-transceiver%' NOT ILIKE '%-r-compatible%' 4. data_confidence values wrong in function ('scraped_unverified' etc); real values: low/medium/high/garbage → NOT IN ('garbage','unknown') RESULT after recompute_all_verification(): fully_verified: 3,654 → 581 (badge was overstated 6x) details_verified: inflated → 1,075 (correct) ATGBICS scraper: - extractOemPartNumber() for collection and product detail pages - detectReach() now also runs on the URL slug (120km in slug → reach_label) Price Anomaly Detection: - API: price_anomaly field when max/min ratio ≥ 10x - Dashboard: ⚠ price anomaly banner with ratio + EUR range SQL 025: Part number cleanup (30 records), reach from slug (12 records)	2026-04-04 15:41:57 +02:00
Rene Fichtmueller	a93fc8679e	fix(scrapers): Flexoptix Catalog shows 0 records instead of 963 SCRAPERS list used 'flexoptix-catalog' as DB lookup key but vendors.slug is 'flexoptix'; no match → 0 records shown. Fix: added dbSlug override field to SCRAPERS entries; lookup now uses dbSlug || name so flexoptix-catalog/vendors/supported all map to the correct 'flexoptix' slug in sourceMap.	2026-04-04 15:26:04 +02:00
Rene Fichtmueller	4a53f3c45d	feat: blog engine v5 — Auto-Kill Layer, 16-step pipeline, longer content Upgrades FO Blog Pipeline from 14 to 16 steps: - NEW Step 8d: Auto-Kill Layer v1.0 (10 systematic categories A-J) - NEW Step 15: Auto-Kill Scoring (cleanliness, narrative, non-AI, relevance) - Updated banned phrases from Gold-standard editorial feedback - Soft Delete List for conditional phrases - Auto-Kill categories: spec blocks, formulas, section leakage, generic transitions, repeated concepts, SKU mentions, false authority, over-explained basics, whitepaper tone, fake precision Content length changes per user feedback: - Blog target: 1,200-2,000 words (was 700-1,000) — thorough and detailed - LinkedIn target: 2,000-2,800 chars (was 350-600) — use maximum length - Reduction pass: 25-30% cut (was 15-25%) — remove weak, keep depth	2026-04-04 11:02:45 +02:00
Rene Fichtmueller	4a501d4461	feat: blog engine v3 — 8-stage pipeline with Auto-Kill Layer Complete rewrite of blog prompts and pipeline based on editorial Gold-standard feedback. Replaces 3-pass system with 8-stage pipeline: 1. Master generation (narrative voice, no spec dumps) 2. Narrative Control (kill visible structure, enforce flow) 3. Auto-Kill Layer (remove AI phrases, spec residue, repetition) 4. Reduction Engine (cut 40% — keep strongest ideas only) 5. Depth pass (add specifics where vague, no spec dumps) 6. Quality Control (hard delete list validation) 7. Procurement layer (optional, sales audience) 8. LinkedIn post generation (new) Key changes: - System prompt rewritten with Hard Delete List (29 banned phrases) - Soft Delete List for conditional phrases - Auto-Kill categories A-J (spec blocks, formulas, whitepaper tone, etc.) - Master prompts enforce continuous narrative, no section headings - Word count targets reduced (800-1200 instead of 1500+) - Scoring pass added (cleanliness, narrative, non-AI feel, relevance) - LinkedIn companion post auto-generated - Context data injection reduced (fewer items, no dump instructions)	2026-04-04 10:52:31 +02:00
Rene Fichtmueller	4a21967f41	feat(blog): Spec dump hard fail + Gold Standards 6 + LinkedIn v2 - System prompt: SPEC DUMP ABSOLUTE HARD FAIL block (before FORMAT rules) TX/RX tables, multi-optic comparison blocks, repeated sections = hard fail Behavioral prose rule: "what actually happens" not "what the spec says" - STEP9 QA: check 12a SPEC DUMP — removes datasheet blocks, flags duplicate sections (e.g. "fiber types" twice), spec-heavy intros - Gold Standard 6: 400G/800G deep dive corrected (8.8→10/10) zero spec tables, pure behavioral narrative, 3 core ideas max, ending is reframe not checklist - LinkedIn Gold Example 2: sharper short format (346 chars vs 700) reframe hook, short beats without bullet markers, no emoji, 4 hashtags - STEP_LINKEDIN_POST: rewritten with new gold format optimal 350-600 chars, beat rhythm, no bullet markers, gold example inline - WRONG PATTERNS: +7 new entries (spec dump, duplicate section, LinkedIn bullet list, LinkedIn "excited to share" hook, LinkedIn >800 chars)	2026-04-04 09:32:01 +02:00
Rene Fichtmueller	4eddbfbc7c	feat(blog): Reduction Engine v1.0 + LaTeX/connector hard fails - Replace STEP8b_REDUCTION with 5-pass Reduction Engine: Pass 1: Repetition Kill (one concept, one home) Pass 2: Tech Prune (LaTeX hard delete, SKU removal, formula prose replacement) Pass 3: Flow Rebuild (close gaps after cuts, no new content) Pass 4: Weight Correction (title/content alignment throughout) Pass 5: Humanization (rhythm variation, hedge removal, punch ending) Target: 700-1000 words (600-1300 range, warnings outside) - System prompt + STEP9 QA: add hard fails for LaTeX formulas (\[...\], \frac{}, \text{} etc) — destroys blog flow DR4 connector error (DR4=MPO-12, not LC duplex; FR4=LC duplex) Title/content mismatch (title topic must be the spine, not just the intro) - Gold Standard 5: market alert / pricing article template (correct title alignment, no LaTeX, DR4=MPO-12, ending on topic) - WRONG PATTERNS extended with 4 new entries covering above failures - blog.ts: step log messages updated to 11-14/14; word count output shows % reduction and range warning (>1300 or <600)	2026-04-04 08:57:21 +02:00
Rene Fichtmueller	b180fe57ee	ui: blog detail — separate blog article + linkedin post sections with copy buttons and char count badge	2026-04-04 08:35:33 +02:00
Rene Fichtmueller	15d02108aa	chore: changelog — blog engine v5 + linkedin post 2026-04-04	2026-04-04 08:30:54 +02:00
Rene Fichtmueller	7db9fad108	feat: blog engine v5 — narrative control + linkedin post + min words fix - STEP4b_NARRATIVE_CONTROL: new pipeline step after draft; detects wrong narrative (technology blamed instead of processes), applies anti-FUD filter, reality reframe ("this becomes a problem when..."), Flexoptix voice check - System prompt: NARRATIVE CONTROL RULE added as absolute rule #1 - Gold Standard 4: corrected "compatible vs OEM" article added as reference - Minimum words: STEP4 raised from 1500 to 2500 words (final output was 750) - Reduction pass: 25-35% → 15-25%, target 1500-2000 words final - STEP_LINKEDIN_POST: generates LinkedIn post ≤2800 chars (hard limit 3000); stores in blog_drafts.linkedin_post + linkedin_char_count column - Pipeline now 14 steps: v5-narrative-control - Migration 024: linkedin_post + linkedin_char_count columns in blog_drafts	2026-04-04 08:30:27 +02:00
Rene Fichtmueller	74dcc14e1e	chore: changelog — proxy network geo/uptime fixes 2026-04-04	2026-04-04 08:15:58 +02:00
Rene Fichtmueller	5091a7b75f	feat: proxy network — geo-lookup, uptime tracking, dedup fix - IP geo-lookup via ip-api.com on register/heartbeat (country_code, city) - heartbeat_count column + uptime_pct computation on every heartbeat - Deduplication: register returns existing token for same IP+port - Heartbeat no longer overwrites registered IP (prevents IPv6 churn conflicts) - Migration 023: heartbeat_count column + backfill existing nodes	2026-04-04 08:15:32 +02:00
Rene Fichtmueller	5d53d3af6f	docs: update changelog 2026-04-03/04 — scraper fixes, blog engine v4, proxy network, pg-boss fix	2026-04-04 07:58:29 +02:00
Rene Fichtmueller	5c9cf0e9b5	feat: blog engine v4 (reduction+style-lock passes) + flexoptix scraper fixes Blog engine (fo-blog-pipeline.ts): - Add STEP8b_REDUCTION: cuts article 25-35%, removes repeated concepts - Add STEP8c_STYLE_LOCK: enforces tone consistency, fixes scope/OPM confusion, removes inline SKUs from article flow - Add Gold Standard 3 to calibration (Style B troubleshooting example 2026-04-04) - Pipeline now 12 steps (was 10), version bumped to v4-reduction-stylelock blog.ts: - Wire STEP8b and STEP8c into pipeline between Kill-AI-Tone and QA Check - Update progress tracking to 12 total steps - Update pipeline_version to 'v4-reduction-stylelock' flexoptix-catalog.ts: - Fix contentHash call: pass object directly, not JSON.stringify(object) db.ts: - price_verified=true set in content_hash early-return path (no new observation) - image_verified=true auto-set in findOrCreateScrapedTransceiver on INSERT/UPDATE	2026-04-04 07:50:01 +02:00
Rene Fichtmueller	441d9721c9	fix: flexoptix catalog scraper — 1G SFP coverage + SKU suffix + pagination - Add 1G SFP search queries ("1G SFP", "SFP LX", "SFP SX", "SFP ZX") — were completely missing - Strip vendor-compat suffix from SKU (S.1303.10.DG:Sx → S.1303.10.DG) to match existing records - Remove 200-product cap, use full API pagination (page >= 50 limit only) - Result: FLEXOPTIX 1G SFP coverage 50% → 97%, overall price coverage 62% → 88%	2026-04-04 07:26:13 +02:00
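
The price-parsing rules listed in commit e0db86252b (require a currency symbol or a decimal format, take the largest number, return 0 when nothing valid is found) can be illustrated with a minimal sketch. This is not the repository's `parsePrice`: the function name `parsePriceSketch`, the helper `toNumber`, the three regular expressions, and the set of recognised currency marks are all assumptions made for illustration.

```typescript
// Hypothetical sketch of the rules described in commit e0db86252b.
// All names and patterns here are assumptions, not the real parsePrice.

// Assumed set of currency marks; the real scraper may accept others.
const CURRENCY_MARK = /[$€£]|\b(?:USD|EUR|GBP)\b/i;
// "Decimal format" read as: a digit, a separator, exactly two digits.
const DECIMAL_FORMAT = /\d[.,]\d{2}(?!\d)/;
// A run of digits with optional . or , separators inside it.
const NUMBER_TOKEN = /\d[\d.,]*\d|\d/g;

// Convert one numeric token to a number. A final separator followed by
// one or two digits is read as the decimal point; any other separator
// is treated as a thousands separator and dropped.
function toNumber(token: string): number {
  const m = token.match(/^(.*)[.,](\d{1,2})$/);
  if (m) {
    const [, whole = "", frac = ""] = m;
    return Number(whole.replace(/[.,]/g, "") + "." + frac);
  }
  return Number(token.replace(/[.,]/g, ""));
}

// Returns 0 (rejected) unless the text carries a currency mark or a
// decimal-formatted number; otherwise returns the largest number found.
function parsePriceSketch(raw: string): number {
  if (!CURRENCY_MARK.test(raw) && !DECIMAL_FORMAT.test(raw)) {
    return 0;
  }
  const values = (raw.match(NUMBER_TOKEN) ?? [])
    .map(toNumber)
    .filter((n) => Number.isFinite(n));
  return values.length > 0 ? Math.max(...values) : 0;
}
```

Under these assumptions a bare stock count such as `"12"` is rejected, while a string that mixes a stock count with a marked price resolves to the price because the largest number wins. Taking the largest number is a heuristic, so a floor check such as the one mentioned in commit 1ac03bae0a would still be needed downstream.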


254 Commits
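
The part-number cleanup described in commit d9f5fc253f (drop the `-r-compatible-transceiver-*` suffix, drop trailing vendor names, upper-case the result) might look roughly like the sketch below. It is an illustration only: the name `extractOemPartNumberSketch`, the `VENDOR_NAMES` list (limited to the three vendors the commit message names), and the exact regular expression are assumptions, and the real `extractOemPartNumber()` may differ.

```typescript
// Hypothetical sketch of the cleanup described in commit d9f5fc253f.
// The vendor list and regex are assumptions for illustration.
const VENDOR_NAMES: string[] = ["nokia", "cisco", "juniper"];

function extractOemPartNumberSketch(slug: string): string {
  // Drop the marketing suffix and everything that follows it.
  let part = slug.toLowerCase().replace(/-r-compatible-transceiver.*$/, "");

  // Strip trailing "-<vendor>" segments until none remain.
  let stripped = true;
  while (stripped) {
    stripped = false;
    for (const vendor of VENDOR_NAMES) {
      if (part.endsWith("-" + vendor)) {
        part = part.slice(0, -(vendor.length + 1));
        stripped = true;
      }
    }
  }

  // OEM part numbers are stored upper-case in the commit's example.
  return part.toUpperCase();
}
```

With a made-up slug such as `3he16564aa-nokia-r-compatible-transceiver-sample`, this sketch yields `3HE16564AA`, which matches the shape of the result quoted in the commit message. A slug without the suffix passes through unchanged apart from the upper-casing.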