Rene Fichtmueller
7d005ba1f3
data: add Cisco 8000 accuracy migration (031)
...
Documents all Cisco 8000 research findings as idempotent SQL migration:
- 8201-32FH: Silicon One Q200 (not Q100), full description, features, use_cases
- 8608: modular chassis type, corrected description
- NEW: 8812 (12-slot), 8818 (18-slot) modular chassis
- NEW: 8800-LC-36FH (36×400G QSFP-DD), 8800-LC-48H (48×100G QSFP28) line cards
flexbox_notes covers:
- IOS XR EEPROM/PID check behavior (TX laser disabled on unknown PID)
- transceiver permit pid all (per-interface, NOT global)
- DCO license required per 100G for ZR/ZR+ coherent optics (all vendors)
- No RTU license for grey non-coherent optics
- TAC refusal + PFM alarm on override
- IOS XR upgrade risk for 3rd-party optics
- DOM monitoring may show incorrect values
- Flexoptix FlexBox Cisco Systems mode = native PID, no override needed
2026-04-09 09:09:23 +02:00
Rene Fichtmueller
cf75eee8ad
feat: linecard system support, Cisco 8000 accuracy, price anomaly detection
...
API/finder:
- Add modular chassis support: sibling linecards fetched when is_linecard=true
- Add chassis linecards when system_type=modular
- Extend switch response: system_type, is_linecard, chassis_model, slot_type,
flexbox_compat_mode, flexbox_notes, description, switching_capacity_tbps,
total_ports, category, lifecycle_status, features, use_cases, linecards[]
API/transceivers:
- Filter price_observations with COALESCE(is_anomalous, false) = false
(direct prices + comparable market prices)
Scraper/db:
- Add PRICE_BOUNDS map (per form-factor min/max USD sanity bounds)
- Add isPriceAnomalous() — marks DB price_observations as is_anomalous=true
- Add competitor_verified flag: set true when valid competitor price stored
- upsertPriceObservation: skip prices outside sanity bounds, set competitor_verified
Scraper/hash:
- contentHash() now accepts Record<string,unknown> | string (union type)
to support both structured objects and legacy string callers
Scrapers (skylane, tscom, wiitek):
- Fix contentHash() call signature: pass objects not JSON.stringify strings
- Fix wiitek: remove invalid 'name' param, fix t.id → transceiverId
Migrations:
- Add is_anomalous, competitor_verified, competitor_verified_at,
image_primary columns
- Recreate sync_fully_verified trigger to include competitor_verified
- Add is_linecard, chassis_model, system_type, slot_type,
flexbox_compat_mode, flexbox_notes to switches table
2026-04-09 09:06:22 +02:00
Rene Fichtmueller
240e7f46f2
feat(scraper): add SOCKS5 proxy rotation for fs-com, atgbics, gbics scrapers
...
Routes requests through CT130/131/132 proxy pool (192.168.178.77/76/74:1080)
when PROXY_URLS env var is set. Uses ProxyConfiguration from crawlee for
PlaywrightCrawler scrapers and socks-proxy-agent for fetch-based scrapers.
2026-04-08 08:17:49 +02:00
Rene Fichtmueller
772ce2074d
feat: add blog training articles 056-100 for fo-blog-v3 fine-tuning
...
45 expert articles covering: Cisco/Juniper/Arista optic compatibility mechanics,
100G/400G/800G optics selection, DWDM/ROADM/WSS architecture, fiber standards,
coherent pluggables, AI cluster optics, carrier timing, EEPROM programming,
market pricing 2026, hyperscale procurement, transceiver failure analysis, and more.
2026-04-07 08:59:16 +02:00
Rene Fichtmueller
12a19639fd
fix: seed script accepts category as fallback for missing type field
2026-04-07 01:24:15 +02:00
Rene Fichtmueller
0572ab5a71
feat: add blog training articles 041-055 for fo-blog-v2 fine-tuning
...
15 expert articles covering: CPO/silicon photonics 2026, 800G OSFP vs QSFP-DD,
400ZR/OpenZR+/ZR+ comparison, laser safety, OSNR/link budget, counterfeit detection,
DOM deep dive, 400G DR4/FR4/LR4, WDM primer, temp grades, spine-leaf strategy,
proactive replacement, OEM lock-in, OM3/4/5, lifecycle management.
2026-04-07 01:08:27 +02:00
Rene Fichtmueller
99fca6b531
feat(training): add blog-031 through blog-040 — 10 expert articles
...
Topics: CWDM4/PSM4, MSA compliance, DAC/AOC TCO, grey vs DWDM,
ESD damage, tunable DWDM, FEC deep-dive, CPO hype cycle,
CMIS 4.0, vendor evaluation. Ø 1,180 words each.
2026-04-06 18:15:46 +02:00
Rene Fichtmueller
51af249361
Merge remote-tracking branch 'github/main'
...
# Conflicts:
# packages/api/src/llm/fo-blog-pipeline.ts
# packages/api/src/routes/blog.ts
# packages/scraper/src/scheduler.ts
# packages/scraper/src/scrapers/fs-com.ts
# packages/scraper/src/scrapers/gbics.ts
2026-04-06 18:03:36 +02:00
Rene Fichtmueller
285a91b945
feat(training): add blog-016 through blog-030 — 15 expert training articles
...
Adds 15 Sonnet-quality blog articles for fo-blog-v1 fine-tuning:
tutorials, comparisons, tech deep-dives covering 400G/800G topics.
Also adds seed-blog-training-data.py script for learning_corpus import.
2026-04-06 17:59:14 +02:00
Rene Fichtmueller
bf06993b63
ui: show creation time (HH:MM) alongside date in blog list
2026-04-06 17:07:05 +02:00
Rene Fichtmueller
3928755c60
fix: correct verified badge, comparable pricing, and clickable product images
...
- Reset details_verified=false for 298 products where reach_label is empty (DB migration)
- Runtime check in dashboard: dVer requires non-empty reach_label regardless of DB flag
- comparable price query: treat reach_meters=0 same as NULL so 800G OSFP products
find FS.com equivalent prices (was blocked by reach_meters=0 != NULL shortcircuit)
- Product image area now fully clickable with vendor link overlay when product_page_url exists
- Clear wrong image for O.Czz8HG.z.R (was showing unrelated OSFP product image)
2026-04-06 10:24:39 +02:00
Rene Fichtmueller
8f060d0159
feat(training): add blog-014 new_product and blog-015 competitor_analysis
...
Completes training data coverage for all 8 blog types:
market_alert(2), comparison(1), technology_deep_dive(4), tutorial(3),
hype_cycle(1), buying_guide(1), migration_guide(1), new_product(1),
competitor_analysis(1) — 15 gold-standard articles total
2026-04-06 04:16:00 +02:00
Rene Fichtmueller
4acf293690
fix(llm): checkHealth uses key presence check, not live API call
...
Live Anthropic API call during health check causes 429 when the pipeline
is actively running, blocking all subsequent regenerate requests.
2026-04-06 04:07:21 +02:00
Rene Fichtmueller
f7bdee9583
feat: add 2 more gold-standard blog training articles (13 total)
...
- blog-012: technology_deep_dive — coherent vs direct-detect decision framework
- blog-013: market_alert — transceiver price cycle, when to buy
Training set now covers: market_alert(2), comparison(1), technology_deep_dive(4),
tutorial(3), hype_cycle(1), buying_guide(1), migration_guide(1) — 13 total
2026-04-06 03:09:55 +02:00
Rene Fichtmueller
de05bbbec8
docs: update training data README to reflect 11 articles
2026-04-06 02:55:34 +02:00
Rene Fichtmueller
b8e6a62c7b
feat: add 4 more gold-standard blog training articles for BlogLLM
...
Adding diverse topic coverage:
- blog-008: buying_guide — OEM vs compatible real cost numbers
- blog-009: migration_guide — 100G→400G what actually breaks
- blog-010: technology_deep_dive — QSFP-DD vs OSFP form factor reality
- blog-011: tutorial — transceiver procurement checklist
All follow FO rules: no markdown headers in body, no bullet lists,
one thesis, engineer voice, ~1000 words. Total training set: 11 articles.
2026-04-06 02:55:10 +02:00
Rene Fichtmueller
72033ff5c5
fix(blog): fix claudeQueue deadlock from recursive 429 retry
...
The generateClaude() function was recursively calling itself inside
enqueueClaude(), creating a circular Promise dependency that permanently
deadlocked the claudeQueue. Any 429 rate-limit response would poison
the queue, blocking all future Claude API calls until server restart.
Fixes:
- Split retries into claudeApiCall() which is called from enqueueClaude
(not re-entering the queue on retry = no circular dependency)
- Max 3 retries with increasing backoff (10s/30s/60s)
- Add resetClaudeQueue() exported function
- Add 15-minute auto-reset stall detection to enqueueClaude
- Expose resetClaudeQueue in POST /api/blog/llm/reset-queue endpoint
- Fix merge conflict markers in index.ts (duplicate scraperRouter import)
2026-04-06 02:51:28 +02:00
Rene Fichtmueller
55de4920b2
feat(sql): migrations 026+027 for price cleanup and FS.COM EUR fix
...
026: Remove invalid price observations (sub-manufacturing-cost), disable
optictransceiver.com (domain repurposed as plant shop), fix verification
function to accept low/medium/high data_confidence values
027: Clean up FS.COM USD→EUR converted prices, force re-scrape with
new de.fs.com EUR-primary scraper
2026-04-06 02:22:00 +02:00
Rene Fichtmueller
2e852e0a2f
fix(scrapers): replace bot User-Agents with Chrome UA + disable dead domain
...
- 16 commercial scrapers: replace TIP-Bot/1.0 with Chrome/120 UA
(GBICS confirmed returning 0 bytes for bot UA, Chrome UA returns 200KB)
- gbics.ts: fix User-Agent (was returning empty HTML, now returns products)
- optictransceiver.ts: disable — domain repurposed as plant shop (2026-04-06)
Alocasia Regal Shield is not a transceiver.
2026-04-06 02:17:50 +02:00
Rene Fichtmueller
80aa85961b
feat: add 7 gold-standard blog training articles for BlogLLM
...
Reference quality articles covering: 400G DR4 pricing, vendor lock-in,
silicon photonics, fiber plant readiness, 400ZR reality check,
DOM diagnostics, 800G readiness. All follow strict FO Blog Pipeline
rules — no markdown headers, no spec dumps, one thesis per article.
2026-04-06 01:58:05 +02:00
Rene Fichtmueller
dfe86fb347
fix(scraper): switch fs-com to de.fs.com for EUR prices as primary source
...
EUR prices scraped verbatim from de.fs.com — no conversion needed.
USD derivation (EUR→USD) happens downstream, not EUR←USD.
Fixes price discrepancy: TIP showed USD 999×0.92=EUR 866 vs real €948 on de.fs.com.
2026-04-06 01:24:47 +02:00
Rene Fichtmueller
e4e432d9fa
fix: parsePrice requires currency symbol + uses largest number to avoid misreads
...
Root cause of fake prices (e.g. 1.30 for 800G OSFP):
- parsePrice accepted any bare number without currency symbol
- Could misread stock counts, page numbers, or CSS values as prices
- Also picked the first number, not the main price
Fix:
- Require explicit currency symbol or decimal format (1234.56)
- Use the LARGEST number found in the price string
- Returns price=0 (rejected) when no valid price pattern found
2026-04-06 01:19:25 +02:00
Rene Fichtmueller
a35817c96d
fix: preserve user-provided title in blog generation + price floor validation
...
- blog/generate now uses caller title when provided; falls back to template
- Migration 027: hard price floor by speed class in verification function
(no medians, no estimates — only real prices above minimum thresholds)
- Deleted 474 obviously wrong price observations (shipping costs scraped as prices)
2026-04-06 01:14:37 +02:00
Rene Fichtmueller
df8d1e797c
fix: show price_verified_eur as fallback price + strict badge logic
...
- Price column now shows price_verified_eur (in EUR, dimmed) when street_price_usd is null
Fixes: FS.COM products showing dash while being marked fully verified
- Badge logic now requires visible price AND image_verified AND details_verified
No more badge when price displays as dash — all requirements must be visually present
2026-04-06 01:04:44 +02:00
Rene Fichtmueller
b6928265bf
fix: serialize Claude API calls via queue to prevent 429 rate-limit spam
...
Tier-1 Anthropic API has 40K TPM — with ~20K tokens per pipeline step,
concurrent calls immediately hit the limit. enqueueClaude() serializes
all generateClaude() calls so only one runs at a time, eliminating
the flood of 429-retry-429-retry loops.
2026-04-06 00:57:03 +02:00
Rene Fichtmueller
cf04549b1b
feat: add Anthropic Claude provider to blog LLM client
...
- Auto-routes to Claude API when BLOG_LLM_PROVIDER=anthropic + ANTHROPIC_API_KEY set
- Fallback to Ollama queue when key not present
- Add rate-limit retry (429 → 10s backoff) for Claude API
- Add STEP_TECHNICAL_SANITY, STEP_SELF_HEAL, STEP_TITLE_CONTRACT_CHECK prompts
- Fix STEP_LINKEDIN_POST angle-specific hooks, remove Gold Reference repetition
2026-04-06 00:21:48 +02:00
Rene Fichtmueller
c43e1f881a
fix(blog): hard story blacklist in STEP4 + LinkedIn — ban 2AM/dirty connector/lab-vs-prod stories
...
10 specific story patterns banned directly in draft prompt.
LinkedIn banned hooks: 'Everything looks fine', 'CRC creeping', 'Same optics same setup'.
Title Contract now injected into STEP4 as binding constraint.
2026-04-05 23:55:56 +02:00
Rene Fichtmueller
80435f8e07
feat(blog): Post to Ghost + LinkedIn buttons in dashboard
...
- 'Post on blog.fichtmueller.org' → publishes via Ghost Admin API
- 'Post on LinkedIn' → modal with text + copy + open LinkedIn
- Ghost integration: TIP Blog Engine (JWT auth, mobiledoc format)
2026-04-05 23:33:58 +02:00
Rene Fichtmueller
3913372f10
feat(blog): Title Contract + Technical Sanity Check + Self-Heal + angle-aware LinkedIn generator
...
Pipeline now has 21 steps:
- STEP0: Title Contract binds LLM to headline promise
- STEP19: Technical Sanity Check (optical engineering accuracy)
- STEP20: Self-Heal (auto-fix technical errors preserving tone)
- STEP21: Title Contract Verification (final gate check)
- LinkedIn generator is now angle-aware (no more default Physical Layer hook)
2026-04-05 23:11:16 +02:00
Rene Fichtmueller
8e0eda6c41
fix(blog): anti-repetition engine — 6 angle types, forbidden structures, existing article context injection
2026-04-05 22:47:15 +02:00
Rene Fichtmueller
438225cf7c
fix(blog): raise word target to 1200-1600, fix power-budget false positive in validateArticle
2026-04-05 20:49:22 +02:00
Rene Fichtmueller
1434037d29
fix(scraper): use false instead of null for image_verified on insert
2026-04-05 12:15:10 +02:00
Rene Fichtmueller
308cf8e774
feat(dashboard): add data verification status section to overview tab
2026-04-05 12:11:23 +02:00
Rene Fichtmueller
7c7d9f7b51
fix: make /api/hot-topics public — dashboard fetch has no auth token
2026-04-05 12:07:44 +02:00
Rene Fichtmueller
e6d042f827
fix: resolve merge conflict in index.ts + add untracked blog-sll, news, sql migration
2026-04-05 11:51:07 +02:00
root
5d8768b43b
fix: mount blogSllRouter + scraperRouter — SLL and Crawler Intelligence routes were missing
2026-04-05 09:50:30 +00:00
Rene Fichtmueller
8cc19011f4
feat(scraper): all pricing scrapers to 2h 24/7 — full competitor coverage, no gaps
2026-04-05 01:32:08 +02:00
Rene Fichtmueller
44244a22a1
feat: 4th verification criterion (Competitor) + scraper frequency FS/10Gtek/ProLabs to 2h
2026-04-05 01:28:46 +02:00
Rene Fichtmueller
e4dfd2a2db
feat(blog): AEM/APM pipeline steps + SLL context builder + LinkedIn v2 prompts
2026-04-05 01:26:09 +02:00
Rene Fichtmueller
95a8aa8552
fix: include linkedin_post in GET /api/blog response for SLL matching
2026-04-05 01:24:52 +02:00
Rene Fichtmueller
931588fffd
fix(verification): 100% Verified Badge war dramatisch zu großzügig
...
KERNPROBLEME BEHOBEN:
1. ATGBICS part_number = URL slug statt echte OEM-Nummer
extractOemPartNumber() entfernt -r-compatible-transceiver-* Suffix
+ trailing Vendor-Namen (nokia, cisco, juniper, ...)
Ergebnis: 3he16564aa-nokia-r-compatible-transceiver-... → 3HE16564AA
2. reach_label = '' (leer) wurde als details_verified akzeptiert
IS NOT NULL erlaubt leere Strings → Fix: AND reach_label != ''
3. details_verified = true trotz garbled part_number
Neue Kriterien: NOT ILIKE '%-compatible-transceiver%'
NOT ILIKE '%-r-compatible%'
4. data_confidence Werte falsch in Funktion ('scraped_unverified' etc)
Echte Werte: low/medium/high/garbage → NOT IN ('garbage','unknown')
ERGEBNIS nach recompute_all_verification():
fully_verified: 3.654 → 581 (Badge war 6x übertrieben)
details_verified: inflated → 1.075 (korrekt)
ATGBICS Scraper:
- extractOemPartNumber() für collection und product detail pages
- detectReach() jetzt auch auf URL-slug (120km im slug → reach_label)
Price Anomaly Detection:
- API: price_anomaly field wenn max/min ratio ≥ 10x
- Dashboard: ⚠ Preisanomalie Banner mit Ratio + EUR Range
SQL 025: Part number cleanup (30 records), reach from slug (12 records)
2026-04-04 15:41:57 +02:00
Rene Fichtmueller
1e789f67eb
fix(scrapers): Flexoptix Catalog zeigt 0 records statt 963
...
SCRAPERS list used 'flexoptix-catalog' as DB lookup key but vendors.slug
is 'flexoptix' — no match → 0 records shown.
Fix: added dbSlug override field to SCRAPERS entries; lookup now uses
dbSlug || name so flexoptix-catalog/vendors/supported all map to
the correct 'flexoptix' slug in sourceMap.
2026-04-04 15:26:04 +02:00
Rene Fichtmueller
fea0b0fb66
feat: blog engine v5 — Auto-Kill Layer, 16-step pipeline, longer content
...
Upgrades FO Blog Pipeline from 14 to 16 steps:
- NEW Step 8d: Auto-Kill Layer v1.0 (10 systematic categories A-J)
- NEW Step 15: Auto-Kill Scoring (cleanliness, narrative, non-AI, relevance)
- Updated banned phrases from Gold-standard editorial feedback
- Soft Delete List for conditional phrases
- Auto-Kill categories: spec blocks, formulas, section leakage,
generic transitions, repeated concepts, SKU mentions, false authority,
over-explained basics, whitepaper tone, fake precision
Content length changes per user feedback:
- Blog target: 1,200-2,000 words (was 700-1,000) — thorough and detailed
- LinkedIn target: 2,000-2,800 chars (was 350-600) — use maximum length
- Reduction pass: 25-30% cut (was 15-25%) — remove weak, keep depth
2026-04-04 11:02:45 +02:00
Rene Fichtmueller
ede4f5b966
feat: blog engine v3 — 8-stage pipeline with Auto-Kill Layer
...
Complete rewrite of blog prompts and pipeline based on editorial
Gold-standard feedback. Replaces 3-pass system with 8-stage pipeline:
1. Master generation (narrative voice, no spec dumps)
2. Narrative Control (kill visible structure, enforce flow)
3. Auto-Kill Layer (remove AI phrases, spec residue, repetition)
4. Reduction Engine (cut 40% — keep strongest ideas only)
5. Depth pass (add specifics where vague, no spec dumps)
6. Quality Control (hard delete list validation)
7. Procurement layer (optional, sales audience)
8. LinkedIn post generation (new)
Key changes:
- System prompt rewritten with Hard Delete List (29 banned phrases)
- Soft Delete List for conditional phrases
- Auto-Kill categories A-J (spec blocks, formulas, whitepaper tone, etc.)
- Master prompts enforce continuous narrative, no section headings
- Word count targets reduced (800-1200 instead of 1500+)
- Scoring pass added (cleanliness, narrative, non-AI feel, relevance)
- LinkedIn companion post auto-generated
- Context data injection reduced (fewer items, no dump instructions)
2026-04-04 10:52:31 +02:00
Rene Fichtmueller
c509251109
feat(blog): Spec dump hard fail + Gold Standards 6 + LinkedIn v2
...
- System prompt: SPEC DUMP ABSOLUTE HARD FAIL block (before FORMAT rules)
TX/RX tables, multi-optic comparison blocks, repeated sections = hard fail
Behavioral prose rule: "what actually happens" not "what the spec says"
- STEP9 QA: check 12a SPEC DUMP — removes datasheet blocks, flags
duplicate sections (e.g. "fiber types" twice), spec-heavy intros
- Gold Standard 6: 400G/800G deep dive corrected (8.8→10/10)
zero spec tables, pure behavioral narrative, 3 core ideas max,
ending is reframe not checklist
- LinkedIn Gold Example 2: sharper short format (346 chars vs 700)
reframe hook, short beats without bullet markers, no emoji, 4 hashtags
- STEP_LINKEDIN_POST: rewritten with new gold format
optimal 350-600 chars, beat rhythm, no bullet markers, gold example inline
- WRONG PATTERNS: +7 new entries (spec dump, duplicate section,
LinkedIn bullet list, LinkedIn "excited to share" hook, LinkedIn >800 chars)
2026-04-04 09:32:01 +02:00
Rene Fichtmueller
4f631fc61e
feat(blog): Reduction Engine v1.0 + LaTeX/connector hard fails
...
- Replace STEP8b_REDUCTION with 5-pass Reduction Engine:
Pass 1: Repetition Kill (one concept, one home)
Pass 2: Tech Prune (LaTeX hard delete, SKU removal, formula prose replacement)
Pass 3: Flow Rebuild (close gaps after cuts, no new content)
Pass 4: Weight Correction (title/content alignment throughout)
Pass 5: Humanization (rhythm variation, hedge removal, punch ending)
Target: 700-1000 words (600-1300 range, warnings outside)
- System prompt + STEP9 QA: add hard fails for
LaTeX formulas (\[...\], \frac{}, \text{} etc) — destroys blog flow
DR4 connector error (DR4=MPO-12, not LC duplex; FR4=LC duplex)
Title/content mismatch (title topic must be the spine, not just the intro)
- Gold Standard 5: market alert / pricing article template
(correct title alignment, no LaTeX, DR4=MPO-12, ending on topic)
- WRONG PATTERNS extended with 4 new entries covering above failures
- blog.ts: step log messages updated to 11-14/14; word count
output shows % reduction and range warning (>1300 or <600)
2026-04-04 08:57:21 +02:00
Rene Fichtmueller
d6adb5600f
ui: blog detail — separate blog article + linkedin post sections with copy buttons and char count badge
2026-04-04 08:35:33 +02:00
Rene Fichtmueller
3431ccbebc
chore: changelog — blog engine v5 + linkedin post 2026-04-04
2026-04-04 08:30:54 +02:00
Rene Fichtmueller
1e19365e96
feat: blog engine v5 — narrative control + linkedin post + min words fix
...
- STEP4b_NARRATIVE_CONTROL: new pipeline step after draft; detects wrong
narrative (technology blamed instead of processes), applies anti-FUD filter,
reality reframe ("this becomes a problem when..."), Flexoptix voice check
- System prompt: NARRATIVE CONTROL RULE added as absolute rule #1
- Gold Standard 4: corrected "compatible vs OEM" article added as reference
- Minimum words: STEP4 raised from 1500 to 2500 words (final output was 750)
- Reduction pass: 25-35% → 15-25%, target 1500-2000 words final
- STEP_LINKEDIN_POST: generates LinkedIn post ≤2800 chars (hard limit 3000);
stores in blog_drafts.linkedin_post + linkedin_char_count column
- Pipeline now 14 steps: v5-narrative-control
- Migration 024: linkedin_post + linkedin_char_count columns in blog_drafts
2026-04-04 08:30:27 +02:00
Rene Fichtmueller
be9209ffbd
chore: changelog — proxy network geo/uptime fixes 2026-04-04
2026-04-04 08:15:58 +02:00