Rene Fichtmueller 9f0ba2069c feat: add 7 gold-standard blog training articles for BlogLLM
Reference quality articles covering: 400G DR4 pricing, vendor lock-in,
silicon photonics, fiber plant readiness, 400ZR reality check,
DOM diagnostics, 800G readiness. All follow strict FO Blog Pipeline
rules — no markdown headers, no spec dumps, one thesis per article.
2026-04-06 01:58:05 +02:00

# BlogLLM Training Data — Flexoptix Reference Articles
Gold-standard blog posts generated by Claude Sonnet (claude-sonnet-4-20250514) under the strict FO Blog Pipeline rules. They serve as reference examples for fine-tuning the BlogLLM.
## Articles
| File | Title | Type | Score |
|------|-------|------|-------|
| blog-001-400g-dr4-price-war.md | 400G DR4 Prices Are Moving... | market_alert | 9/10 |
| blog-002-vendor-lock-in-optics.md | The Hidden Tax in Your Transceiver Budget | comparison | 9/10 |
| blog-003-silicon-photonics.md | Silicon Photonics Is Shipping... | technology_deep_dive | 9/10 |
| blog-004-400g-migration-fiber-plant.md | Your 100G Fiber Plant Is Not Ready for 400G | tutorial | 9/10 |
| blog-005-coherent-400zr-reality.md | 400ZR Is Not What the Vendor Presentations Said | technology_deep_dive | 9/10 |
| blog-006-dom-diagnostics.md | Reading DOM Data Correctly | tutorial | 9/10 |
| blog-007-800g-readiness.md | 800G Is Shipping. Your Infrastructure Probably Isn't Ready. | hype_cycle | 9/10 |
## Quality Rules Met (per article)
All articles were generated under strict constraints:
- No markdown headers (##, ###) anywhere in body
- No bullet lists as structural elements
- No LaTeX formulas
- No banned AI phrases ("leverage", "optimize", "game-changer", etc.)
- No spec dumps or comparison tables
- No OEM pricing presented as compatible pricing
- No sales language ("BUY / AVOID", verdict blocks)
- DR4 connector: MPO-12 (never LC)
- DR4 wavelength: 1310 nm (never 1550 nm)
- 400ZR and DR4 treated as distinct technologies
- No per-port power figures above 25 W
- No made-up part numbers
- Only CMOS/physics-grounded values
- One core thesis per article
- Flexoptix FINAL OUTCOME TEST: the reader finishes the article ready to validate properly instead of defaulting to OEM
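
Several of the rules above are mechanical enough to check automatically. The following is a minimal sketch, not the checks implemented in `fo-blog-pipeline.ts`: the function name `checkArticle`, the `RuleViolation` shape, and the regular expressions are all hypothetical, and the banned-phrase list contains only the three phrases named above. The DR4 checks are deliberately crude and only look for conflicting terms on the same line.

```typescript
// Hypothetical lint sketch; names and patterns are illustrative assumptions.
interface RuleViolation {
  rule: string;
  line: number; // 1-based line number within the article body
  text: string;
}

// Only the phrases this README names; the full banned list is not given here.
const BANNED_PHRASES: string[] = ["leverage", "optimize", "game-changer"];

function checkArticle(body: string): RuleViolation[] {
  const violations: RuleViolation[] = [];
  body.split("\n").forEach((text, index) => {
    const line = index + 1;
    const lower = text.toLowerCase();

    // Rule: no markdown headers anywhere in the body (any ATX level).
    if (/^#{1,6}\s/.test(text)) {
      violations.push({ rule: "no-markdown-headers", line, text });
    }

    // Rule: no banned AI phrases (substring match, so "leveraged" also hits).
    for (const phrase of BANNED_PHRASES) {
      if (lower.indexOf(phrase) !== -1) {
        violations.push({ rule: "banned-phrase:" + phrase, line, text });
      }
    }

    // Rules: DR4 is 1310 nm over MPO-12, never 1550 nm and never LC.
    if (/\bdr4\b/.test(lower)) {
      if (/\b1550\s?nm\b/.test(lower)) {
        violations.push({ rule: "dr4-wavelength", line, text });
      }
      if (/\blc\b/.test(lower)) {
        violations.push({ rule: "dr4-connector", line, text });
      }
    }
  });
  return violations;
}
```

An empty result only means these particular patterns did not fire. Rules such as "one core thesis per article" or "no made-up part numbers" still need a human or model review.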
## Usage for BlogLLM Training
1. Import these as positive examples into the fine-tuning dataset
2. Each article is roughly 800 to 1,200 words (production blog length)
3. Type field maps to generation template types in `fo-blog-pipeline.ts`
4. These articles define the output quality gate: score generated articles by comparing them against these references
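
As a sketch of the import step, the snippet below turns reference articles into JSONL records labeled as positive examples. The record shape (`TrainingRecord`), the function names, and the choice of JSONL are assumptions for illustration; the README does not specify the dataset format. The `ArticleType` values are the ones listed in the Type column of the table above, and the length check uses the 800 to 1,200 word range stated in the list.

```typescript
// Hypothetical record shape; the real fine-tuning dataset format is not specified here.
type ArticleType =
  | "market_alert"
  | "comparison"
  | "technology_deep_dive"
  | "tutorial"
  | "hype_cycle";

interface ReferenceArticle {
  file: string;
  type: ArticleType;
  body: string;
}

interface TrainingRecord {
  source: string;
  template: ArticleType;
  label: "positive";
  wordCount: number;
  inLengthRange: boolean; // true when the body is within 800 to 1200 words
  text: string;
}

// Count whitespace-separated words; an empty body counts as zero.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function toTrainingRecord(article: ReferenceArticle): TrainingRecord {
  const wordCount = countWords(article.body);
  return {
    source: article.file,
    template: article.type,
    label: "positive",
    wordCount,
    inLengthRange: wordCount >= 800 && wordCount <= 1200,
    text: article.body,
  };
}

// One JSON object per line, the usual JSONL convention.
function toJsonl(articles: ReferenceArticle[]): string {
  return articles.map((a) => JSON.stringify(toTrainingRecord(a))).join("\n");
}
```

Keeping `inLengthRange` as a flag instead of rejecting out-of-range articles leaves the decision to whoever assembles the dataset.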
## Adding More Training Data
Generate new articles via the API: send `POST /api/blog/generate` with `use_llm: "fo_pipeline"` and the Claude provider selected, then export the results from the database as additional training examples.
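
A minimal sketch of building that request is shown below. Only the path `/api/blog/generate` and the field `use_llm: "fo_pipeline"` come from this README. The base URL, the JSON content type, the helper name `buildGenerateRequest`, and the `extraFields` parameter are assumptions; in particular, the field that selects the Claude provider is not named here, so it would have to be passed through `extraFields` once known.

```typescript
// Hypothetical helper: describes the request without sending it.
interface GenerateRequest {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

function buildGenerateRequest(
  baseUrl: string,
  extraFields: { [key: string]: unknown } = {}
): GenerateRequest {
  // use_llm is applied last so callers cannot override it by accident.
  const payload = { ...extraFields, use_llm: "fo_pipeline" };
  return {
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: baseUrl.replace(/\/+$/, "") + "/api/blog/generate",
    method: "POST",
    // Assumption: the endpoint accepts a JSON body.
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}
```

The returned object can be handed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Separating request construction from sending makes the payload easy to inspect before any article is generated.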