Rene Fichtmueller
d69fff36c6
fix(v0.2.2): OLLAMA_URL pointed to localhost instead of .213 via WireGuard
...
Blog engine was falling back to template because qwen2.5:14b is on Mac Studio (.213),
not on Erik (localhost). Fixed ecosystem.config.js to use 192.168.178.213:11434.
This was the root cause why the 10-step pipeline never executed.
2026-03-31 09:28:34 +02:00
Rene Fichtmueller
aa5d03bcf8
fix(blog): harden pipeline prompts based on v0.2.1 blog review feedback
...
System prompt: 10 HARD RULES (non-negotiable, article fails QA without them)
- Mandatory WHAT BREAKS IN PRODUCTION section (2+ specific failures with symptoms/cause/fix)
- Mandatory HIDDEN COSTS section (cleaning, troubleshooting time, cabling redesign, training)
- Mandatory WHEN NOT TO USE section for every recommendation
- Absolute statement rule: NEVER without conditions/context
- Cabling reality: MPO polarity, SR4→DR4 migration, cleaning requirements
- Brutal hook requirement: not 'If you're still...' but 'You're about to sign a PO. Stop.'
- Minimum 2500 words (was 2000)
Step 5 (Reality Injection): Now checks for ALL mandatory sections and adds if missing
Step 9 (QA Check): Hard fail checks — article is NOT publishable without production failures + hidden costs
Feedback source: Human expert review scoring 7.5/10, targeting 9.5+
2026-03-31 09:24:08 +02:00
Rene Fichtmueller
54cfca6774
chore: bump version to v0.2.1
2026-03-31 09:19:38 +02:00
Rene Fichtmueller
98a7e12282
feat: wire 10-step FO Blog Pipeline into blog generation route
...
Replaces old 2-pass pipeline with full Flexoptix Style 10-step generation:
1. Topic Expansion (real scenarios + wrong assumptions)
2. Angle Selection (single strong angle + audience)
3. Outline Generation (decision-driven, no generic sections)
4. Master Draft (Flexoptix voice, 2000+ words)
5. Reality Injection (failure scenarios, operational pain)
6. Technical Deepening (specific optics, power, density)
7. Opinion Layer (clear positions, no neutrality)
8. Kill AI Tone (remove all AI fingerprints)
9. QA Check (technical accuracy verification)
10. Quality Score (1-10 auto-rating, saved as self-feedback)
Feedback loop active:
- Accumulated feedback injected into system prompt
- Auto QA scores saved to blog_feedback table
- Training data export via GET /api/blog/feedback/training-data
2026-03-31 09:16:23 +02:00
Rene Fichtmueller
73ef5766e6
feat(v0.2.1): data confidence tracking + validation + blog feedback system
...
- Migration 016: data_confidence column (vendor_verified/enriched_estimated/scraped_unverified)
- Migration 015: blog_feedback table with 8 quality scores + free text
- Validation script: 8 physics-based rules (wavelength↔fiber, reach plausibility, power limits)
- Blog feedback API: POST /api/blog/:id/feedback + training data export
- FO Blog Pipeline v3: 10-step Flexoptix Style prompts (Less bullshit. More engineering.)
- Auto-fix: wavelength↔fiber mismatches corrected automatically
2026-03-31 09:12:37 +02:00
Rene Fichtmueller
ad7dc6fcaa
chore: bump version to 0.2.0 in health endpoint
2026-03-31 08:59:00 +02:00
Rene Fichtmueller
a53c00b8c4
fix: UUID cast in datasheet routes — use slug-first lookup
2026-03-31 08:58:26 +02:00
Rene Fichtmueller
3a6cbc475d
feat(v0.2.0): datasheets + adoption roadmap + all routes registered
...
- GET /api/datasheets/transceiver/:id — Full datasheet with power budget, pricing, compatibility, HTML export
- GET /api/datasheets/switch/:id — Switch datasheet with compatible transceivers
- GET /api/adoption — Full technology roadmap with maturity indicators
- GET /api/adoption/:technology — Detailed adoption analysis, migration paths, risks, timelines
- All v0.2.0 routes registered in index.ts
2026-03-31 08:57:03 +02:00
Rene Fichtmueller
aa977abc97
feat(v0.2.0): Sales Intelligence Engine — Phase 0+A
...
New API routes:
- GET /api/finder — Switch→Flexoptix transceiver finder with FlexBox coding
- GET /api/competitor-alerts — Competitor intelligence (price changes, new products, stock)
- GET /api/forecast/:technology — Sales forecast 3/9/12/18 months + buy/wait/hold signal
- POST /api/transport/plan — Transport system planner (city→city BOM with fiber providers)
New MCP tools:
- find_flexoptix_for_switch — Customer switch → Flexoptix products
- get_competitor_alerts — Competitor monitoring
- plan_transport — Network transport planning
- forecast_sales — Volume/revenue prediction
- generate_blog — Enhanced blog generation
New DB tables (migration 013):
- competitor_alerts, price_changes, flexoptix_product_map
- sales_forecasts, fiber_providers, fiber_routes, cities
- generated_datasheets, blog_series
- Views: v_price_coverage, v_image_coverage, v_switch_flexoptix_finder
Seed data (migration 014):
- 25 European cities with IX/DC locations + coordinates
- 15 fiber providers (euNetworks, Telia, DTAG, Colt, Zayo, etc.)
- 16 fiber routes with pricing (Germany focus)
Infrastructure:
- Scraper scheduler: 2h Flexoptix, 4h FS.com/Optcore (was 6-8h)
- Change detector for competitor price/stock monitoring
- Image downloader utility with coverage tracking
2026-03-31 08:51:22 +02:00
Rene Fichtmueller
5a0cbed5a2
feat: dashboard v2, blog expansion, market/cable MCP tools, switch asset scrapers, scraper utilities
2026-03-30 08:07:12 +02:00
Rene Fichtmueller
f940bf2cd4
fix: remove non-existent vendor URL columns, fix text=uuid cast in transceiver lookup
2026-03-30 07:49:54 +02:00
Rene Fichtmueller
891bd018a8
fix: add trust proxy for Cloudflare — fixes ERR_ERL_UNEXPECTED_X_FORWARDED_FOR in rate limiter
2026-03-30 06:41:36 +02:00
Rene Fichtmueller
4b452ab49e
feat(scrapers+mcp): ATGBICS + ProLabs scrapers, MCP HTTP/SSE server
...
Scrapers:
- atgbics.ts: PlaywrightCrawler for UK vendor ATGBICS (Shopify store),
scrapes SFP/SFP+/SFP28/QSFP+/QSFP28/QSFP-DD in GBP, max 50 pages/run
- prolabs.ts: HttpCrawler for ProLabs (Legrand subsidiary), USD pricing,
category-driven crawl with reach/fiber/speed detection
- Both registered in scheduler (every 8h, staggered) and index.ts CLI
MCP HTTP Server:
- packages/mcp-server/src/http-server.ts: Express + SSEServerTransport
- Exposes all 12 TIP tools via GET /sse + POST /message
- Bearer token auth (MCP_SECRET env), CORS-configurable
- GET /health → { status: "ok", tools: 12 }
- Port: MCP_HTTP_PORT (default 3201)
SQL + tools:
- sql/006-009: seed scripts for whitebox switches, vendors, assets
- switch-docs.ts: MCP tool for switch documentation queries
2026-03-29 02:26:45 +08:00
Rene Fichtmueller
66b722a5e4
feat: calibrate regional adoption model with research-backed parameters
...
Update REGIONAL_LAGS with data from LightCounting, vendor earnings,
OFC market sessions, and Chinese IPO prospectuses. Add price index
per region and segment mix (hyperscaler/telco/enterprise) for
more accurate regional revenue modeling.
2026-03-28 02:34:29 +13:00
Rene Fichtmueller
c6308e93c0
feat: massive scraper expansion + hype cycle engine + lifecycle prediction
...
New scrapers:
- GBICS.com (BigCommerce, GBP prices, 10 categories, 78 products)
- Juniper HCT (Next.js SSR parser, 475 transceivers with specs/EOL)
- SFPcables.com (Magento store, 16 categories, 78 products)
- Fluxlight (BigCommerce, 6 pages, 118 products)
- Champion ONE (compatible vendor scraper)
Scraper fixes:
- 10Gtek: rewritten to parse HTML spec tables (152 products)
- Flexoptix: fix price extraction from Magento Hyva HTML
- Register all scrapers in CLI (--gbics, --juniper, --sfpcables, etc.)
Hype Cycle Engine enhancements:
- Data-driven enrichment from scraped vendor/price data
- Revenue lifecycle prediction (peak year, decline, revenue index)
- Regional adoption model (NA, China, APAC, Europe, RoW with lag coefficients)
- New API endpoints: /enriched, /lifecycle, /regional/:tech
DB growth: 89 → 1,168 transceivers, 0 → 416 prices, 6 vendors
Qdrant: 1,162 products embedded with nomic-embed-text
Research: Norton-Bass model, standards-to-market timelines, hype signals
2026-03-28 02:30:19 +13:00
Rene Fichtmueller
46af736db3
fix: hype cycle findTechnology matched wrong tech (1G instead of 1.6T)
...
findTechnology used loose includes() matching — '1.6T OSFP-XD' matched
'1G SFP' first because query contained '1'. Now matches exact name first,
then by speed prefix with proper unit parsing (G/T).
2026-03-28 01:00:52 +13:00
Rene Fichtmueller
1b0b602aa4
feat: Phase 8 — Dashboard frontend + static serving
...
Single-file dashboard with 6 tabs: Overview, Semantic Search,
Hype Cycle, Transceivers, News, Blog Engine. Dark theme, no
build step, served as static HTML from Express.
- Overview: health stats, vector collection counts, recent news
- Semantic Search: query across all 6 Qdrant collections
- Hype Cycle: Norton-Bass table with phase colors + position bars
- Transceivers: searchable table with form factor/speed/reach
- News: semantic news search with source links
- Blog: generate drafts from templates, view draft history
Live at: https://transceiver-db.context-x.org/dashboard/
2026-03-28 00:37:10 +13:00
Rene Fichtmueller
f48a809e40
feat: Phase 7 — Blog generator + scraper scheduler activation
...
Blog draft engine generates structured markdown from all Qdrant
collections (products, news, FAQ, troubleshooting). Supports 4
topic types: hype_cycle, comparison, new_product, tutorial.
- routes/blog.ts: POST /api/blog/generate, GET/PUT endpoints
- ecosystem.config.js: Added tip-scraper PM2 process
- Scraper scheduler (pg-boss) now running on Erik with 8 job queues
- News scraper running every 6 hours on Erik
2026-03-28 00:32:08 +13:00
Rene Fichtmueller
0a63307505
feat: Phase 6 — FAQ + troubleshooting knowledge base embeddings
...
19 curated FAQ entries covering form factors, fiber types, reach,
compatibility, WDM, power, and emerging tech (CPO, LPO, 400ZR).
10 troubleshooting guides with symptom/cause/solution format.
All 6 Qdrant collections now populated:
- product_embeddings: 89 transceivers
- datasheet_chunks: 40 chunks (OCR pipeline)
- faq_embeddings: 19 FAQ entries
- troubleshooting_embeddings: 10 guides
- news_embeddings: 33 articles
- manual_chunks: 0 (pending manual ingestion)
2026-03-28 00:24:50 +13:00
Rene Fichtmueller
122ca8444d
feat: Phase 5 — OCR pipeline + document/news search
...
Docling-powered OCR pipeline: PDF → markdown → chunks → Ollama embed → Qdrant.
News embedding seeder for news_embeddings collection.
Document and news semantic search API endpoints.
- embeddings/ocr-pipeline.ts: Docling convert → chunk → embed pipeline
- embeddings/seed-news.ts: Batch embed news_articles into Qdrant
- routes/documents.ts: POST /api/documents/process, GET /api/documents
- routes/search.ts: GET /search/documents, GET /search/news endpoints
- sql/005-documents.sql: Add chunks_count, processed_at to documents table
- Ollama + nomic-embed-text installed on Erik (CPU mode)
- 89 products + 40 datasheet chunks + 33 news articles in Qdrant
2026-03-28 00:22:01 +13:00
Rene Fichtmueller
0260d0b365
feat: Phase 4 — Vector embeddings + semantic search
...
Ollama nomic-embed-text (768 dim) → Qdrant vector search pipeline.
Embeds all 89 transceivers with rich text representation and payload
filters (form_factor, speed_gbps, fiber_type, wdm_type).
- embeddings/client.ts: Ollama embed + Qdrant upsert/search
- embeddings/seed-products.ts: Batch seeder for product_embeddings
- routes/search.ts: GET /api/search, /search/products, /search/stats
- 6 Qdrant collections: products, datasheets, FAQs, manuals, troubleshooting, news
2026-03-28 00:05:29 +13:00
Rene Fichtmueller
a6f7968393
feat: Phase 3 — Norton-Bass Hype Cycle Engine
...
Implements the full Norton-Bass Multigenerational Diffusion Model for
transceiver technology lifecycle forecasting.
Math: Bass diffusion F(t) + logistic adoption S(t) = L / (1 + e^(-k(t-t0)))
Parameters: p (innovation ~0.03), q (imitation ~0.3-0.5), m (market potential)
Phase Classification Engine (composite score):
30% Port shipment share + 20% ASP decline rate + 15% Standards maturity
+ 15% Interop validation + 10% Vendor trajectory + 10% Media sentiment
11 technologies tracked: 1G → 10G → 25G → 40G → 100G → 400G → 800G → 1.6T
+ CPO, LPO, 400ZR Coherent
5-year adoption forecast per technology
API: GET /api/hype-cycle (all) + GET /api/hype-cycle/:tech (detail)
Live: https://transceiver-db.context-x.org/api/hype-cycle
2026-03-27 23:35:57 +13:00
Rene Fichtmueller
b43bdd3060
feat: TIP Phase 0+1 — monorepo, DB schema, API, scraper engine
...
Phase 0 - Foundation:
- Restructure into npm workspace monorepo (packages/core, api, scraper)
- PostgreSQL 17 + TimescaleDB schema (15 tables incl. hypertables)
- Docker Compose for local dev (PostgreSQL on 5433 + Qdrant)
- Express 5 API on port 3200 with 6 routes
- Seed script to migrate 159 transceivers + 42 standards from npm package
- Erik server setup script + PM2 ecosystem config
Phase 1 - Scraper Engine:
- Crawlee + Playwright framework with pg-boss scheduler
- FS.com scraper (PlaywrightCrawler, anti-bot workaround)
- Optcore.net scraper (WP REST API enumeration + PlaywrightCrawler)
- Uses /wp-json/wp/v2/product to get 2000+ product URLs
- Playwright renders individual product pages for price extraction
- Cisco TMG Matrix scraper (compatibility data)
- News RSS aggregator (optics.org, SPIE, Network World, Nature Photonics)
- Keyword relevance scoring for transceiver/fiber topics
- xml2js with malformed XML sanitization
- SHA-256 content hashing for change detection (skip unchanged records)
- pg-boss v10 with explicit queue creation before scheduling
2026-03-27 16:27:31 +13:00