311 Commits

Author SHA1 Message Date
Rene Fichtmueller
8bb3b586f3 feat: Phase 5 — OCR pipeline + document/news search
Docling-powered OCR pipeline: PDF → markdown → chunks → Ollama embed → Qdrant.
News embedding seeder for news_embeddings collection.
Document and news semantic search API endpoints.

- embeddings/ocr-pipeline.ts: Docling convert → chunk → embed pipeline
- embeddings/seed-news.ts: Batch embed news_articles into Qdrant
- routes/documents.ts: POST /api/documents/process, GET /api/documents
- routes/search.ts: GET /search/documents, GET /search/news endpoints
- sql/005-documents.sql: Add chunks_count, processed_at to documents table
- Ollama + nomic-embed-text installed on Erik (CPU mode)
- 89 products + 40 datasheet chunks + 33 news articles in Qdrant
2026-03-28 00:22:01 +13:00
Rene Fichtmueller
6d3e5cc04a feat: Phase 4 — Vector embeddings + semantic search
Ollama nomic-embed-text (768 dim) → Qdrant vector search pipeline.
Embeds all 89 transceivers with rich text representation and payload
filters (form_factor, speed_gbps, fiber_type, wdm_type).

- embeddings/client.ts: Ollama embed + Qdrant upsert/search
- embeddings/seed-products.ts: Batch seeder for product_embeddings
- routes/search.ts: GET /api/search, /search/products, /search/stats
- 6 Qdrant collections: products, datasheets, FAQs, manuals, troubleshooting, news
2026-03-28 00:05:29 +13:00
Rene Fichtmueller
eb875f37d2 feat: Phase 3 — Norton-Bass Hype Cycle Engine
Implements the full Norton-Bass Multigenerational Diffusion Model for
transceiver technology lifecycle forecasting.

Math: Bass diffusion F(t) + logistic adoption S(t) = L / (1 + e^(-k(t-t0)))
Parameters: p (innovation ~0.03), q (imitation ~0.3-0.5), m (market potential)

Phase Classification Engine (composite score):
  30% Port shipment share + 20% ASP decline rate + 15% Standards maturity
  + 15% Interop validation + 10% Vendor trajectory + 10% Media sentiment

11 technologies tracked: 1G → 10G → 25G → 40G → 100G → 400G → 800G → 1.6T
  + CPO, LPO, 400ZR Coherent
5-year adoption forecast per technology

API: GET /api/hype-cycle (all) + GET /api/hype-cycle/:tech (detail)
Live: https://transceiver-db.context-x.org/api/hype-cycle
2026-03-27 23:35:57 +13:00
Rene Fichtmueller
bd3a02ae4b feat: add Flexoptix vendor scraper, 10Gtek pricing scraper, expand news feeds
- Flexoptix vendor scraper: 285 supported switch vendors ingested from
  flexoptix.net/en/supported-vendors/ (our own data, no restrictions)
- 10Gtek Playwright scraper: Chinese OEM competitor pricing (SFP+, SFP28,
  QSFP+, QSFP28, QSFP-DD categories)
- News feeds expanded: added Lightwave, Fierce Telecom, Data Center Knowledge,
  SDxCentral, Cisco Blogs, Arista Blog (11 total sources)
- Scheduler updated: 8 job queues with appropriate intervals
- DB now: 297 vendors, 89 transceivers, 33 news articles (13 relevant)
2026-03-27 23:17:42 +13:00
Rene Fichtmueller
92f42832bf feat: Phase 2 — MCP Server with 12 tools
Implements all 12 MCP tools from CONCEPT document:
- search_transceivers: Full-text + spec filter search with pricing
- check_compatibility: Switch ↔ transceiver compatibility lookup
- get_pricing: Current prices + 30-day history across all vendors
- compare_prices: Multi-vendor price comparison with savings analysis
- get_competitor_stock: Live competitor stock monitoring (sales opportunities)
- suggest_alternatives: Similar spec alternatives optimized for price/availability
- get_templates: FlexBox coding and switch config template finder
- search_knowledge_base: Troubleshooting FAQ search (PostgreSQL full-text)
- search_manuals: Switch manual and datasheet search
- get_hype_cycle: Norton-Bass adoption forecast + Gartner phase classification
- get_market_news: Aggregated news with relevance scoring
- generate_blog_draft: Data-driven blog drafts saved to blog_drafts table

Transport: stdio (MCP protocol 2024-11-05)
Config: .mcp.json for Claude Code integration
Verified: all 12 tools registered, search_transceivers returns DB results
2026-03-27 16:48:34 +13:00
Rene Fichtmueller
e9fb50a248 feat: TIP Phase 0+1 — monorepo, DB schema, API, scraper engine
Phase 0 - Foundation:
- Restructure into npm workspace monorepo (packages/core, api, scraper)
- PostgreSQL 17 + TimescaleDB schema (15 tables incl. hypertables)
- Docker Compose for local dev (PostgreSQL on 5433 + Qdrant)
- Express 5 API on port 3200 with 6 routes
- Seed script to migrate 159 transceivers + 42 standards from npm package
- Erik server setup script + PM2 ecosystem config

Phase 1 - Scraper Engine:
- Crawlee + Playwright framework with pg-boss scheduler
- FS.com scraper (PlaywrightCrawler, anti-bot workaround)
- Optcore.net scraper (WP REST API enumeration + PlaywrightCrawler)
  - Uses /wp-json/wp/v2/product to get 2000+ product URLs
  - Playwright renders individual product pages for price extraction
- Cisco TMG Matrix scraper (compatibility data)
- News RSS aggregator (optics.org, SPIE, Network World, Nature Photonics)
  - Keyword relevance scoring for transceiver/fiber topics
  - xml2js with malformed XML sanitization
- SHA-256 content hashing for change detection (skip unchanged records)
- pg-boss v10 with explicit queue creation before scheduling
2026-03-27 16:27:31 +13:00
Rene Fichtmueller
ddd0a592aa Add live demo link: https://transceiver-db-demo.pages.dev 2026-03-19 15:42:02 +10:00
Rene Fichtmueller
c16e36c628 Fix FLEXOPTIX HQ: Darmstadt, not Braunschweig 2026-03-19 15:37:30 +10:00
Rene Fichtmueller
d04b3cf9e1 Add FLEXOPTIX as compatible vendor with FlexBox USP, update FS.COM comparison 2026-03-19 15:35:37 +10:00
Rene Fichtmueller
7aba2149b5 Add interactive live demo for Cloudflare Pages deployment 2026-03-19 15:28:20 +10:00
Rene Fichtmueller
f3d12afc02 Initial release: optical transceiver database with 89 products, 39 standards, 12 competitors 2026-03-19 15:14:47 +10:00