llm-gateway/packages/lightrag-sidecar/data/eval-transceiver-50qa.json
Rene Fichtmueller a04c1d67f2 feat: Complete LightRAG Sidecar Phase 2 — Hybrid Retrieval Implementation
Delivers production-ready knowledge graph sidecar with hybrid BM25+vector search.

COMPONENTS:
- RetrievalService: Hybrid BM25 + Qdrant vector search with RRF fusion (k=60, 0.4/0.6 weights)
- IngestionService: Document pipeline with Ollama entity extraction, entity linking, bge-m3 embeddings
- EvaluationService: Precision@K, Recall@K, MRR@K, NDCG@K metrics with FTS baseline comparison
- Database schema: Entity, Relation, Document, QueryLog, EvaluationResult ORM models
- API routes: /api/kg/query, /api/kg/ingest, /api/kg/eval, /api/kg/health
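
The RRF fusion step named under RetrievalService can be sketched as below. This is a minimal illustration, not the sidecar's actual code: the function name weighted_rrf and the doc IDs are hypothetical, and assigning 0.4 to BM25 and 0.6 to the vector ranking is an assumption, since the message above does not say which weight belongs to which retriever.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of doc IDs with weighted Reciprocal Rank Fusion.

    Each document scores sum(weight / (k + rank)) over the lists it appears in,
    where rank starts at 1 for the top hit of each list.
    """
    scores = defaultdict(float)
    for source, ranked_ids in rankings.items():
        weight = weights[source]
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the two retrievers.
bm25_hits = ["doc-a", "doc-b", "doc-c"]
vector_hits = ["doc-b", "doc-d", "doc-a"]

fused = weighted_rrf(
    {"bm25": bm25_hits, "vector": vector_hits},
    {"bm25": 0.4, "vector": 0.6},  # assumed mapping of the 0.4/0.6 weights
)
fused_ids = [doc_id for doc_id, _ in fused]
```

With k=60 the rank differences within a list are heavily damped, so a document found by both retrievers tends to outrank one found near the top of only a single list.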

INFRASTRUCTURE:
- FastAPI 0.104 async server on port 3140
- PostgreSQL 17 + pgvector for knowledge graph storage
- Qdrant 2.7 vector database with COSINE distance (1024-dim bge-m3)
- Ollama qwen2.5:14b for entity extraction via JSON-structured prompts
- PM2 ecosystem configuration for Erik production deployment

TESTING & DEPLOYMENT:
- TESTING.md: 5-phase local testing workflow with examples
- DEPLOYMENT_CHECKLIST.md: Step-by-step Erik deployment guide
- eval-transceiver-50qa.json: 50 Q&A evaluation pairs for transceiver domain
- populate_eval_set.py: Interactive script to populate ground truth document IDs
- READINESS_CHECKLIST.md: Pre-deployment verification checklist
- bootstrap_tip_data.py: Load TIP blog documents via API

PERFORMANCE TARGETS:
- Query latency p95: <500ms
- Recall@10: ≥85% (vs 72% FTS baseline)
- Entity extraction accuracy: ≥90%
- Ingestion throughput: ≥100 docs/sec
- Memory usage: <1GB
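
For reference, the metrics behind the Recall@10 target can be computed as in the sketch below. It assumes binary relevance (a document is either in ground_truth_doc_ids or not); the function names and doc IDs are illustrative and are not taken from EvaluationService.

```python
import math


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for doc_id in retrieved[:k] if doc_id in relevant) / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mrr_at_k(retrieved, relevant, k):
    """Reciprocal rank of the first relevant result within the top k, else 0."""
    for rank, doc_id in enumerate(retrieved[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(retrieved, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Hypothetical ranking and ground truth for one query.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d9"}

p3 = precision_at_k(retrieved, relevant, 3)
r3 = recall_at_k(retrieved, relevant, 3)
mrr3 = mrr_at_k(retrieved, relevant, 3)
ndcg3 = ndcg_at_k(retrieved, relevant, 3)
```

Note that recall is undefined for a query with no ground truth; the sketch returns 0.0 in that case, so unlabeled queries should be excluded before averaging.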

Ready for Phase 3: E2E testing, TypeScript client, multi-domain support.
2026-04-25 05:47:18 +02:00

{
"eval_set": "transceiver-50qa",
"domain": "transceiver",
"description": "50 Q&A pairs for evaluating hybrid retrieval in the 400G/800G transceiver domain",
"created_at": "2026-04-25",
"queries": [
{
"query_id": 1,
"query": "What 400G transceivers work with Cisco Nexus 9300-GX?",
"ground_truth_doc_ids": []
},
{
"query_id": 2,
"query": "Which vendors offer QSFP-DD 400G optics compatible with Arista switches?",
"ground_truth_doc_ids": []
},
{
"query_id": 3,
"query": "What is the difference between QSFP-DD and OSFP form factors?",
"ground_truth_doc_ids": []
},
{
"query_id": 4,
"query": "How far can 400G CWDM4 transceivers transmit over single-mode fiber?",
"ground_truth_doc_ids": []
},
{
"query_id": 5,
"query": "What are the power consumption specs for 400G DR4 optics?",
"ground_truth_doc_ids": []
},
{
"query_id": 6,
"query": "Which 400G transceiver standards are defined in IEEE 802.3?",
"ground_truth_doc_ids": []
},
{
"query_id": 7,
"query": "What vendors manufacture 800G transceivers for 2026 deployment?",
"ground_truth_doc_ids": []
},
{
"query_id": 8,
"query": "Are 400G FR4 and 400G LR4 transceivers interchangeable?",
"ground_truth_doc_ids": []
},
{
"query_id": 9,
"query": "What transceiver types support hot-swap capability in production networks?",
"ground_truth_doc_ids": []
},
{
"query_id": 10,
"query": "How do 400G ER8 transceivers differ from 400G LR8?",
"ground_truth_doc_ids": []
},
{
"query_id": 11,
"query": "What is the cost comparison between 400G and 2x200G transceiver solutions?",
"ground_truth_doc_ids": []
},
{
"query_id": 12,
"query": "Which transceiver vendors offer 3-year warranty on 400G optics?",
"ground_truth_doc_ids": []
},
{
"query_id": 13,
"query": "What optical performance metrics matter most for data center 400G deployment?",
"ground_truth_doc_ids": []
},
{
"query_id": 14,
"query": "Are Cisco and Juniper 400G transceivers cross-compatible?",
"ground_truth_doc_ids": []
},
{
"query_id": 15,
"query": "What is PSM4 transceiver technology and when should it be used?",
"ground_truth_doc_ids": []
},
{
"query_id": 16,
"query": "How do coherent 400G transceivers improve reach vs standard 400G?",
"ground_truth_doc_ids": []
},
{
"query_id": 17,
"query": "What transceiver pluggable options does hyperscaler AWS prefer for 400G?",
"ground_truth_doc_ids": []
},
{
"query_id": 18,
"query": "What is the temperature operating range for Ericsson 400G transceivers?",
"ground_truth_doc_ids": []
},
{
"query_id": 19,
"query": "Which 400G transceiver is best for metro area network deployments?",
"ground_truth_doc_ids": []
},
{
"query_id": 20,
"query": "How do digital coherent optics enable 800G over legacy fiber?",
"ground_truth_doc_ids": []
},
{
"query_id": 21,
"query": "What SFF-8024 form factors support 400G transceivers?",
"ground_truth_doc_ids": []
},
{
"query_id": 22,
"query": "Are there open-source transceiver drivers for 400G-capable switches?",
"ground_truth_doc_ids": []
},
{
"query_id": 23,
"query": "What is the lead time for Mellanox ConnectX-7 400G transceivers?",
"ground_truth_doc_ids": []
},
{
"query_id": 24,
"query": "How do PAM4 modulation transceivers achieve 400G speeds?",
"ground_truth_doc_ids": []
},
{
"query_id": 25,
"query": "What transceiver brands offer best price-to-performance ratio in 2026?",
"ground_truth_doc_ids": []
},
{
"query_id": 26,
"query": "Are multimode fiber 400G transceivers suitable for enterprise data centers?",
"ground_truth_doc_ids": []
},
{
"query_id": 27,
"query": "What compliance certifications should 400G transceivers have for CSP networks?",
"ground_truth_doc_ids": []
},
{
"query_id": 28,
"query": "How do gray market 400G transceivers differ from authorized vendor stock?",
"ground_truth_doc_ids": []
},
{
"query_id": 29,
"query": "What monitoring and telemetry standards apply to 400G transceiver health?",
"ground_truth_doc_ids": []
},
{
"query_id": 30,
"query": "Which 400G transceiver models have known interoperability issues with specific switches?",
"ground_truth_doc_ids": []
},
{
"query_id": 31,
"query": "What is the roadmap for 1.6T and 3.2T transceiver development?",
"ground_truth_doc_ids": []
},
{
"query_id": 32,
"query": "How do transceiver power consumption budgets affect data center cooling?",
"ground_truth_doc_ids": []
},
{
"query_id": 33,
"query": "What frequency bands do 400G wireless transceivers operate in?",
"ground_truth_doc_ids": []
},
{
"query_id": 34,
"query": "Are 400G transceivers future-proof for 10+ year network deployments?",
"ground_truth_doc_ids": []
},
{
"query_id": 35,
"query": "What procurement strategy minimizes transceiver obsolescence risk?",
"ground_truth_doc_ids": []
},
{
"query_id": 36,
"query": "How do environmental factors (temperature, humidity, pressure) affect 400G optics?",
"ground_truth_doc_ids": []
},
{
"query_id": 37,
"query": "What are the eye diagram specifications for 400G DR4 transceivers?",
"ground_truth_doc_ids": []
},
{
"query_id": 38,
"query": "Which 400G transceiver vendors have production facilities in multiple geographies?",
"ground_truth_doc_ids": []
},
{
"query_id": 39,
"query": "What debugging tools and vendor support are available for 400G transceiver troubleshooting?",
"ground_truth_doc_ids": []
},
{
"query_id": 40,
"query": "How do RoHS and REACH compliance requirements affect 400G transceiver sourcing?",
"ground_truth_doc_ids": []
},
{
"query_id": 41,
"query": "What is the typical lifespan and replacement cycle for 400G transceivers?",
"ground_truth_doc_ids": []
},
{
"query_id": 42,
"query": "Are 400G transceivers with built-in encryption supported by major vendors?",
"ground_truth_doc_ids": []
},
{
"query_id": 43,
"query": "What training or certification exists for 400G transceiver installation and maintenance?",
"ground_truth_doc_ids": []
},
{
"query_id": 44,
"query": "How do tunable 400G transceivers compare to fixed-wavelength models?",
"ground_truth_doc_ids": []
},
{
"query_id": 45,
"query": "What standards govern transceiver backward compatibility between generations?",
"ground_truth_doc_ids": []
},
{
"query_id": 46,
"query": "Are there open standards for 400G optical subassemblies and components?",
"ground_truth_doc_ids": []
},
{
"query_id": 47,
"query": "What vendor ecosystem exists for 400G transceiver management and orchestration?",
"ground_truth_doc_ids": []
},
{
"query_id": 48,
"query": "How do 400G transceiver power budgets scale to 800G and beyond?",
"ground_truth_doc_ids": []
},
{
"query_id": 49,
"query": "What are the failure modes and MTBF statistics for 400G transceivers?",
"ground_truth_doc_ids": []
},
{
"query_id": 50,
"query": "Which 400G transceivers offer the best total cost of ownership over 5 years?",
"ground_truth_doc_ids": []
}
]
}
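
Every ground_truth_doc_ids list in this file is still empty, so the set cannot produce meaningful scores until it is populated. A small check along the following lines could flag unlabeled queries before an evaluation run. It is a sketch only: the helper name unlabeled_query_ids and the ID "doc-17" are hypothetical, and an inline sample stands in for the real file.

```python
import json

# Inline sample with the same shape as eval-transceiver-50qa.json.
# "doc-17" is a made-up document ID used only for illustration.
SAMPLE = """
{
  "eval_set": "transceiver-50qa",
  "domain": "transceiver",
  "queries": [
    {"query_id": 1,
     "query": "What 400G transceivers work with Cisco Nexus 9300-GX?",
     "ground_truth_doc_ids": []},
    {"query_id": 2,
     "query": "Which vendors offer QSFP-DD 400G optics compatible with Arista switches?",
     "ground_truth_doc_ids": ["doc-17"]}
  ]
}
"""


def unlabeled_query_ids(eval_set):
    """Return the query_id of every query whose ground truth list is empty."""
    return [
        query["query_id"]
        for query in eval_set["queries"]
        if not query["ground_truth_doc_ids"]
    ]


eval_set = json.loads(SAMPLE)
missing = unlabeled_query_ids(eval_set)
print(f"{len(missing)} of {len(eval_set['queries'])} queries lack ground truth: {missing}")
```

To run it against the real file, replace json.loads(SAMPLE) with json.load on an open handle to the JSON file.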