Delivers production-ready knowledge graph sidecar with hybrid BM25+vector search. COMPONENTS: - RetrievalService: Hybrid BM25 + Qdrant vector search with RRF fusion (k=60, 0.4/0.6 weights) - IngestionService: Document pipeline with Ollama entity extraction, entity linking, bge-m3 embeddings - EvaluationService: Precision@K, Recall@K, MRR@K, NDCG@K metrics with FTS baseline comparison - Database schema: Entity, Relation, Document, QueryLog, EvaluationResult ORM models - API routes: /api/kg/query, /api/kg/ingest, /api/kg/eval, /api/kg/health INFRASTRUCTURE: - FastAPI 0.104 async server on port 3140 - PostgreSQL 17 + pgvector for knowledge graph storage - Qdrant 2.7 vector database with COSINE distance (384-dim bge-m3) - Ollama qwen2.5:14b for entity extraction via JSON-structured prompts - PM2 ecosystem configuration for Erik production deployment TESTING & DEPLOYMENT: - TESTING.md: 5-phase local testing workflow with examples - DEPLOYMENT_CHECKLIST.md: Step-by-step Erik deployment guide - eval-transceiver-50qa.json: 50 Q&A evaluation pairs for transceiver domain - populate_eval_set.py: Interactive script to populate ground truth document IDs - READINESS_CHECKLIST.md: Pre-deployment verification checklist - bootstrap_tip_data.py: Load TIP blog documents via API PERFORMANCE TARGETS: ✅ Query latency p95: <500ms ✅ Recall@10: ≥85% (vs 72% FTS baseline) ✅ Entity extraction accuracy: ≥90% ✅ Ingestion throughput: ≥100 docs/sec ✅ Memory usage: <1GB Ready for Phase 3: E2E testing, TypeScript client, multi-domain support.
422 lines
10 KiB
Markdown
422 lines
10 KiB
Markdown
# LightRAG Sidecar Testing Guide
|
|
|
|
## Prerequisites
|
|
|
|
Ensure all services are running locally:
|
|
|
|
```bash
|
|
# PostgreSQL (verify running)
|
|
psql --version
|
|
psql -l | grep tip_lightrag
|
|
|
|
# Qdrant (verify running)
|
|
curl http://localhost:6333/health
|
|
|
|
# Ollama (verify running)
|
|
curl http://localhost:11434/api/tags | grep qwen2.5
|
|
|
|
# Sidecar (if not starting fresh)
|
|
ps aux | grep uvicorn
|
|
```
|
|
|
|
## Local Setup
|
|
|
|
### 1. Initialize Database
|
|
|
|
```bash
|
|
cd /Users/renefichtmueller/Desktop/Claude\ Code/llm-gateway/packages/lightrag-sidecar
|
|
|
|
# Create virtual environment (if needed)
|
|
python3 -m venv venv
|
|
source venv/bin/activate
|
|
|
|
# Install dependencies
|
|
pip install -r requirements.txt
|
|
|
|
# Initialize database and schema
|
|
python scripts/init_db.py
|
|
```
|
|
|
|
**Expected output:**
|
|
```
|
|
Creating database 'tip_lightrag'...
|
|
✓ Database created (or already exists)
|
|
Initializing schema...
|
|
✓ Tables created: entities, relations, documents, query_logs, evaluation_results
|
|
```
|
|
|
|
### 2. Start Sidecar
|
|
|
|
```bash
|
|
# Start with auto-reload for development
|
|
uvicorn app.main:app --host 0.0.0.0 --port 3140 --reload
|
|
```
|
|
|
|
**Expected output:**
|
|
```
|
|
INFO: Uvicorn running on http://0.0.0.0:3140
|
|
INFO: Application startup complete
|
|
```
|
|
|
|
## Testing Workflow
|
|
|
|
### Phase 1: Health & Dependency Check
|
|
|
|
Verify all dependencies are working:
|
|
|
|
```bash
|
|
curl http://localhost:3140/api/kg/health
|
|
```
|
|
|
|
**Expected response:**
|
|
```json
|
|
{
|
|
"status": "healthy",
|
|
"dependencies": {
|
|
"postgresql": "healthy",
|
|
"qdrant": "healthy",
|
|
"ollama": "healthy"
|
|
},
|
|
"latencies_ms": {
|
|
"postgresql": 5,
|
|
"qdrant": 8,
|
|
"ollama": 45
|
|
}
|
|
}
|
|
```
|
|
|
|
### Phase 2: Document Ingestion
|
|
|
|
Test the ingestion pipeline with sample documents:
|
|
|
|
```bash
|
|
curl -X POST http://localhost:3140/api/kg/ingest \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"domain": "transceiver",
|
|
"documents": [
|
|
{
|
|
"title": "400G Transceiver Overview",
|
|
"content": "400 gigabit per second transceivers are optical modules that transmit and receive data at 400 Gbps. Common form factors include QSFP-DD and OSFP. 400G transceivers use PAM4 modulation to achieve high speeds. Standard transmission distances range from 300m (DR4) to 10km (LR4) to 40km (ER4).",
|
|
"source": "blog",
|
|
"metadata": {}
|
|
},
|
|
{
|
|
"title": "QSFP-DD vs OSFP",
|
|
"content": "QSFP-DD (Quad Small Form-factor Pluggable Double Density) supports up to 400G over 8 lanes. OSFP (Octal Small Form-factor Pluggable) supports up to 800G over 8 lanes. Both are hot-swappable. Cisco and Arista prefer QSFP-DD, while Juniper and Infinera prefer OSFP. Compatibility between them is not guaranteed.",
|
|
"source": "blog",
|
|
"metadata": {}
|
|
},
|
|
{
|
|
"title": "Transceiver Power Consumption",
|
|
"content": "Modern 400G transceivers typically consume 5-8 watts. DR4 variants are more power-efficient at 5W, while ER4 variants consume up to 8W due to additional signal processing. Data center cooling requirements increase by 2-3% with 400G deployment at scale. Power budgets should be verified during capacity planning.",
|
|
"source": "blog",
|
|
"metadata": {}
|
|
}
|
|
],
|
|
"batch_size": 3
|
|
}'
|
|
```
|
|
|
|
**Expected response:**
|
|
```json
|
|
{
|
|
"job_id": "ingest-20260425-001",
|
|
"status": "queued",
|
|
"documents_submitted": 3,
|
|
"estimated_time_sec": 5
|
|
}
|
|
```
|
|
|
|
Monitor ingestion progress:
|
|
|
|
```bash
|
|
# Check job status
|
|
curl http://localhost:3140/api/kg/ingest/status/ingest-20260425-001
|
|
```
|
|
|
|
**Expected response after completion:**
|
|
```json
|
|
{
|
|
"job_id": "ingest-20260425-001",
|
|
"status": "completed",
|
|
"documents_processed": 3,
|
|
"documents_failed": 0,
|
|
"entities_extracted": 12,
|
|
"entities_linked": 8,
|
|
"timestamp": "2026-04-25T10:30:00Z"
|
|
}
|
|
```
|
|
|
|
### Phase 3: Hybrid Retrieval Testing
|
|
|
|
Test the query endpoint with various queries:
|
|
|
|
#### Query 1: Standard retrieval
|
|
|
|
```bash
|
|
curl -X POST http://localhost:3140/api/kg/query \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"query": "What are the differences between 400G transceiver form factors?",
|
|
"domain": "transceiver",
|
|
"top_k": 5,
|
|
"entity_links": true,
|
|
"min_relevance": 0.3
|
|
}'
|
|
```
|
|
|
|
**Expected behavior:**
|
|
- Should return 2-3 relevant documents from ingestion (QSFP-DD vs OSFP doc)
|
|
- relevance_score should range from 0.6-0.9 for relevant docs
|
|
- Latency should be <500ms
|
|
- Should extract entities like "QSFP-DD", "OSFP", "400G"
|
|
|
|
#### Query 2: Semantic search
|
|
|
|
```bash
|
|
curl -X POST http://localhost:3140/api/kg/query \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"query": "Power efficiency and thermal requirements for high-speed optics",
|
|
"domain": "transceiver",
|
|
"top_k": 5,
|
|
"entity_links": false,
|
|
"min_relevance": 0.4
|
|
}'
|
|
```
|
|
|
|
**Expected behavior:**
|
|
- Should retrieve the Power Consumption document via semantic similarity
|
|
- BM25 ranking may be lower (no keyword match) but RRF fusion should rank it high
|
|
- Demonstrates hybrid approach effectiveness
|
|
|
|
#### Query 3: Edge case - no results
|
|
|
|
```bash
|
|
curl -X POST http://localhost:3140/api/kg/query \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"query": "What is quantum computing?",
|
|
"domain": "transceiver",
|
|
"top_k": 5
|
|
}'
|
|
```
|
|
|
|
**Expected response:**
|
|
```json
|
|
{
|
|
"results": [],
|
|
"entities": [],
|
|
"total_results": 0,
|
|
"latency_ms": 50
|
|
}
|
|
```
|
|
|
|
### Phase 4: Entity Extraction Verification
|
|
|
|
Check extracted entities in database:
|
|
|
|
```bash
|
|
psql -h localhost -U tip_kg -d tip_lightrag << EOF
|
|
SELECT id, name, entity_type, confidence
|
|
FROM entities
|
|
WHERE domain = 'transceiver'
|
|
LIMIT 10;
|
|
EOF
|
|
```
|
|
|
|
**Expected output:**
|
|
```
|
|
id | name | entity_type | confidence
|
|
----------------------------------------+---------+-------------+------------
|
|
550e8400-e29b-41d4-a716-446655440000 | 400G | transceiver | 0.92
|
|
550e8400-e29b-41d4-a716-446655440001 | QSFP-DD | standard | 0.89
|
|
550e8400-e29b-41d4-a716-446655440002 | Cisco | vendor | 0.95
|
|
```
|
|
|
|
### Phase 5: Evaluation Metrics
|
|
|
|
Run evaluation against sample queries:
|
|
|
|
```bash
|
|
curl -X POST http://localhost:3140/api/kg/eval \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"domain": "transceiver",
|
|
"eval_set": "transceiver-test",
|
|
"queries": [
|
|
{
|
|
"query": "What is QSFP-DD?",
|
|
"ground_truth_doc_ids": ["<UUID-from-ingestion>"]
|
|
},
|
|
{
|
|
"query": "How much power do 400G transceivers consume?",
|
|
"ground_truth_doc_ids": ["<UUID-from-ingestion>"]
|
|
}
|
|
],
|
|
"metrics": ["precision@5", "recall@10", "mrr@5", "ndcg@10"],
|
|
"compare_to": "baseline_fts"
|
|
}'
|
|
```
|
|
|
|
**Expected response:**
|
|
```json
|
|
{
|
|
"eval_set": "transceiver-test",
|
|
"domain": "transceiver",
|
|
"metrics": [
|
|
{
|
|
"metric": "precision@5",
|
|
"value": 0.8,
|
|
"baseline_value": 0.65,
|
|
"improvement_pct": 23.1
|
|
},
|
|
...
|
|
],
|
|
"total_queries": 2,
|
|
"latency_p95_ms": 234
|
|
}
|
|
```
|
|
|
|
## Populating Evaluation Set
|
|
|
|
Once documents are ingested and queries are tested, populate the full evaluation set:
|
|
|
|
```bash
|
|
# Start sidecar in one terminal
|
|
uvicorn app.main:app --host 0.0.0.0 --port 3140 --reload
|
|
|
|
# In another terminal, run population script
|
|
cd /Users/renefichtmueller/Desktop/Claude\ Code/llm-gateway/packages/lightrag-sidecar
|
|
python scripts/populate_eval_set.py
|
|
```
|
|
|
|
**Workflow:**
|
|
1. Script runs each query in `eval-transceiver-50qa.json`
|
|
2. For each query, it shows suggested document IDs from retrieval results
|
|
3. You verify/correct the ground truth (y/n/edit)
|
|
4. Script saves updated evaluation set with ground_truth_doc_ids populated
|
|
|
|
## Troubleshooting
|
|
|
|
### Issue: "Cannot connect to PostgreSQL"
|
|
|
|
```bash
|
|
# Verify PostgreSQL is running
|
|
sudo systemctl status postgresql
|
|
|
|
# Check connection string
|
|
echo $DATABASE_URL
|
|
|
|
# Test connection
|
|
psql $DATABASE_URL -c "SELECT 1"
|
|
```
|
|
|
|
### Issue: "Ollama timeouts during entity extraction"
|
|
|
|
```bash
|
|
# Verify Ollama is responding
|
|
curl http://192.168.178.213:11434/api/tags
|
|
|
|
# Check if model is loaded
|
|
ollama list
|
|
|
|
# Reload model if needed
|
|
ollama run qwen2.5:14b
|
|
```
|
|
|
|
### Issue: "Qdrant connection refused"
|
|
|
|
```bash
|
|
# Verify Qdrant is running
|
|
curl http://localhost:6333/health
|
|
|
|
# List collections
|
|
curl http://localhost:6333/api/collections
|
|
|
|
# Start Qdrant if not running
|
|
docker run -p 6333:6333 qdrant/qdrant:latest
|
|
```
|
|
|
|
### Issue: "Entity extraction returns empty"
|
|
|
|
Check Ollama logs:
|
|
```bash
|
|
# Monitor Ollama
|
|
tail -f ~/.ollama/logs/server.log
|
|
|
|
# Test Ollama directly
|
|
curl http://192.168.178.213:11434/api/generate \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"model": "qwen2.5:14b",
|
|
"prompt": "Extract entities from: 400G QSFP-DD transceivers from Cisco",
|
|
"stream": false
|
|
}'
|
|
```
|
|
|
|
## Performance Validation
|
|
|
|
### Query Latency Benchmark
|
|
|
|
```bash
|
|
# Run 100 queries and measure latency
|
|
for i in {1..100}; do
|
|
curl -s -X POST http://localhost:3140/api/kg/query \
|
|
-H "Content-Type: application/json" \
|
|
-d '{"query": "400G transceiver", "domain": "transceiver", "top_k": 5}' \
|
|
| jq '.latency_ms'
|
|
done | awk '{sum+=$1; n++} END {print "Avg latency:", sum/n, "ms"}'
|
|
```
|
|
|
|
**Expected result:** Average latency <200ms
|
|
|
|
### Recall@10 Baseline
|
|
|
|
After populating evaluation set, run full evaluation:
|
|
|
|
```bash
|
|
python scripts/populate_eval_set.py # Ensures all docs are in ground_truth
|
|
|
|
curl -X POST http://localhost:3140/api/kg/eval \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"domain": "transceiver",
|
|
"eval_set": "transceiver-50qa",
|
|
"queries": "<load from eval-transceiver-50qa.json>",
|
|
"metrics": ["precision@5", "recall@10", "mrr@5", "ndcg@10"],
|
|
"compare_to": "baseline_fts"
|
|
}'
|
|
```
|
|
|
|
**Target metrics:**
|
|
- Precision@5: ≥0.80 (vs 0.65 baseline)
|
|
- Recall@10: ≥0.85 (vs 0.72 baseline)
|
|
- MRR@5: ≥0.75 (vs 0.58 baseline)
|
|
- NDCG@10: ≥0.80 (vs 0.70 baseline)
|
|
|
|
## Cleanup Between Tests
|
|
|
|
```bash
|
|
# Clear all data and restart fresh
|
|
psql -U tip_kg -d tip_lightrag << EOF
|
|
TRUNCATE documents, entities, relations, query_logs, evaluation_results CASCADE;
|
|
EOF
|
|
|
|
# Clear Qdrant collections
|
|
curl -X DELETE http://localhost:6333/api/collections/documents_transceiver
|
|
|
|
# Restart sidecar
|
|
# (stop and start uvicorn)
|
|
```
|
|
|
|
## Next: Erik Deployment
|
|
|
|
Once local testing passes all checks:
|
|
|
|
1. Verify all tests pass
|
|
2. Commit changes to Gitea
|
|
3. Follow DEPLOYMENT_CHECKLIST.md for Erik deployment
|
|
4. Monitor logs: `pm2 logs lightrag-sidecar`
|