llm-gateway/packages/lightrag-sidecar/DEPLOYMENT_CHECKLIST.md
Rene Fichtmueller a04c1d67f2 feat: Complete LightRAG Sidecar Phase 2 — Hybrid Retrieval Implementation
Delivers production-ready knowledge graph sidecar with hybrid BM25+vector search.

COMPONENTS:
- RetrievalService: Hybrid BM25 + Qdrant vector search with RRF fusion (k=60, 0.4/0.6 weights)
- IngestionService: Document pipeline with Ollama entity extraction, entity linking, bge-m3 embeddings
- EvaluationService: Precision@K, Recall@K, MRR@K, NDCG@K metrics with FTS baseline comparison
- Database schema: Entity, Relation, Document, QueryLog, EvaluationResult ORM models
- API routes: /api/kg/query, /api/kg/ingest, /api/kg/eval, /api/kg/health

INFRASTRUCTURE:
- FastAPI 0.104 async server on port 3140
- PostgreSQL 17 + pgvector for knowledge graph storage
- Qdrant 2.7 vector database with COSINE distance (1024-dim bge-m3)
- Ollama qwen2.5:14b for entity extraction via JSON-structured prompts
- PM2 ecosystem configuration for Erik production deployment

TESTING & DEPLOYMENT:
- TESTING.md: 5-phase local testing workflow with examples
- DEPLOYMENT_CHECKLIST.md: Step-by-step Erik deployment guide
- eval-transceiver-50qa.json: 50 Q&A evaluation pairs for transceiver domain
- populate_eval_set.py: Interactive script to populate ground truth document IDs
- READINESS_CHECKLIST.md: Pre-deployment verification checklist
- bootstrap_tip_data.py: Load TIP blog documents via API

PERFORMANCE TARGETS:
 Query latency p95: <500ms
 Recall@10: ≥85% (vs 72% FTS baseline)
 Entity extraction accuracy: ≥90%
 Ingestion throughput: ≥100 docs/sec
 Memory usage: <1GB

Ready for Phase 3: E2E testing, TypeScript client, multi-domain support.
2026-04-25 05:47:18 +02:00

LightRAG Sidecar Deployment Checklist

Pre-Deployment Verification

Local Development (Mac Studio)

  • Python 3.10+ installed
  • PostgreSQL running locally (pg_isready)
  • Qdrant running locally (curl http://localhost:6333/healthz)
  • Ollama running with qwen2.5:14b model (curl http://localhost:11434/api/tags)
  • Clone llm-gateway repo locally
  • Create .env file from .env.example
  • Install Python dependencies: pip install -r requirements.txt
  • Run local database init: python scripts/init_db.py
  • Start sidecar: uvicorn app.main:app --reload --port 3140
  • Test health endpoint: curl http://localhost:3140/api/kg/health
  • Test query endpoint with test document
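The `.env` step in the list above tends to drift once `.env.example` gains new keys. A minimal sketch for spotting missing keys, assuming both files use plain `KEY=value` lines; the helper names `parse_env_keys` and `missing_env_keys` are illustrative and not part of the repository:

```python
def parse_env_keys(text: str) -> set[str]:
    """Return the variable names defined in dotenv-style text."""
    keys = set()
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        # Tolerate an optional leading "export ".
        if name.startswith("export "):
            name = name[len("export "):].strip()
        if name:
            keys.add(name)
    return keys


def missing_env_keys(example_text: str, env_text: str) -> list[str]:
    """Keys present in the example file but absent from the real one, sorted."""
    return sorted(parse_env_keys(example_text) - parse_env_keys(env_text))
```

To use it, read `.env.example` and `.env` with `pathlib.Path(...).read_text()` and pass both strings to `missing_env_keys`; an empty list means every documented key is set.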

Erik Server Deployment

Step 1: SSH Access

ssh erik@82.165.222.127
# or from local network: ssh erik@192.168.178.82

Step 2: Copy Files

# On local machine
scp -r packages/lightrag-sidecar/ erik@192.168.178.82:/opt/llm-gateway/packages/

# Or via rsync for large directories
rsync -avz packages/lightrag-sidecar/ erik@192.168.178.82:/opt/llm-gateway/packages/lightrag-sidecar/

Step 3: Setup Python Environment on Erik

cd /opt/llm-gateway/packages/lightrag-sidecar

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

# Verify installations
python -c "import fastapi, sqlalchemy, sentence_transformers; print('OK')"

Step 4: Setup PostgreSQL on Erik

# Create database and user (replace the example password before running)
sudo -u postgres psql <<'EOF'
CREATE USER tip_kg WITH PASSWORD 'tip_secure_2026';
CREATE DATABASE tip_lightrag OWNER tip_kg;
GRANT ALL PRIVILEGES ON DATABASE tip_lightrag TO tip_kg;
EOF

# Initialize schema
python scripts/init_db.py

# Verify tables created
sudo -u postgres psql -d tip_lightrag -c "\dt"

Step 5: Setup Qdrant on Erik

# Qdrant should already be running on localhost:6333
# Verify connection
curl http://localhost:6333/healthz

# Create collections if needed (will be auto-created on first ingest)
# No manual action required

Step 6: Setup Log Directories

# Create the log directory before PM2 first starts the service
sudo mkdir -p /var/log/lightrag-sidecar
sudo chown "$(whoami)":"$(whoami)" /var/log/lightrag-sidecar

Step 7: Configure PM2

# Copy ecosystem config
cp ecosystem.config.cjs /opt/llm-gateway/

# Start sidecar with PM2
cd /opt/llm-gateway
pm2 start packages/lightrag-sidecar/ecosystem.config.cjs

# Verify running
pm2 status
pm2 logs lightrag-sidecar --lines 50 --nostream

Step 8: Configure Firewall (if needed)

# Allow port 3140 from local network
sudo ufw allow from 192.168.178.0/24 to any port 3140
# Or specific IP
sudo ufw allow from 192.168.178.213 to any port 3140

Step 9: Health Check on Erik

# On Erik
curl http://localhost:3140/api/kg/health

# From local machine
curl http://192.168.178.82:3140/api/kg/health
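A 200 response from the health endpoint does not by itself show that every dependency is up. The sketch below checks the three dependencies named in the success criteria. The response shape, a `dependencies` mapping of name to status string, is an assumption; adjust it to whatever `/api/kg/health` actually returns.

```python
import json
import urllib.request

REQUIRED = ("postgresql", "qdrant", "ollama")


def failing_dependencies(payload: dict, required=REQUIRED) -> list[str]:
    """List required dependencies that are missing or not reported as 'ok'.

    ASSUMPTION: payload looks like {"dependencies": {"qdrant": "ok", ...}}.
    """
    deps = payload.get("dependencies", {})
    return [name for name in required if str(deps.get(name, "")).lower() != "ok"]


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/api/kg/health and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/api/kg/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `failing_dependencies(fetch_health("http://192.168.178.82:3140"))` should return an empty list on a healthy deployment.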

Step 10: Bootstrap with TIP Data

# Set sidecar URL
export LIGHTRAG_SIDECAR_URL=http://localhost:3140

# Run bootstrap
python scripts/bootstrap_tip_data.py

# Monitor ingestion
pm2 logs lightrag-sidecar | grep "Job"

Post-Deployment Verification

Test Endpoints

# Health check
curl http://192.168.178.82:3140/api/kg/health

# Status
curl http://192.168.178.82:3140/api/kg/status

# Example query
curl -X POST http://192.168.178.82:3140/api/kg/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What 400G transceivers work with Cisco?",
    "domain": "transceiver",
    "top_k": 5
  }'

# List evaluation datasets
curl http://192.168.178.82:3140/api/kg/eval/datasets
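The example query above can also be issued from Python, which makes it easy to time the round trip against the 500 ms target. This is a sketch using only the standard library; the request body mirrors the curl example, and the function names are illustrative.

```python
import json
import time
import urllib.request


def build_query_request(base_url: str, query: str, domain: str,
                        top_k: int = 5) -> urllib.request.Request:
    """Build the POST request for /api/kg/query without sending it."""
    body = json.dumps({"query": query, "domain": domain, "top_k": top_k})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/kg/query",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(base_url: str, query: str, domain: str,
              top_k: int = 5, timeout: float = 10.0):
    """Send the query and return (decoded JSON response, elapsed milliseconds)."""
    req = build_query_request(base_url, query, domain, top_k)
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        payload = json.load(resp)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return payload, elapsed_ms
```

The measured time includes network latency from the calling machine, so run it on Erik itself when checking the service-side target.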

Verify Database

# Connect to PostgreSQL on Erik
psql -h localhost -U tip_kg -d tip_lightrag

-- Inside psql: check tables
\dt

-- Check document count
SELECT COUNT(*) FROM documents;

-- Check entities
SELECT COUNT(*) FROM entities;

-- Leave psql
\q

# Back in the shell: check collections in Qdrant
curl http://localhost:6333/collections

Monitoring

# Watch logs in real-time (pm2 logs streams by default)
pm2 logs lightrag-sidecar --lines 100

# Check PM2 process
pm2 show lightrag-sidecar

# Memory usage
pm2 monit
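`pm2 monit` is interactive. For a scripted check of the 1GB memory target, `pm2 jlist` prints the process list as JSON, with each entry carrying a `monit.memory` value in bytes. A minimal sketch follows. The limit is interpreted here as 1 GiB, which is an assumption, and `pm2 jlist` can occasionally print non-JSON notices before the array, which this sketch does not handle.

```python
import json
import subprocess

ONE_GIB = 1024 ** 3  # ASSUMPTION: "1GB" is read as 1 GiB


def memory_report(processes: list[dict], name: str,
                  limit_bytes: int = ONE_GIB) -> list[tuple]:
    """Return (pm_id, memory_bytes, within_limit) for each process named `name`."""
    report = []
    for proc in processes:
        if proc.get("name") != name:
            continue
        mem = proc.get("monit", {}).get("memory", 0)
        report.append((proc.get("pm_id"), mem, mem < limit_bytes))
    return report


def pm2_processes() -> list[dict]:
    """Run `pm2 jlist` and parse its JSON output."""
    result = subprocess.run(["pm2", "jlist"], check=True,
                            capture_output=True, text=True)
    return json.loads(result.stdout)
```

With several uvicorn workers, PM2 reports the memory of the process it manages, so compare the result with `ps` or `top` if the numbers look low.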

Troubleshooting

Connection Issues

Problem: Cannot reach sidecar from local machine

# Check if service is running
pm2 status

# Check if port is listening
ss -tulpn | grep 3140

# Check firewall
sudo ufw status

Solution:

# Restart service
pm2 restart lightrag-sidecar

# Check logs
pm2 logs lightrag-sidecar
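To tell a firewall or bind-address problem apart from a stopped service, it helps to test the TCP port from both sides. A minimal sketch using the standard library; the function name is illustrative:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

If `port_is_open("localhost", 3140)` is true on Erik but `port_is_open("192.168.178.82", 3140)` is false from the local machine, the service is running and the likely culprit is the firewall rule or the `--host` binding.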

Database Issues

Problem: Database connection error

# Verify PostgreSQL is running
sudo systemctl status postgresql

# Check connection string
grep DATABASE_URL ecosystem.config.cjs

# Test connection
psql -h localhost -U tip_kg -d tip_lightrag -c "SELECT 1"
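When comparing the configured connection string with the `psql` test above, it is safer to log the URL with the password masked. A sketch, assuming `DATABASE_URL` is a standard `scheme://user:password@host:port/dbname` URL; the default port 5432 is filled in when none is given:

```python
from urllib.parse import urlsplit, urlunsplit


def mask_database_url(url: str) -> str:
    """Return the URL with the password replaced by '***'."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def connection_target(url: str) -> tuple:
    """Return (host, port, database) so they can be compared with the psql flags."""
    parts = urlsplit(url)
    return parts.hostname, parts.port or 5432, parts.path.lstrip("/")
```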

Ollama Issues

Problem: Entity extraction timeouts

# Check that the Ollama API is reachable
curl http://192.168.178.213:11434/api/tags

# On the Ollama host: check that the model has been downloaded
ollama list

# Pull the model if missing
ollama pull qwen2.5:14b
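The `/api/tags` response lists the models available on the Ollama host under a `models` array, each with a `name` field. The check can be scripted as below; the helper names are illustrative.

```python
import json
import urllib.request


def has_model(tags_payload: dict, model: str) -> bool:
    """Return True if `model` appears in an Ollama /api/tags response."""
    names = {entry.get("name") for entry in tags_payload.get("models", [])}
    return model in names


def fetch_tags(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/api/tags and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `has_model(fetch_tags("http://192.168.178.213:11434"), "qwen2.5:14b")` should be true before ingestion is attempted.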

Qdrant Issues

Problem: Vector search not working

# Check Qdrant health
curl http://localhost:6333/healthz

# List collections
curl http://localhost:6333/collections

# Delete the collection if corrupted (this removes its vectors; re-ingest afterwards)
curl -X DELETE http://localhost:6333/collections/documents_transceiver
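A common cause of failing vector search is a collection created with a different vector size than the embedding model produces. The sketch below inspects the JSON returned by `GET /collections/<name>`. It assumes the usual Qdrant response layout (`result.config.params.vectors`) and handles both a single unnamed vector and named vectors. Pass the dimension your embedding model actually outputs; bge-m3 dense vectors are 1024-dimensional.

```python
def vector_params(collection_info: dict) -> dict:
    """Extract the vector configuration from a collection-info response."""
    return collection_info["result"]["config"]["params"]["vectors"]


def dimension_matches(collection_info: dict, expected_dim: int) -> bool:
    """Return True if every configured vector has the expected size."""
    vectors = vector_params(collection_info)
    # Single unnamed vector: {"size": ..., "distance": ...}
    if "size" in vectors:
        return vectors["size"] == expected_dim
    # Named vectors: {"<name>": {"size": ..., "distance": ...}, ...}
    return bool(vectors) and all(
        cfg.get("size") == expected_dim for cfg in vectors.values()
    )
```

A mismatch means the collection has to be recreated with the right size and the documents re-ingested.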

Rollback

If deployment fails:

# Stop service
pm2 stop lightrag-sidecar

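# Revert code (assumes /opt/llm-gateway is a git checkout; if the files were
# copied with scp or rsync, copy the previous version over instead)
cd /opt/llm-gateway/packages/lightrag-sidecar
git checkout HEAD~1

# Clear problematic data (destructive: removes all ingested documents, entities, and relations)
psql -h localhost -U tip_kg -d tip_lightrag -c "TRUNCATE documents, entities, relations CASCADE;"

# Restart
pm2 restart lightrag-sidecar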

Performance Tuning

Database Connection Pool

DB_POOL_SIZE=10  # Increase for higher concurrency

Worker Processes

// In ecosystem.config.cjs (each uvicorn worker is a separate process, so memory use grows with the count)
args: 'app.main:app --host 0.0.0.0 --port 3140 --workers 4'  // Increase from 2

Batch Size

INGEST_BATCH_SIZE=20  # Larger batches = faster ingestion but more memory
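The trade-off is easier to see with the batching spelled out: only one batch is held in memory at a time, so peak memory scales with the batch size rather than the corpus size. This is a generic sketch and not the sidecar's actual ingestion code; `batched` is an illustrative helper.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

With `INGEST_BATCH_SIZE=20`, 45 documents would be processed as batches of 20, 20, and 5.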

Embedding Cache

Consider caching bge-m3 embeddings, keyed by a hash of the input text, so that repeated queries and re-ingested documents skip recomputation.
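One possible shape for such a cache is a small in-process LRU keyed by the SHA-256 of the text. This is a sketch under stated assumptions: `embed_fn` stands in for whatever function the sidecar uses to compute a bge-m3 embedding, and the class is not part of the existing code base.

```python
import hashlib
from collections import OrderedDict
from typing import Callable, Sequence


class EmbeddingCache:
    """LRU cache for embeddings, keyed by the SHA-256 of the input text."""

    def __init__(self, embed_fn: Callable[[str], Sequence[float]],
                 max_items: int = 10_000):
        self._embed_fn = embed_fn  # hypothetical: the real embedding call
        self._max_items = max_items
        self._store = OrderedDict()
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed(self, text: str) -> Sequence[float]:
        key = self._key(text)
        if key in self._store:
            # Mark as most recently used and return the cached vector.
            self._store.move_to_end(key)
            self.hits += 1
            return self._store[key]
        self.misses += 1
        vector = self._embed_fn(text)
        self._store[key] = vector
        # Evict the least recently used entry once the cache is full.
        if len(self._store) > self._max_items:
            self._store.popitem(last=False)
        return vector
```

The cache is per process, so with several uvicorn workers each worker keeps its own copy; a shared store would be needed to avoid that duplication.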

Success Criteria

  • Service starts without errors (pm2 status shows "online")
  • Health check passes all dependencies (postgresql, qdrant, ollama)
  • Sample query returns results in <500ms
  • Can ingest documents and see entities extracted
  • Evaluation metrics calculate correctly
  • Logs show no ERROR level messages
  • Memory usage stays under 1GB
  • Database contains ≥100 documents after bootstrap
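The latency criterion above is a p95 target, so a single fast query does not confirm it. A minimal sketch of a nearest-rank percentile over a list of measured latencies in milliseconds; the helper names and the choice of the nearest-rank method are assumptions, not something the project specifies:

```python
import math
from typing import Iterable


def percentile(samples: Iterable[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_latency_target(samples_ms: Iterable[float],
                         target_ms: float = 500.0, pct: int = 95) -> bool:
    """Return True if the pct-th percentile latency is below target_ms."""
    return percentile(samples_ms, pct) < target_ms
```

Collect at least a few dozen query timings before relying on the result; with very few samples the p95 is simply the slowest query.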