llm-gateway/packages/lightrag-sidecar/DEPLOYMENT_CHECKLIST.md
Rene Fichtmueller a04c1d67f2 feat: Complete LightRAG Sidecar Phase 2 — Hybrid Retrieval Implementation
Delivers production-ready knowledge graph sidecar with hybrid BM25+vector search.

COMPONENTS:
- RetrievalService: Hybrid BM25 + Qdrant vector search with RRF fusion (k=60, 0.4/0.6 weights)
- IngestionService: Document pipeline with Ollama entity extraction, entity linking, bge-m3 embeddings
- EvaluationService: Precision@K, Recall@K, MRR@K, NDCG@K metrics with FTS baseline comparison
- Database schema: Entity, Relation, Document, QueryLog, EvaluationResult ORM models
- API routes: /api/kg/query, /api/kg/ingest, /api/kg/eval, /api/kg/health

INFRASTRUCTURE:
- FastAPI 0.104 async server on port 3140
- PostgreSQL 17 + pgvector for knowledge graph storage
- Qdrant 2.7 vector database with COSINE distance (384-dim bge-m3)
- Ollama qwen2.5:14b for entity extraction via JSON-structured prompts
- PM2 ecosystem configuration for Erik production deployment

TESTING & DEPLOYMENT:
- TESTING.md: 5-phase local testing workflow with examples
- DEPLOYMENT_CHECKLIST.md: Step-by-step Erik deployment guide
- eval-transceiver-50qa.json: 50 Q&A evaluation pairs for transceiver domain
- populate_eval_set.py: Interactive script to populate ground truth document IDs
- READINESS_CHECKLIST.md: Pre-deployment verification checklist
- bootstrap_tip_data.py: Load TIP blog documents via API

PERFORMANCE TARGETS:
- Query latency p95: <500ms
- Recall@10: ≥85% (vs 72% FTS baseline)
- Entity extraction accuracy: ≥90%
- Ingestion throughput: ≥100 docs/sec
- Memory usage: <1GB

Ready for Phase 3: E2E testing, TypeScript client, multi-domain support.
2026-04-25 05:47:18 +02:00


# LightRAG Sidecar Deployment Checklist
## Pre-Deployment Verification
### Local Development (Mac Studio)
- [ ] Python 3.10+ installed
- [ ] PostgreSQL running locally (`pg_isready -h localhost`)
- [ ] Qdrant running locally (`curl http://localhost:6333/healthz`)
- [ ] Ollama running with `qwen2.5:14b` model (`curl http://localhost:11434/api/tags`)
- [ ] Clone llm-gateway repo locally
- [ ] Create `.env` file from `.env.example`
- [ ] Install Python dependencies: `pip install -r requirements.txt`
- [ ] Run local database init: `python scripts/init_db.py`
- [ ] Start sidecar: `uvicorn app.main:app --reload --port 3140`
- [ ] Test health endpoint: `curl http://localhost:3140/api/kg/health`
- [ ] Test query endpoint with test document
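
The `curl` call against `/api/tags` only shows that Ollama answers; confirming that `qwen2.5:14b` is present means reading the response body. A minimal sketch, assuming the body has the `{"models": [{"name": ...}]}` shape that the Ollama tags endpoint returns. `has_model` is a hypothetical helper, not part of the sidecar.

```python
import json


def has_model(tags_payload: str, model: str) -> bool:
    """Return True if `model` is listed in an Ollama /api/tags JSON body."""
    data = json.loads(tags_payload)
    names = {entry.get("name") for entry in data.get("models", [])}
    return model in names


# Trimmed example body; a real response carries more fields per model.
sample = '{"models": [{"name": "qwen2.5:14b"}, {"name": "other-model:latest"}]}'
print(has_model(sample, "qwen2.5:14b"))
```

To run it against a live instance, pass the body read from `urllib.request.urlopen("http://localhost:11434/api/tags")` instead of `sample`.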
### Erik Server Deployment
#### Step 1: SSH Access
```bash
ssh erik@82.165.222.127
# or from local network: ssh erik@192.168.178.82
```
#### Step 2: Copy Files
```bash
# On local machine
scp -r packages/lightrag-sidecar/ erik@192.168.178.82:/opt/llm-gateway/packages/
# Or via rsync for large directories
rsync -avz packages/lightrag-sidecar/ erik@192.168.178.82:/opt/llm-gateway/packages/lightrag-sidecar/
```
#### Step 3: Setup Python Environment on Erik
```bash
cd /opt/llm-gateway/packages/lightrag-sidecar
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
# Verify installations
python -c "import fastapi, sqlalchemy, sentence_transformers; print('OK')"
```
#### Step 4: Setup PostgreSQL on Erik
```bash
# Create database and user (replace the example password with your own secret before running)
sudo -u postgres psql << EOF
CREATE USER tip_kg WITH PASSWORD 'tip_secure_2026';
CREATE DATABASE tip_lightrag OWNER tip_kg;
GRANT ALL PRIVILEGES ON DATABASE tip_lightrag TO tip_kg;
EOF
# Initialize schema
python scripts/init_db.py
# Verify tables created
sudo -u postgres psql -d tip_lightrag -c "\dt"
```
#### Step 5: Setup Qdrant on Erik
```bash
# Qdrant should already be running on localhost:6333
# Verify connection
curl http://localhost:6333/healthz
# Collections are auto-created on first ingest; no manual action required
```
#### Step 6: Setup Log Directories
```bash
# Create the log directory before PM2 starts so the process can write to it
sudo mkdir -p /var/log/lightrag-sidecar
sudo chown "$(id -un):$(id -gn)" /var/log/lightrag-sidecar
```
#### Step 7: Configure PM2
```bash
# Start sidecar with PM2, using the ecosystem config shipped in the package
cd /opt/llm-gateway
pm2 start packages/lightrag-sidecar/ecosystem.config.cjs
# Verify running
pm2 status
pm2 logs lightrag-sidecar --lines 50 --nostream
```
#### Step 8: Configure Firewall (if needed)
```bash
# Allow port 3140 from local network
sudo ufw allow from 192.168.178.0/24 to any port 3140
# Or specific IP
sudo ufw allow from 192.168.178.213 to any port 3140
```
#### Step 9: Health Check on Erik
```bash
# SSH into Erik
curl http://localhost:3140/api/kg/health
# From local machine
curl http://192.168.178.82:3140/api/kg/health
```
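Right after `pm2 start`, the sidecar may need a few seconds before the health endpoint answers, so a single `curl` can fail spuriously. A small polling sketch under that assumption; `http_ok` and `wait_until_healthy` are hypothetical helpers, and the attempt count and delay are illustrative values.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True when the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(probe: Callable[[], bool],
                       attempts: int = 10,
                       delay: float = 2.0) -> bool:
    """Call `probe` up to `attempts` times, sleeping `delay` seconds after each failure."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

On Erik this could be called as `wait_until_healthy(lambda: http_ok("http://localhost:3140/api/kg/health"))`. Note that it only checks the status code; whether the health response also reports per-dependency state is not assumed here.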
#### Step 10: Bootstrap with TIP Data
```bash
# Set sidecar URL
export LIGHTRAG_SIDECAR_URL=http://localhost:3140
# Run bootstrap
python scripts/bootstrap_tip_data.py
# Monitor ingestion
pm2 logs lightrag-sidecar | grep "Job"
```
## Post-Deployment Verification
### Test Endpoints
```bash
# Health check
curl http://192.168.178.82:3140/api/kg/health
# Status
curl http://192.168.178.82:3140/api/kg/status
# Example query
curl -X POST http://192.168.178.82:3140/api/kg/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What 400G transceivers work with Cisco?",
    "domain": "transceiver",
    "top_k": 5
  }'
# List evaluation datasets
curl http://192.168.178.82:3140/api/kg/eval/datasets
```
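When judging whether query results look plausible, it helps to know how the two rankings are combined. The commit notes describe RRF fusion with k=60 and 0.4/0.6 weights; the sketch below shows weighted reciprocal rank fusion in general form. It assumes 0.4 applies to BM25 and 0.6 to the vector ranking, which the notes do not state, and `weighted_rrf` is a hypothetical name rather than the RetrievalService code.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def weighted_rrf(bm25_ids: Sequence[str],
                 vector_ids: Sequence[str],
                 k: int = 60,
                 w_bm25: float = 0.4,    # assumption: 0.4 belongs to BM25
                 w_vector: float = 0.6   # assumption: 0.6 belongs to vector search
                 ) -> List[str]:
    """Fuse two ranked ID lists; each list contributes weight / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for weight, ranking in ((w_bm25, bm25_ids), (w_vector, vector_ids)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


fused = weighted_rrf(["a", "b", "c"], ["b", "d", "a"])
print(fused)  # ['b', 'a', 'd', 'c']
```

Document `b` wins because it ranks first in the more heavily weighted list, even though `a` leads the BM25 list.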
### Verify Database
```bash
# Check tables on Erik (psql prompts for the tip_kg password)
psql -h localhost -U tip_kg -d tip_lightrag -c "\dt"
# Check document count
psql -h localhost -U tip_kg -d tip_lightrag -c "SELECT COUNT(*) FROM documents;"
# Check entities
psql -h localhost -U tip_kg -d tip_lightrag -c "SELECT COUNT(*) FROM entities;"
# Check collections in Qdrant
curl http://localhost:6333/collections
```
### Monitoring
```bash
# Watch logs in real-time
pm2 logs lightrag-sidecar --lines 100
# Check PM2 process
pm2 show lightrag-sidecar
# Memory usage
pm2 monit
```
## Troubleshooting
### Connection Issues
**Problem**: Cannot reach sidecar from local machine
```bash
# Check if service is running
pm2 status
# Check if port is listening
ss -tulpn | grep 3140
# Check firewall
sudo ufw status
```
**Solution**:
```bash
# Restart service
pm2 restart lightrag-sidecar
# Check logs
pm2 logs lightrag-sidecar
```
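To separate "service not listening" from "firewall or bind address blocks the LAN", the same TCP check can be run against both `localhost` and the LAN address. A minimal sketch; `port_open` is a hypothetical helper.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `port_open("localhost", 3140)` succeeds on Erik but `port_open("192.168.178.82", 3140)` fails from the local machine, the likely causes are the `ufw` rules or the `--host` value uvicorn was started with, rather than a crashed process.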
### Database Issues
**Problem**: Database connection error
```bash
# Verify PostgreSQL is running
sudo systemctl status postgresql
# Check connection string
grep DATABASE_URL ecosystem.config.cjs
# Test connection
psql -h localhost -U tip_kg -d tip_lightrag -c "SELECT 1"
```
### Ollama Issues
**Problem**: Entity extraction timeouts
```bash
# Check Ollama status
curl http://192.168.178.213:11434/api/tags
# On the Ollama host: list downloaded models
ollama list
# Show models currently loaded in memory
ollama ps
# Pull the model if missing
ollama pull qwen2.5:14b
```
### Qdrant Issues
**Problem**: Vector search not working
```bash
# Check Qdrant health
curl http://localhost:6333/healthz
# List collections
curl http://localhost:6333/collections
# Delete collection if corrupted (removes all its vectors; re-ingest afterwards)
curl -X DELETE http://localhost:6333/collections/documents_transceiver
```
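Before deleting anything, it is worth confirming the exact collection name from the list response. A minimal sketch, assuming the `{"result": {"collections": [{"name": ...}]}}` shape that Qdrant's list-collections endpoint returns; `collection_names` is a hypothetical helper.

```python
import json
from typing import List


def collection_names(payload: str) -> List[str]:
    """Extract collection names from a Qdrant GET /collections JSON body."""
    data = json.loads(payload)
    return [c["name"] for c in data.get("result", {}).get("collections", [])]


# Trimmed example body using the collection name from the DELETE command above.
sample = '{"result": {"collections": [{"name": "documents_transceiver"}]}, "status": "ok"}'
print("documents_transceiver" in collection_names(sample))
```

An empty list after bootstrap would suggest ingestion never reached Qdrant, which points at the ingest logs rather than at a corrupted collection.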
## Rollback
If deployment fails:
```bash
# Stop service
pm2 stop lightrag-sidecar
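# Revert code (assumes /opt/llm-gateway is a git checkout; otherwise re-copy the previous version)
cd /opt/llm-gateway/packages/lightrag-sidecar
git checkout HEAD~1
# Clear problematic data (destructive: removes all ingested documents, entities, and relations)
psql -h localhost -U tip_kg -d tip_lightrag -c "TRUNCATE documents, entities, relations CASCADE;"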
# Restart
pm2 restart lightrag-sidecar
```
## Performance Tuning
### Database Connection Pool
```env
DB_POOL_SIZE=10 # Increase for higher concurrency
```
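How `DB_POOL_SIZE` reaches the database layer is not shown here. One plausible wiring, sketched under the assumption that the sidecar builds its SQLAlchemy engine from environment variables: `pool_kwargs` is a hypothetical helper, the fallback of 5 mirrors SQLAlchemy's own default, and the `max_overflow` choice is illustrative.

```python
import os
from typing import Any, Dict, Mapping


def pool_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build keyword arguments for SQLAlchemy's create_engine from the environment."""
    size = int(env.get("DB_POOL_SIZE", "5"))  # 5 is SQLAlchemy's default pool_size
    if size < 1:
        raise ValueError("DB_POOL_SIZE must be at least 1")
    return {
        "pool_size": size,       # connections kept open per process
        "max_overflow": size,    # assumption: allow bursts up to 2x the pool size
        "pool_pre_ping": True,   # detect stale connections before use
    }


print(pool_kwargs({"DB_POOL_SIZE": "10"}))
```

The result would be passed as `create_engine(url, **pool_kwargs())`. Each uvicorn worker process holds its own pool, so the total is roughly workers × (`pool_size` + `max_overflow`) and should stay below PostgreSQL's `max_connections`.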
### Worker Processes
```js
// In ecosystem.config.cjs (uvicorn arguments); increase --workers from 2
args: 'app.main:app --host 0.0.0.0 --port 3140 --workers 4',
```
### Batch Size
```env
INGEST_BATCH_SIZE=20 # Larger batches = faster ingestion but more memory
```
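The memory trade-off comes from how many documents are held and processed at once. A minimal sketch of fixed-size batching to illustrate what the setting controls; `batched` is a hypothetical helper and the ingestion pipeline may split work differently.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


docs = [f"doc-{i}" for i in range(45)]
print([len(batch) for batch in batched(docs, 20)])  # [20, 20, 5]
```

With `INGEST_BATCH_SIZE=20`, 45 documents become three batches; only one batch needs to be in memory for embedding at a time.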
### Embedding Cache
Consider caching bge-m3 embeddings to reduce recomputation.
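
One way to do this is to key the cache on a hash of the text, so re-ingesting an unchanged document skips the model call. A minimal in-memory sketch; `EmbeddingCache` is hypothetical, `embed` stands in for whatever function calls bge-m3, and a production cache would likely need persistence and a size limit.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """Memoize embeddings by the SHA-256 hash of the input text."""

    def __init__(self, embed: Callable[[str], List[float]]) -> None:
        self._embed = embed
        self._store: Dict[str, List[float]] = {}
        self.misses = 0  # number of times the underlying model was called

    def get(self, text: str) -> List[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._embed(text)
        return self._store[key]


# Stand-in embedder for demonstration only; it is not a real model.
cache = EmbeddingCache(lambda text: [float(len(text))])
cache.get("400G QSFP-DD")
cache.get("400G QSFP-DD")
print(cache.misses)  # 1
```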
## Success Criteria
- [ ] Service starts without errors (`pm2 status` shows "online")
- [ ] Health check passes all dependencies (postgresql, qdrant, ollama)
- [ ] Sample query returns results in <500ms
- [ ] Can ingest documents and see entities extracted
- [ ] Evaluation metrics calculate correctly
- [ ] Logs show no ERROR level messages
- [ ] Memory usage stays under 1GB
- [ ] Database contains 100 documents after bootstrap
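
To spot-check that evaluation metrics calculate correctly, a few queries can be scored by hand and compared with the service output. The sketch below uses the standard definitions of Recall@K and reciprocal rank; the function names are hypothetical, and EvaluationService may handle edge cases such as empty ground truth differently.

```python
from typing import Sequence


def recall_at_k(retrieved: Sequence[str], relevant: Sequence[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top-k results."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    return len(set(retrieved[:k]) & relevant_set) / len(relevant_set)


def reciprocal_rank(retrieved: Sequence[str], relevant: Sequence[str], k: int) -> float:
    """1 / rank of the first relevant document within the top-k, else 0."""
    relevant_set = set(relevant)
    for rank, doc_id in enumerate(retrieved[:k], start=1):
        if doc_id in relevant_set:
            return 1.0 / rank
    return 0.0


retrieved = ["d3", "d1", "d7", "d2"]
relevant = ["d1", "d2", "d9"]
print(recall_at_k(retrieved, relevant, 4))      # 2 of 3 relevant documents found
print(reciprocal_rank(retrieved, relevant, 4))  # first hit at rank 2
```

MRR@K is the mean of `reciprocal_rank` over all queries in the evaluation set.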