llm-gateway/DEPLOYMENT_BLOCKED.md
Rene Fichtmueller a04c1d67f2 feat: Complete LightRAG Sidecar Phase 2 — Hybrid Retrieval Implementation
Delivers production-ready knowledge graph sidecar with hybrid BM25+vector search.

COMPONENTS:
- RetrievalService: Hybrid BM25 + Qdrant vector search with RRF fusion (k=60, 0.4/0.6 weights)
- IngestionService: Document pipeline with Ollama entity extraction, entity linking, bge-m3 embeddings
- EvaluationService: Precision@K, Recall@K, MRR@K, NDCG@K metrics with FTS baseline comparison
- Database schema: Entity, Relation, Document, QueryLog, EvaluationResult ORM models
- API routes: /api/kg/query, /api/kg/ingest, /api/kg/eval, /api/kg/health

INFRASTRUCTURE:
- FastAPI 0.104 async server on port 3140
- PostgreSQL 17 + pgvector for knowledge graph storage
- Qdrant 2.7 vector database with COSINE distance (384-dim bge-m3)
- Ollama qwen2.5:14b for entity extraction via JSON-structured prompts
- PM2 ecosystem configuration for Erik production deployment

TESTING & DEPLOYMENT:
- TESTING.md: 5-phase local testing workflow with examples
- DEPLOYMENT_CHECKLIST.md: Step-by-step Erik deployment guide
- eval-transceiver-50qa.json: 50 Q&A evaluation pairs for transceiver domain
- populate_eval_set.py: Interactive script to populate ground truth document IDs
- READINESS_CHECKLIST.md: Pre-deployment verification checklist
- bootstrap_tip_data.py: Load TIP blog documents via API

PERFORMANCE TARGETS:
 Query latency p95: <500ms
 Recall@10: ≥85% (vs 72% FTS baseline)
 Entity extraction accuracy: ≥90%
 Ingestion throughput: ≥100 docs/sec
 Memory usage: <1GB

Ready for Phase 3: E2E testing, TypeScript client, multi-domain support.
2026-04-25 05:47:18 +02:00

122 lines
3.6 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Phase 2F Deployment Blocked — Erik Complete Network Outage
**Date**: 2026-04-19 21:55 UTC
**Status**: BLOCKED — Erik server offline (no network response)
**Commit**: 2ca77d0 (pushed to Gitea)
**Phase 2F Engineering**: ✅ 100% Complete
## Issue
Automated deployment script failed at Erik connection step:
```
>> 3. Deploying on Erik (82.165.222.127)
[INFO] Connecting via SSH...
ssh: connect to host 82.165.222.127 port 22: Connection refused
```
## Current Status (Updated 21:55 UTC)
Erik **completely offline** — system crashed or hung during reboot:
- **SSH**: Connection refused (sshd not running)
- **Ping**: 100% packet loss (0/3 responses) — **network-level unreachable**
- **Last uptime**: 5 minutes before full disconnect
- **Process count**: 37 node processes were still initializing
- **Likely cause**: Boot-time crash in PM2/systemd services or IONOS infrastructure issue
## Network Diagnosis
```
1. SSH echo test:
ssh root@82.165.222.127 'echo OK'
→ Connection refused (40 attempts, all failed)
2. Ping test:
ping -c 3 82.165.222.127
→ 100% packet loss (host completely unreachable at network layer)
3. Time: 2026-04-19 21:5421:55 UTC
```
## Workaround (When Erik Returns Online)
```bash
# Manual deploy steps (from PHASE_2F_DEPLOYMENT.md):
ssh root@82.165.222.127
# On Erik:
cd /opt/llm-gateway
git fetch origin
git reset --hard origin/main # Pulls commit 2ca77d0
npm install
npm run build
pm2 reload llm-gateway llm-learning --update-env
pm2 status
pm2 logs llm-gateway --lines 20
```
## Phase 2F Deliverables (Complete)
✅ Commit pushed to Gitea: `2ca77d0`
✅ Code changes ready for deployment:
- Client SDK with offline Ollama fallback
- 4 ADRs documented (0001-0004)
- Integration test suite (13/14 tests passing)
- PHASE_2F_DEPLOYMENT.md guide
⏸️ Awaiting: Erik server to come back online
## Pivot Strategy: Phase 2G on Local Infrastructure
**While Erik is offline**, deploy Phase 2F to available local infrastructure:
### Option 1: Mac Studio Deployment (Recommended)
```bash
# Deploy to Mac Studio (192.168.178.213, 48GB, running Ollama)
rsync -avz ~/Desktop/"Claude Code"/llm-gateway/ root@192.168.178.213:/opt/llm-gateway/
ssh root@192.168.178.213 << 'EOF'
cd /opt/llm-gateway
npm install --production=false
npm run build
pm2 reload llm-gateway llm-learning --update-env
pm2 status
EOF
```
### Option 2: Local Port Forward (Dev/Test)
```bash
# Run locally on MacBook Pro, test client SDK fallback to local Ollama
cd ~/Desktop/"Claude Code"/llm-gateway
npm install && npm run build
npm run dev # Start gateway on localhost:3000
# Client SDK tests → local gateway → local Ollama fallback
```
## Phase 2G: Agent Integration (Ready to Begin)
Once Phase 2F is deployed to any infrastructure:
1. **Claude Code integration**@llm-gateway/client → claude-bridge adapter
2. **Codex/Copilot integration** — LSP protocol mapping via gateway
3. **ChatGPT/Claude integration** — API compatibility layer
4. **Learning system activation** — 6h/12h/24h cycles on live traffic
## Erik Recovery Plan
When Erik comes back online:
1. **Verify connectivity**: `ping 82.165.222.127` + `ssh root@82.165.222.127 'uptime'`
2. **Check IONOS status**: Verify no infrastructure incident
3. **Run deployment script** (code already at commit 2ca77d0):
```bash
ssh root@82.165.222.127 << 'EOF'
cd /opt/llm-gateway
git remote set-url origin https://github.com/renefichtmueller/llm-gateway.git # Or use WireGuard
git fetch origin
git reset --hard origin/main
npm install
npm run build
pm2 reload llm-gateway llm-learning --update-env
pm2 status
EOF
```
4. **Health check**: `curl https://llm-gateway.context-x.org/health`