llm-gateway/packages/codex-lsp-adapter
Rene Fichtmueller a04c1d67f2 feat: Complete LightRAG Sidecar Phase 2 — Hybrid Retrieval Implementation
2026-04-25 05:47:18 +02:00

Codex LSP Adapter

Language Server Protocol adapter for integrating GitHub Copilot/OpenAI Codex with the LLM Gateway.

Overview

The adapter implements the Language Server Protocol (LSP) so that Codex and Copilot plugins can connect to the LLM Gateway. It bridges LSP's structured JSON-RPC protocol and the gateway's completion API.

Installation

npm install @llm-gateway/codex-lsp-adapter

Usage

As a Language Server

# Start the LSP server (listens on stdio)
npx codex-lsp

Or in Node.js:

import CodexLSPAdapter from '@llm-gateway/codex-lsp-adapter'

const adapter = new CodexLSPAdapter()
adapter.start()
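
Because the server listens on stdio, a client talks to it using the LSP base protocol: each JSON-RPC message is preceded by a Content-Length header and a blank line. The following is a minimal sketch of that framing, not the adapter's own code; frameMessage and parseMessage are illustrative names that do not come from this package.

```javascript
// Sketch of LSP base-protocol framing (Content-Length header + JSON body).
// frameMessage/parseMessage are hypothetical helpers, not exports of the adapter.

function frameMessage(message) {
  const body = Buffer.from(JSON.stringify(message), 'utf8');
  // Content-Length counts bytes, not characters.
  const header = `Content-Length: ${body.length}\r\n\r\n`;
  return Buffer.concat([Buffer.from(header, 'ascii'), body]);
}

function parseMessage(buffer) {
  const headerEnd = buffer.indexOf('\r\n\r\n');
  if (headerEnd === -1) return null; // header not complete yet
  const header = buffer.subarray(0, headerEnd).toString('ascii');
  const match = /Content-Length: (\d+)/i.exec(header);
  if (!match) throw new Error('Missing Content-Length header');
  const length = Number(match[1]);
  const bodyStart = headerEnd + 4;
  if (buffer.length < bodyStart + length) return null; // body not complete yet
  const body = buffer.subarray(bodyStart, bodyStart + length).toString('utf8');
  // Return the decoded message plus any bytes that belong to the next message.
  return { message: JSON.parse(body), rest: buffer.subarray(bodyStart + length) };
}

const initializeRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'initialize',
  params: { processId: null, rootUri: null, capabilities: {} },
};

const framed = frameMessage(initializeRequest);
const parsed = parseMessage(framed);
console.log(parsed.message.method); // prints "initialize"
```

In practice an editor's LSP client library handles this framing; the sketch is only useful when scripting against the server by hand.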

VS Code Extension Configuration

{
  "languageServerHangingPercent": 0,
  "languageServers": {
    "codex": {
      "command": "codex-lsp",
      "args": [],
      "languages": [
        "javascript",
        "typescript",
        "python",
        "go",
        "rust"
      ]
    }
  }
}

Features

Implemented

  • Completions (textDocument/completion): Code completion triggered by '.', space, or '('
  • Hover (textDocument/hover): Hover documentation with code explanation
  • Text Sync: Full document synchronization
  • Execute Commands: codex.explain, codex.refactor, codex.test, codex.fix
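
The commands listed above are invoked through LSP's standard workspace/executeCommand request. A rough sketch of such a request follows; the method name and params layout come from the LSP specification, while buildExecuteCommand and the argument shape (a document URI plus a range) are assumptions made for illustration, since this README does not document what arguments each command expects.

```javascript
// Sketch: building a workspace/executeCommand request for one of the codex.* commands.
// buildExecuteCommand and the argument object are hypothetical.

const SUPPORTED_COMMANDS = ['codex.explain', 'codex.refactor', 'codex.test', 'codex.fix'];

function buildExecuteCommand(id, command, args) {
  if (!SUPPORTED_COMMANDS.includes(command)) {
    throw new Error(`Unsupported command: ${command}`);
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'workspace/executeCommand',
    params: { command, arguments: args },
  };
}

// Assumed argument shape: the document and the selected range to explain.
const request = buildExecuteCommand(7, 'codex.explain', [
  {
    uri: 'file:///tmp/example.js',
    range: { start: { line: 0, character: 0 }, end: { line: 3, character: 0 } },
  },
]);

console.log(JSON.stringify(request, null, 2));
```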

Architecture

The adapter translates LSP requests to gateway completions:

LSP Client (Copilot/IDE)
    ↓
CodexLSPAdapter (stdio bridge)
    ↓
LLM Gateway API
    ↓
Model Selection (Claude, Ollama, external)

Environment Variables

GATEWAY_URL=https://llm-gateway.context-x.org  # LLM Gateway endpoint
OLLAMA_URL=192.168.178.213:11434              # Local Ollama fallback
AGENT_ID=codex-lsp-server                      # Agent identifier
LOG_LEVEL=debug                                # Logging level
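
As an illustration of how these variables might be read at startup, here is a small sketch. loadConfig and normalizeUrl are hypothetical helpers, the 'info' default for LOG_LEVEL is an assumption, and prefixing http:// onto a scheme-less OLLAMA_URL is only one possible way to handle a value such as the one shown above; the adapter's actual behavior is not documented here.

```javascript
// Sketch: reading the adapter's environment variables into a config object.
// Helper names and defaults are assumptions.

function normalizeUrl(value) {
  if (!value) return null;
  // Assume plain HTTP when no scheme is given (e.g. "host:11434").
  return /^https?:\/\//i.test(value) ? value : `http://${value}`;
}

function loadConfig(env) {
  return {
    gatewayUrl: normalizeUrl(env.GATEWAY_URL),
    ollamaUrl: normalizeUrl(env.OLLAMA_URL),
    agentId: env.AGENT_ID || 'codex-lsp-server',
    logLevel: env.LOG_LEVEL || 'info', // assumed default
  };
}

const config = loadConfig({
  GATEWAY_URL: 'https://llm-gateway.context-x.org',
  OLLAMA_URL: '192.168.178.213:11434',
});

console.log(config.ollamaUrl); // prints "http://192.168.178.213:11434"
```

In a real process the call would be loadConfig(process.env); an explicit object is used here so the output is deterministic.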

Protocol Details

Supported Capabilities

{
  textDocumentSync: 1,                          // Full document sync
  completionProvider: {
    resolveProvider: true,
    triggerCharacters: ['.', ' ', '(']
  },
  hoverProvider: true,
  definitionProvider: true,
  codeActionProvider: true,
  executeCommandProvider: {
    commands: [
      'codex.explain',
      'codex.refactor',
      'codex.test',
      'codex.fix'
    ]
  }
}
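
For context, a server advertises these capabilities in its response to the client's initialize request. A minimal sketch of that response is shown below, assuming the capabilities object above is returned unchanged; buildInitializeResult, supportsCommand, and the serverInfo name are illustrative only.

```javascript
// Sketch: wrapping the capabilities in a JSON-RPC initialize response.
// Function names and serverInfo.name are hypothetical.

const capabilities = {
  textDocumentSync: 1, // TextDocumentSyncKind.Full
  completionProvider: {
    resolveProvider: true,
    triggerCharacters: ['.', ' ', '('],
  },
  hoverProvider: true,
  definitionProvider: true,
  codeActionProvider: true,
  executeCommandProvider: {
    commands: ['codex.explain', 'codex.refactor', 'codex.test', 'codex.fix'],
  },
};

function buildInitializeResult(requestId) {
  return {
    jsonrpc: '2.0',
    id: requestId, // must echo the id of the initialize request
    result: { capabilities, serverInfo: { name: 'codex-lsp' } },
  };
}

// Client-side check before sending workspace/executeCommand.
function supportsCommand(initializeResult, command) {
  const provider = initializeResult.result.capabilities.executeCommandProvider;
  return Boolean(provider && provider.commands.includes(command));
}

const response = buildInitializeResult(1);
console.log(supportsCommand(response, 'codex.fix')); // prints true
```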

Response Format

Completion items include:

  • label: First line of completion
  • insertText: Full completion text
  • documentation: Model name and confidence
  • detail: Source (Gateway vs Ollama fallback)
  • kind: CompletionItemKind.Snippet
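
The mapping above could be sketched like this. The shape of the incoming result object (text, model, confidence, source) and the wording of the documentation and detail strings are assumptions; only the five output fields and the Snippet kind are taken from the list above.

```javascript
// Sketch: building an LSP CompletionItem from a model result.
// The input object shape is hypothetical.

const COMPLETION_ITEM_KIND_SNIPPET = 15; // CompletionItemKind.Snippet in the LSP spec

function toCompletionItem(result) {
  const firstLine = result.text.split('\n')[0];
  return {
    label: firstLine,
    insertText: result.text,
    documentation: `Model: ${result.model} (confidence ${result.confidence.toFixed(2)})`,
    detail: result.source === 'ollama' ? 'Ollama fallback' : 'Gateway',
    kind: COMPLETION_ITEM_KIND_SNIPPET,
  };
}

const item = toCompletionItem({
  text: 'for (const x of xs) {\n  console.log(x);\n}',
  model: 'example-model',
  confidence: 0.87,
  source: 'gateway',
});

console.log(item.label); // prints "for (const x of xs) {"
```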

Testing

npm test

Tests cover:

  • LSP initialization and shutdown
  • Completion requests with various triggers
  • Hover information extraction
  • Error handling and fallback behavior
  • Confidence score reporting

Troubleshooting

Server not connecting

  1. Check if the LSP server process is running: pgrep -f codex-lsp (the server uses stdio, so there is no port to inspect)
  2. Verify gateway is accessible: curl https://llm-gateway.context-x.org/health
  3. Check logs: LOG_LEVEL=debug codex-lsp

Slow completions

  1. Reduce maxTokens in completion requests
  2. Check gateway latency: curl -o /dev/null -s -w "%{time_total}\n" https://llm-gateway.context-x.org/health
  3. Verify Ollama is running if using fallback

Poor suggestion quality

  1. Adjust temperature/top_p in gateway requests
  2. Check model selection (may be using fallback)
  3. Provide more context in completion requests

Performance

Typical latencies:

  • Gateway mode: 100-500ms (depends on model)
  • Ollama fallback: 200-2000ms (depends on hardware)
  • Timeout: 30s (configurable)
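
One way the timeout and the Ollama fallback could interact is sketched below, assuming the fallback is tried whenever the gateway call fails or exceeds the timeout. withTimeout and completeWithFallback are hypothetical names, and the two callbacks stand in for the real gateway and Ollama requests.

```javascript
// Sketch: gateway call with a timeout, falling back to Ollama on failure.
// Names and fallback policy are assumptions, not the adapter's confirmed logic.

function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

async function completeWithFallback(callGateway, callOllama, timeoutMs = 30000) {
  try {
    return { source: 'gateway', text: await withTimeout(callGateway(), timeoutMs) };
  } catch (err) {
    // Gateway failed or timed out: try the local fallback once.
    return { source: 'ollama', text: await withTimeout(callOllama(), timeoutMs) };
  }
}

// Stub callbacks simulate an unreachable gateway and a working fallback.
const failingGateway = () => Promise.reject(new Error('gateway unreachable'));
const workingOllama = () => Promise.resolve('return a + b;');

completeWithFallback(failingGateway, workingOllama, 1000).then((result) => {
  console.log(result.source); // prints "ollama"
});
```

If both calls fail, the returned promise rejects with the fallback's error, which the caller would need to turn into an empty completion list or an LSP error response.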

Security

  • LSP communicates over stdio (no network exposure)
  • Gateway API calls use configured authentication
  • Ollama fallback is local-only by default
  • No credentials stored in LSP adapter