llm-gateway

Author	SHA1	Message	Date
Rene Fichtmueller	200cc7f2dc	fix: Correct Cloudflare tunnel and setup script to use port 3103 The LLM Gateway is configured to run on port 3103 in ecosystem.config.cjs, but the Cloudflare tunnel configuration and setup script were referencing port 3100, causing 502 Bad Gateway errors. Updates: - cloudflare-tunnel.md: Changed tunnel ingress from localhost:3100 to localhost:3103 - setup-erik.sh: Updated health check URL and output messages to port 3103 - This fixes the Cloudflare tunnel connection that was causing public HTTPS access to fail Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2026-04-26 21:04:36 +02:00
Rene Fichtmueller	1d4be52c83	fix: only send HSTS header on HTTPS connections, not HTTP The learning process was failing to communicate with the gateway because: 1. Gateway was sending 'Strict-Transport-Security' header on HTTP responses 2. Node.js fetch respects HSTS and upgrades subsequent requests to HTTPS 3. Gateway only has HTTP listener (localhost:3103), no HTTPS 4. Result: SSL 'packet length too long' error on second request attempt Solution: Modified registerHSTSMiddleware to only send HSTS header when the connection is already secure (HTTPS or x-forwarded-proto: https). HTTP connections will not get the HSTS header, preventing the forced upgrade.	2026-04-26 19:01:41 +02:00
Rene Fichtmueller	2814fb50b9	fix: correct DATABASE_URL to point to Erik server (82.165.222.127) instead of localhost	2026-04-26 00:24:20 +02:00
Rene Fichtmueller	4c54a6fa92	refactor: MAGATAMA pipeline code quality audit — all functions <50 lines Complete code quality audit of llm-gateway pipeline modules for MAGATAMA standard compliance (50-line function maximum). All pipeline functions refactored to ensure high cohesion and readability. Pipeline module compliance (verified): ✅ llm-client.ts — Refactored callOllama() (58→26 lines) via helper extraction ✅ instrumented-llm-client.ts — All functions <50 lines (wrapper layer) ✅ router.ts — Refactored routeByScore() (81→32 lines) via delegation ✅ request-scorer.ts — 870-line file, all functions <50 lines ✅ external-providers.ts — All functions <50 lines (49-line max) ✅ post-validator.ts — All validators <50 lines Verified: ✓ npm run build (TypeScript, zero errors) ✓ All 6 pipeline modules independently audited ✓ Production-ready for Erik deployment (PM2 ids 19+20, port 3103) Deployment target: Gitea (192.168.178.196:3000/rene/llm-gateway)	2026-04-25 17:38:11 +02:00
Rene Fichtmueller	128e18b751	feat: integrate GitHub Copilot as third LLM provider via copilot-bridge Add GitHub Copilot API proxy integration to LLM Gateway: * Implement copilot-bridge service: - HTTP wrapper managing copilot-api (GitHub Copilot API proxy) - OpenAI-compatible /v1/chat/completions endpoint (port 3252) - Graceful startup and SIGTERM shutdown handling - Health check endpoint with service diagnostics * Register copilot-bridge in provider fallback chain: - Position: After OpenAI, before free LLM APIs (tier 4) - Rate limit: 60 requests/min (GitHub Copilot API limit) - Models: gpt-4 (reasoning), gpt-3.5-turbo (medium) - Authentication: GitHub Copilot subscription (internal to copilot-api) * Update PM2 ecosystem configuration: - Add copilot-bridge service definition (port 3252) - Configure COPILOT_BRIDGE_URL in gateway environment - Add copilot to LLM_PROVIDERS list * Enhance deployment automation: - Update ensure-bridges.sh with copilot-bridge deployment - Copy service files from repo to /opt/copilot-bridge - Run npm install for copilot-api dependency * Comprehensive documentation: - Expand DEPLOYMENT-BRIDGES.md with copilot-bridge section - Prerequisites: Node.js 20+, GitHub Copilot subscription - Authentication workflow: npm run auth with GitHub OAuth - Troubleshooting: subscription verification, auth cache reset Provider chain now supports: 1. Ollama (local, free) 2. claude-bridge (Claude subscription) 3. openai-bridge (OpenAI subscription) 4. copilot-bridge (GitHub Copilot subscription) ← NEW 5. Free APIs: Cerebras, Groq, Mistral, NVIDIA, Cloudflare Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2026-04-25 12:38:30 +02:00
Rene Fichtmueller	7599f33866	feat: integrate OpenAI Codex and ChatGPT as primary LLM providers via subscription - Add openai-bridge service (port 3251) for ChatGPT and Codex integration - Update external-providers.ts with openai and chatgpt provider definitions - Add GPT-4 Turbo, GPT-4, and GPT-3.5 Turbo models to provider registry - Modify getApiKey() to handle bridge provider authentication - Modify getBaseUrl() to construct URLs from env vars - Update ecosystem.config.cjs with OPENAI_BRIDGE_URL and OPENAI_API_KEY config - Add openai-bridge PM2 service configuration (port 3251) - Support both claude-bridge (port 3250) and openai-bridge (port 3251) as subscription services - Extend fallback chain: claude → openai/chatgpt → cerebras → groq → mistral → nvidia → cloudflare Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2026-04-25 12:29:55 +02:00
Rene Fichtmueller	590d3797c9	chore: update ecosystem.config.cjs with claude-bridge and fixed ollama URL - CLAUDE_BRIDGE_URL: http://localhost:3250 - CLAUDE_BRIDGE_ENABLED: true - OLLAMA_URL: http://192.168.178.213:11434 (direct IP instead of HTTPS tunnel) - LLM_PROVIDERS: claude,cerebras,groq,mistral,nvidia - Add free LLM API keys (empty, to be filled with actual keys)	2026-04-25 12:19:09 +02:00
Rene Fichtmueller	2ca77d0aee	feat: Phase 2F — Multi-Agent Integration (ADRs + Client Fallback + Tests) - ADR-0001: Multi-Agent Coworking Architecture with LLM Gateway Orchestrator - ADR-0002: Tier Assignment Strategy for Model Selection (cost-first escalation) - ADR-0003: Confidence Gate Thresholds & Learning Cycle Intervals (6h/12h/24h cycles) - ADR-0004: External Provider Fallback Chain Ordering (Cerebras → Groq → Mistral) - Enhanced client SDK: Offline Ollama fallback, health checks, exponential backoff retry - Integration tests: claude-code-integration.test.ts (14 test cases) - PHASE_2F_DEPLOYMENT.md: Pre-deployment checklist, automated deploy, rollback plan - Post-deployment verification procedures for health, client fallback, metrics	2026-04-19 21:39:44 +02:00
Rene Fichtmueller	4c5003f9fc	feat: fix OLLAMA_URL to use Cloudflare tunnel + add 35 prompt templates - Update OLLAMA_URL from 192.168.178.169 to https://ollama.fichtmueller.org - Fix port from 3100 to 3103 (3100 was taken by Docker proxy on Erik) - Fix DATABASE_URL password to llm_secure_2026 - Add GITEA_URL env var for ban list sync - Add 35 prompt templates: TIP (10), EO Global Pulse (8), SwitchBlade (9), PeerCortex (3), internal (3), ShieldX (1), general (1)	2026-04-02 23:00:37 +02:00
Rene Fichtmueller	3a00ff4d33	feat: initial llm-gateway implementation - Complete Fastify gateway with 8-stage pipeline - Circuit breaker (opossum) per model tier - Rate limiting per caller - Ban list validation (EN/DE/auto-detected) - TIP validator (SFF-8024, part numbers, wavelengths) - Prometheus metrics - pg-boss async queue - PostgreSQL audit log + review queue - 9 prompt templates (TIP, LinkedIn, ShieldX) - Learning engine scaffolding - Auto-learning: ban-list, few-shot, routing, prompt optimizer	2026-04-02 22:48:55 +02:00
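
Commit 1d4be52c83 describes sending the Strict-Transport-Security header only when the connection is already secure (HTTPS, or x-forwarded-proto: https behind a proxy). The log does not show the body of registerHSTSMiddleware, so the following is only a minimal sketch of that decision as a pure function. The names IncomingRequest, isSecureRequest and hstsHeaderFor, and the max-age value, are assumptions for illustration and are not taken from the repository.

```typescript
// Hypothetical request shape: only the fields this sketch needs.
interface IncomingRequest {
  protocol: string; // "http" or "https" as seen by the listener
  headers: Record<string, string | string[] | undefined>;
}

// True when the client-facing connection is HTTPS, either directly
// or as reported by a reverse proxy via x-forwarded-proto.
function isSecureRequest(req: IncomingRequest): boolean {
  if (req.protocol === "https") {
    return true;
  }
  const forwarded = req.headers["x-forwarded-proto"];
  const raw = Array.isArray(forwarded) ? forwarded[0] : forwarded;
  // Some proxies send a comma-separated list; the first entry is used here.
  const first = (raw ?? "").split(",")[0] ?? "";
  return first.trim().toLowerCase() === "https";
}

// Returns the HSTS header value for secure requests, or undefined for
// plain HTTP so that no header is set at all. The max-age default is an
// assumed value, not one stated in the commit.
function hstsHeaderFor(
  req: IncomingRequest,
  maxAgeSeconds: number = 31536000
): string | undefined {
  if (!isSecureRequest(req)) {
    return undefined;
  }
  return `max-age=${maxAgeSeconds}; includeSubDomains`;
}
```

In a Fastify hook, a result of undefined would mean skipping the header entirely for that reply, which matches the behaviour the commit message describes for the HTTP-only listener on localhost:3103.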

10 Commits
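
Commit 128e18b751 lists the provider chain as Ollama, claude-bridge, openai-bridge, copilot-bridge, then the free APIs (Cerebras, Groq, Mistral, NVIDIA, Cloudflare). As a rough illustration of how a comma-separated LLM_PROVIDERS value could be put into that order, here is a small sketch. The function name orderProviders, the constant CHAIN_ORDER, and the exact provider identifiers (in particular "ollama" and "cloudflare" as list entries) are assumptions; the gateway's real logic in external-providers.ts is not shown in this log.

```typescript
// Fallback order as described in the commit message. The lowercase
// identifiers are assumed spellings, not confirmed by the source.
const CHAIN_ORDER = [
  "ollama",
  "claude",
  "openai",
  "copilot",
  "cerebras",
  "groq",
  "mistral",
  "nvidia",
  "cloudflare",
] as const;

// Parses a comma-separated provider list (such as the LLM_PROVIDERS
// value) and returns the enabled providers in fallback order.
// Names that are not in CHAIN_ORDER are dropped in this sketch.
function orderProviders(configured: string): string[] {
  const enabled = new Set(
    configured
      .split(",")
      .map((name) => name.trim().toLowerCase())
      .filter((name) => name.length > 0)
  );
  return CHAIN_ORDER.filter((name) => enabled.has(name));
}
```

For example, orderProviders("groq, claude,copilot") yields claude, copilot, groq in that order, so a subscription-backed bridge would be tried before a free API regardless of how the list was written in the config.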