10 Commits

Author SHA1 Message Date
Rene Fichtmueller
200cc7f2dc fix: Correct Cloudflare tunnel and setup script to use port 3103
The LLM Gateway is configured to run on port 3103 in ecosystem.config.cjs,
but the Cloudflare tunnel configuration and setup script were referencing port
3100, causing 502 Bad Gateway errors.

Updates:
- cloudflare-tunnel.md: Changed tunnel ingress from localhost:3100 to localhost:3103
- setup-erik.sh: Updated health check URL and output messages to port 3103
- This fixes the Cloudflare tunnel connection that was causing public HTTPS access to fail

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-26 21:04:36 +02:00
Rene Fichtmueller
1d4be52c83 fix: only send HSTS header on HTTPS connections, not HTTP
The learning process was failing to communicate with the gateway because:
1. Gateway was sending 'Strict-Transport-Security' header on HTTP responses
2. Node.js fetch respects HSTS and upgrades subsequent requests to HTTPS
3. Gateway only has HTTP listener (localhost:3103), no HTTPS
4. Result: SSL 'packet length too long' error on second request attempt

Solution: Modified registerHSTSMiddleware to only send HSTS header when
the connection is already secure (HTTPS or x-forwarded-proto: https).
HTTP connections will not get the HSTS header, preventing the forced upgrade.
2026-04-26 19:01:41 +02:00
Rene Fichtmueller
2814fb50b9 fix: correct DATABASE_URL to point to Erik server (82.165.222.127) instead of localhost 2026-04-26 00:24:20 +02:00
Rene Fichtmueller
4c54a6fa92 refactor: MAGATAMA pipeline code quality audit — all functions <50 lines
Complete code quality audit of llm-gateway pipeline modules for MAGATAMA standard compliance (50-line function maximum). All pipeline functions refactored to ensure high cohesion and readability.

Pipeline module compliance (verified):
 llm-client.ts — Refactored callOllama() (58→26 lines) via helper extraction
 instrumented-llm-client.ts — All functions <50 lines (wrapper layer)
 router.ts — Refactored routeByScore() (81→32 lines) via delegation
 request-scorer.ts — 870-line file, all functions <50 lines
 external-providers.ts — All functions <50 lines (49-line max)
 post-validator.ts — All validators <50 lines

Verified:
✓ npm run build (TypeScript, zero errors)
✓ All 6 pipeline modules independently audited
✓ Production-ready for Erik deployment (PM2 ids 19+20, port 3103)

Deployment target: Gitea (192.168.178.196:3000/rene/llm-gateway)
2026-04-25 17:38:11 +02:00
Rene Fichtmueller
128e18b751 feat: integrate GitHub Copilot as third LLM provider via copilot-bridge
Add GitHub Copilot API proxy integration to LLM Gateway:

* Implement copilot-bridge service:
  - HTTP wrapper managing copilot-api (GitHub Copilot API proxy)
  - OpenAI-compatible /v1/chat/completions endpoint (port 3252)
  - Graceful startup and SIGTERM shutdown handling
  - Health check endpoint with service diagnostics

* Register copilot-bridge in provider fallback chain:
  - Position: After OpenAI, before free LLM APIs (tier 4)
  - Rate limit: 60 requests/min (GitHub Copilot API limit)
  - Models: gpt-4 (reasoning), gpt-3.5-turbo (medium)
  - Authentication: GitHub Copilot subscription (internal to copilot-api)

* Update PM2 ecosystem configuration:
  - Add copilot-bridge service definition (port 3252)
  - Configure COPILOT_BRIDGE_URL in gateway environment
  - Add copilot to LLM_PROVIDERS list

* Enhance deployment automation:
  - Update ensure-bridges.sh with copilot-bridge deployment
  - Copy service files from repo to /opt/copilot-bridge
  - Run npm install for copilot-api dependency

* Comprehensive documentation:
  - Expand DEPLOYMENT-BRIDGES.md with copilot-bridge section
  - Prerequisites: Node.js 20+, GitHub Copilot subscription
  - Authentication workflow: npm run auth with GitHub OAuth
  - Troubleshooting: subscription verification, auth cache reset

Provider chain now supports:
1. Ollama (local, free)
2. claude-bridge (Claude subscription)
3. openai-bridge (OpenAI subscription)
4. copilot-bridge (GitHub Copilot subscription) ← NEW
5. Free APIs: Cerebras, Groq, Mistral, NVIDIA, Cloudflare

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-25 12:38:30 +02:00
Rene Fichtmueller
7599f33866 feat: integrate OpenAI Codex and ChatGPT as primary LLM providers via subscription
- Add openai-bridge service (port 3251) for ChatGPT and Codex integration
- Update external-providers.ts with openai and chatgpt provider definitions
- Add GPT-4 Turbo, GPT-4, and GPT-3.5 Turbo models to provider registry
- Modify getApiKey() to handle bridge provider authentication
- Modify getBaseUrl() to construct URLs from env vars
- Update ecosystem.config.cjs with OPENAI_BRIDGE_URL and OPENAI_API_KEY config
- Add openai-bridge PM2 service configuration (port 3251)
- Support both claude-bridge (port 3250) and openai-bridge (port 3251) as subscription services
- Extend fallback chain: claude → openai/chatgpt → cerebras → groq → mistral → nvidia → cloudflare

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-25 12:29:55 +02:00
Rene Fichtmueller
590d3797c9 chore: update ecosystem.config.cjs with claude-bridge and fixed ollama URL
- CLAUDE_BRIDGE_URL: http://localhost:3250
- CLAUDE_BRIDGE_ENABLED: true
- OLLAMA_URL: http://192.168.178.213:11434 (direct IP instead of HTTPS tunnel)
- LLM_PROVIDERS: claude,cerebras,groq,mistral,nvidia
- Add free LLM API keys (empty, to be filled with actual keys)
2026-04-25 12:19:09 +02:00
Rene Fichtmueller
2ca77d0aee feat: Phase 2F — Multi-Agent Integration (ADRs + Client Fallback + Tests)
- ADR-0001: Multi-Agent Coworking Architecture with LLM Gateway Orchestrator
- ADR-0002: Tier Assignment Strategy for Model Selection (cost-first escalation)
- ADR-0003: Confidence Gate Thresholds & Learning Cycle Intervals (6h/12h/24h cycles)
- ADR-0004: External Provider Fallback Chain Ordering (Cerebras → Groq → Mistral)
- Enhanced client SDK: Offline Ollama fallback, health checks, exponential backoff retry
- Integration tests: claude-code-integration.test.ts (14 test cases)
- PHASE_2F_DEPLOYMENT.md: Pre-deployment checklist, automated deploy, rollback plan
- Post-deployment verification procedures for health, client fallback, metrics
2026-04-19 21:39:44 +02:00
Rene Fichtmueller
4c5003f9fc feat: fix OLLAMA_URL to use Cloudflare tunnel + add 35 prompt templates
- Update OLLAMA_URL from 192.168.178.169 to https://ollama.fichtmueller.org
- Fix port from 3100 to 3103 (3100 was taken by Docker proxy on Erik)
- Fix DATABASE_URL password to llm_secure_2026
- Add GITEA_URL env var for ban list sync
- Add 35 prompt templates: TIP (10), EO Global Pulse (8), SwitchBlade (9),
  PeerCortex (3), internal (3), ShieldX (1), general (1)
2026-04-02 23:00:37 +02:00
Rene Fichtmueller
3a00ff4d33 feat: initial llm-gateway implementation
- Complete Fastify gateway with 8-stage pipeline
- Circuit breaker (opossum) per model tier
- Rate limiting per caller
- Ban list validation (EN/DE/auto-detected)
- TIP validator (SFF-8024, part numbers, wavelengths)
- Prometheus metrics
- pg-boss async queue
- PostgreSQL audit log + review queue
- 9 prompt templates (TIP, LinkedIn, ShieldX)
- Learning engine scaffolding
- Auto-learning: ban-list, few-shot, routing, prompt optimizer
2026-04-02 22:48:55 +02:00