71 Commits

Author SHA1 Message Date
Rene Fichtmueller
c53e0d2165 docs: rename handovers to human-friendly "Handover 17.05.2026 - <Typ>.md"
HANDOVER-2026-05-17-pointer.md → Handover 17.05.2026 - Gateway.md
  HANDOVER-AGENTS-2026-05-17.md  → Handover 17.05.2026 - Agents Pointer.md

Cross-references in beiden Files aktualisiert auf neue magatama-Filenames.
2026-05-17 16:44:59 +02:00
Rene Fichtmueller
a77995abd3 docs: 2026-05-17 agent handover pointer for Codex + Claude Code
Points at magatama/HANDOVER-AGENTS-2026-05-17.md as the canonical
cross-project agent handover. Keeps the repo-local TL;DR (Layer-3 status,
PM2 env-reload caveat, smoke test, rollback, /proc env verification) so an
agent can act in this repo without leaving it.
2026-05-17 16:18:35 +02:00
Rene Fichtmueller
3f8abc7152 docs: 2026-05-17 handover pointer
Today's only LLM-gateway change was an env edit on Erik:
  INJECTION_DEFENSE_MODE: block → llm_judge
  LLM_JUDGE_MODEL: magatama-coder:judge-r1 → qwen2.5:14b

Backup .bak-<ts>-pre-mode-switch + .bak-<ts>-pre-qwen-judge on Erik.

Master handover lives in the magatama repo. This file is a pointer + smoke
test + rollback recipe for the Layer-3 activation specifically. Includes
the /proc/PID/environ verification step (because pm2 env shows cached
ecosystem.config.js values, not the actual node-process env from the
durable start-with-env.sh wrapper).
2026-05-17 15:20:26 +02:00
Rene Fichtmueller
aa5911bfdf sec(gateway): start-with-env.sh shell wrapper — durable env fix for PM2 quirk
Recurring problem: PM2 ecosystem env vars get dropped on KeepAlive
auto-restart. Has bitten us 3× in one session — defense silently turns
OFF without visible cause.

Fix: PM2 script changed from `./packages/gateway/dist/server.js` to
`./start-with-env.sh` which:
  set -a; source .env.defense; source .env; set +a
  exec node packages/gateway/dist/server.js

Defense env now persists across ANY restart mechanism (manual reload,
KeepAlive crash-restart, pm2 resurrect, system reboot, ...) because
it's loaded at the shell level on every process spawn — independent of
PM2's internal env state.

Verified end-to-end:
  - 4 smoke tests (Layer-1 EN/FR, Layer-2 Roleplay, legit) → all pass
  - kill -9 → KeepAlive respawns → env STILL present → injection STILL
    blocks (HTTP 422)

.env.defense lives at /opt/llm-gateway/.env.defense (chmod 600, not in repo).
.env.defense.example added to repo as template.
2026-05-17 00:51:51 +02:00
Rene Fichtmueller
c731900a90 sec(gateway): Layer-3 llm_judge model now configurable via LLM_JUDGE_MODEL env
Was hardcoded to qwen2.5:3b. Now reads from process.env LLM_JUDGE_MODEL
with qwen2.5:3b fallback.

Production env updated to magatama-coder:judge-r1 — a snapshot of the
magatamallm post-chunk-4 LoRA adapter exported via train.py --export-only.
Chunk-4 picked because it had the best val_loss (0.861) of the 5 balanced
chunks; chunk-5 spiked back to val=2.531.

Sanity test on the new judge model:
  injection prompt -> "INFORMATIONAL"  (not the strict INJECTION word
                                         we'd want — judge needs Phase-2
                                         dedicated fine-tune on binary
                                         classification format)
  safe prompt      -> "SAFE"           (correct)

Implication: INJECTION_DEFENSE_MODE is staying at 'block' for now —
switching to 'llm_judge' mode with this provisional judge would actually
weaken defense because magatamallm's training tilts toward operator-task
output ("here's the fix") rather than binary INJECTION/SAFE classification.

Follow-up (Phase 2): train a dedicated `magatama-judge` model — small base
(Qwen 2.5:1.5b or Phi-3-mini), trained purely on injection-classification
SFT pairs extracted from our existing:
  - llm-security-prompt-injection-2026-05-12.train.jsonl
  - pulso-magatama-injection-guard-2026-05-13.train.jsonl
  - guard-exposure-firewall-verified-2026-05-16.train.jsonl
  - jailbreak-corpus-candidates.jsonl (L1B3RT4S gaps)
  - benign samples from train.jsonl labeled SAFE

Architecture rationale: separation of concerns. Even if attacker manipulates
the primary backbone model, judge stays independent. ~5-10k pairs should
be enough for a focused 1.5B classifier. Training ~2-3h on Mac Studio MPS.
2026-05-16 23:36:26 +02:00
Rene Fichtmueller
f399999e62 sec(gateway): Layer-2 ML classifier — Prompt-Guard sidecar integration
Adds a second defense layer between Layer-1 regex (62 patterns) and the
existing Layer-3 llm_judge. Calls a FastAPI sidecar running on the Mac
Studio (port 9091, MPS) that wraps protectai/deberta-v3-base-prompt-
injection-v2 — public model, no auth needed, ~50-400ms inference.

modules/prompt-guard-client.ts:
  - callPromptGuard(input)        opportunistic, never throws
  - isPromptGuardConfigured()     true if PROMPT_GUARD_URL is set
  - getPromptGuardThreshold()     default 0.85
  - getPromptGuardMinLen()        default 16 chars (skip tiny inputs)

routes/completion.ts:
  - New Layer-2 block between regex scan and llm_judge: when Layer-1
    didn't detect and input is long enough, ask the sidecar. If sidecar
    returns INJECTION with score >= threshold, return HTTP 422 with
    error.prompt_guard payload (score + latency).
  - Fail-open: sidecar timeout/error logs a warning and the request
    falls through to llm_judge / cache / model — never blocks legitimate
    traffic due to sidecar issues.

Env (set in ecosystem.config.js):
  PROMPT_GUARD_URL       http://192.168.178.213:9091
  PROMPT_GUARD_THRESHOLD 0.70  (lowered from 0.85 after empirical testing)
  PROMPT_GUARD_TIMEOUT   1500 ms

Sidecar code lives at:
  ~/magatama-llm/prompt-guard-sidecar/server.py  (Mac Studio)
  launched via ~/Library/LaunchAgents/org.fichtmueller.prompt-guard-sidecar.plist

Smoke tests after deploy:
  Layer-1 caught: German "ignoriere..."          -> HTTP 422
  Layer-2 caught: English "pretend no restrict.."-> HTTP 422 (pg_score 0.9999)
  Layer-2 caught: Bangla-romanized               -> HTTP 422 (Layer-1 actually)
  Benign:        "Explain DNS in 2 sentences"    -> HTTP 200
2026-05-16 23:14:16 +02:00
Rene Fichtmueller
6f5dd81d7a sec(gateway): +15 languages + non-Latin script detector (62 patterns total)
Closes the multilingual bypass gap. Previously covered EN/DE/FR/ES/IT/RU/ZH/JA.
Now also: Bangla, Hindi, Arabic, Hebrew, Persian, Turkish, Vietnamese, Thai,
Korean, Polish, Dutch, Indonesian, Tagalog, Swahili.

Plus a universal non-Latin-script soft-flag pattern (severity=medium) that
catches ≥20 chars of Arabic/Bengali/Devanagari/Hebrew/Thai/Hangul/Han/
Hiragana/Katakana/Cyrillic/Tamil/Telugu/Gujarati/Gurmukhi/Myanmar/Khmer/
Lao/Tibetan/Georgian/Armenian/Sinhala — surfaces in scan result without
auto-blocking, so legitimate non-Latin prompts pass while the operator
can route them to llm_judge for deep inspection.

Pattern-engineering notes:
  - Devanagari / Bengali / Hebrew need optional matra/suffix tolerance
  - Turkish needs \p{L} instead of \w because ı/ş/ç fall outside ASCII \w
  - Persian (SOV) needs both VSO and SOV order alternation
  - Hebrew needs מ/ב/כ/ל preposition prefix tolerance
  - Tagalog needs optional ang/sa article between verb and noun

Smoke-tested 14/14 languages → all HTTP 422 blocked.
Negative-tested 3 benign non-Latin prompts (jp-weather, ar-greeting,
th-thanks) → all HTTP 200 pass. Zero false positives.

Total active patterns: 62 across 6 categories.
2026-05-16 23:02:01 +02:00
Rene Fichtmueller
ac887ab052 sec(gateway): commit injection/output defense modules to git (44 patterns)
These modules have been running in production on Erik but were never
committed to the repo. Recovering + extending after CVE-2026-45321 review.

injection-defense.ts: 44 patterns across 6 categories
  jailbreak (15)    — ignore/disregard/override + DAN/AIM/EvilBOT/Omega
                      personas + fiction-framing + goal-hijack + 5 langs
                      (en, de, fr, es, it, ru, zh, ja) + DAN-style
                      "do anything now"
  role_bypass (3)   — DAN persona, new-system-prompt, pretend-rolemix
  system_prompt_leak (5) — direct reveal, repeat verbatim, starts-with,
                      debug/admin mode invocation, translate-system
  indirect (8)      — fake role tags, embedded user msg, instruction-in-data,
                      ChatML/INST control-token smuggle, RAG-poisoning,
                      Unicode tag chars (E0000-E007F), zero-width chars,
                      fake-section-header override
  exfiltration (6)  — markdown-image, send-data-to, base64-instruction,
                      DNS-exfil, webhook canaries, templated image URLs
  policy (2)        — no-refusal, illegal-content-demand

Activated via INJECTION_DEFENSE_MODE=block in ecosystem.config.js (also
fixed: was overridden by stale pm2 set module_conf.json entry).

Smoke tests:
  - "Ignore all previous instructions"  → 422 blocked
  - "You are now AIM"                   → 422 blocked
  - "Ignorez les instructions"          → 422 blocked
  - "What is 2+2?"                      → 200 passes

output-defense.ts: existing stream-time output filter, kept as-is.
2026-05-16 22:55:08 +02:00
Rene Fichtmueller
b15b8da963 sync: record gateway final hardening 2026-05-12 23:31:02 +02:00
Rene Fichtmueller
c7491406d1 sync: note claude tool adapter 2026-05-12 23:09:58 +02:00
Rene Fichtmueller
ee9c1715ae sync: note claude code gateway fix 2026-05-12 22:56:24 +02:00
Rene Fichtmueller
ebafb99645 sync: note claude alias correction 2026-05-12 22:20:04 +02:00
Rene Fichtmueller
9027484e3d sync: record secure gateway tracking setup 2026-05-12 22:17:33 +02:00
Rene Fichtmueller
c4056701b3 sync: record gateway health check 2026-05-12 21:42:43 +02:00
Rene Fichtmueller
5afc79ea52 fix(gateway): localhost exempt from HTTPS redirect; magatama-infra-health routing
- tls-config.ts: skip HTTP→HTTPS redirect for localhost/127.0.0.1 callers
  so internal services (infra-health, fix-engine) can call via plain HTTP
- routing-rules.yaml: add magatama-infra-health + infra-health to
  ctx_health_diagnose allowed callers; add qwen2.5:3b to fallback chain
2026-05-09 10:33:07 +02:00
Rene Fichtmueller
09165b9bf7 feat: restore workbench v1 and publish wired v2 2026-05-03 09:53:40 +02:00
Rene Fichtmueller
060b846d9b feat: publish llm gateway v2 dashboard alongside restored workbench 2026-05-01 17:43:32 +02:00
Rene Fichtmueller
e272105bcf sync: add chat handoff + context scaffolding for Codex integration (2026-04-29) 2026-04-29 22:48:23 +02:00
Rene Fichtmueller
91384dbb2a fix: SSE stream endpoint with proper HTTP/2 stream handling and heartbeat
- Fixed /api/stream/requests endpoint HTTP/2 INTERNAL_ERROR
- Use reply.raw.writeHead() instead of Fastify headers API for SSE
- Added 30s heartbeat to keep connection alive
- Proper event format with 'event:' and 'data:' fields
- Comprehensive error handling and cleanup on disconnect
- Mirrors working pattern from /api/stream/costs endpoint
- Resolves dashboard perpetual 'Loading...' state
2026-04-26 23:52:13 +02:00
Rene Fichtmueller
200cc7f2dc fix: Correct Cloudflare tunnel and setup script to use port 3103
The LLM Gateway is configured to run on port 3103 in ecosystem.config.cjs,
but the Cloudflare tunnel configuration and setup script were referencing port
3100, causing 502 Bad Gateway errors.

Updates:
- cloudflare-tunnel.md: Changed tunnel ingress from localhost:3100 to localhost:3103
- setup-erik.sh: Updated health check URL and output messages to port 3103
- This fixes the Cloudflare tunnel connection that was causing public HTTPS access to fail

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-26 21:04:36 +02:00
Rene Fichtmueller
255bd90e7e fix: add missing jose dependency for JWT validation 2026-04-26 20:45:05 +02:00
Rene Fichtmueller
d614795545 fix: SQL errors in learning engine for best model selection 2026-04-26 20:42:40 +02:00
Rene Fichtmueller
9808184384 fix: remove non-existent @llm-gateway/types dependency
The @llm-gateway/types package was referenced in dependencies but never
defined or imported. Removing to fix npm installation failures.
2026-04-26 20:39:41 +02:00
Rene Fichtmueller
93bbb44bf0 fix: remove broken @shieldx/core dependency
The @shieldx/core dependency was referenced at an invalid file path and is
not actually used in the codebase (import is commented as TODO). Removing
this dependency resolves npm installation failures on deployment.
2026-04-26 20:36:21 +02:00
Rene Fichtmueller
1d4be52c83 fix: only send HSTS header on HTTPS connections, not HTTP
The learning process was failing to communicate with the gateway because:
1. Gateway was sending 'Strict-Transport-Security' header on HTTP responses
2. Node.js fetch respects HSTS and upgrades subsequent requests to HTTPS
3. Gateway only has HTTP listener (localhost:3103), no HTTPS
4. Result: SSL 'packet length too long' error on second request attempt

Solution: Modified registerHSTSMiddleware to only send HSTS header when
the connection is already secure (HTTPS or x-forwarded-proto: https).
HTTP connections will not get the HSTS header, preventing the forced upgrade.
2026-04-26 19:01:41 +02:00
Rene Fichtmueller
ff090de82b fix: Update request logging to use request_tracking table instead of dashboard_request_log 2026-04-26 00:42:58 +02:00
Rene Fichtmueller
2814fb50b9 fix: correct DATABASE_URL to point to Erik server (82.165.222.127) instead of localhost 2026-04-26 00:24:20 +02:00
Rene Fichtmueller
4c54a6fa92 refactor: MAGATAMA pipeline code quality audit — all functions <50 lines
Complete code quality audit of llm-gateway pipeline modules for MAGATAMA standard compliance (50-line function maximum). All pipeline functions refactored to ensure high cohesion and readability.

Pipeline module compliance (verified):
 llm-client.ts — Refactored callOllama() (58→26 lines) via helper extraction
 instrumented-llm-client.ts — All functions <50 lines (wrapper layer)
 router.ts — Refactored routeByScore() (81→32 lines) via delegation
 request-scorer.ts — 870-line file, all functions <50 lines
 external-providers.ts — All functions <50 lines (49-line max)
 post-validator.ts — All validators <50 lines

Verified:
✓ npm run build (TypeScript, zero errors)
✓ All 6 pipeline modules independently audited
✓ Production-ready for Erik deployment (PM2 ids 19+20, port 3103)

Deployment target: Gitea (192.168.178.196:3000/rene/llm-gateway)
2026-04-25 17:38:11 +02:00
Rene Fichtmueller
b7b85eccba fix: correct copilot-api dependency version to 0.7.0 (published on npm) 2026-04-25 12:41:30 +02:00
Rene Fichtmueller
410bed1f1e docs: update integration status to reflect full provider integration (OpenAI + Copilot) 2026-04-25 12:40:17 +02:00
Rene Fichtmueller
128e18b751 feat: integrate GitHub Copilot as third LLM provider via copilot-bridge
Add GitHub Copilot API proxy integration to LLM Gateway:

* Implement copilot-bridge service:
  - HTTP wrapper managing copilot-api (GitHub Copilot API proxy)
  - OpenAI-compatible /v1/chat/completions endpoint (port 3252)
  - Graceful startup and SIGTERM shutdown handling
  - Health check endpoint with service diagnostics

* Register copilot-bridge in provider fallback chain:
  - Position: After OpenAI, before free LLM APIs (tier 4)
  - Rate limit: 60 requests/min (GitHub Copilot API limit)
  - Models: gpt-4 (reasoning), gpt-3.5-turbo (medium)
  - Authentication: GitHub Copilot subscription (internal to copilot-api)

* Update PM2 ecosystem configuration:
  - Add copilot-bridge service definition (port 3252)
  - Configure COPILOT_BRIDGE_URL in gateway environment
  - Add copilot to LLM_PROVIDERS list

* Enhance deployment automation:
  - Update ensure-bridges.sh with copilot-bridge deployment
  - Copy service files from repo to /opt/copilot-bridge
  - Run npm install for copilot-api dependency

* Comprehensive documentation:
  - Expand DEPLOYMENT-BRIDGES.md with copilot-bridge section
  - Prerequisites: Node.js 20+, GitHub Copilot subscription
  - Authentication workflow: npm run auth with GitHub OAuth
  - Troubleshooting: subscription verification, auth cache reset

Provider chain now supports:
1. Ollama (local, free)
2. claude-bridge (Claude subscription)
3. openai-bridge (OpenAI subscription)
4. copilot-bridge (GitHub Copilot subscription) ← NEW
5. Free APIs: Cerebras, Groq, Mistral, NVIDIA, Cloudflare

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-25 12:38:30 +02:00
Rene Fichtmueller
afe3597311 feat: add automated bridge deployment script and comprehensive deployment guide
- ensure-bridges.sh: Idempotent startup script that deploys openai-bridge if not present
- DEPLOYMENT-BRIDGES.md: Complete deployment guide with setup, configuration, verification, and troubleshooting steps
- Enables autonomous deployment of ChatGPT/Codex bridge service on Erik
- Supports both automatic and manual setup workflows
2026-04-25 12:32:31 +02:00
Rene Fichtmueller
e128d39818 chore: add openai-bridge deployment script for Erik 2026-04-25 12:31:11 +02:00
Rene Fichtmueller
7599f33866 feat: integrate OpenAI Codex and ChatGPT as primary LLM providers via subscription
- Add openai-bridge service (port 3251) for ChatGPT and Codex integration
- Update external-providers.ts with openai and chatgpt provider definitions
- Add GPT-4 Turbo, GPT-4, and GPT-3.5 Turbo models to provider registry
- Modify getApiKey() to handle bridge provider authentication
- Modify getBaseUrl() to construct URLs from env vars
- Update ecosystem.config.cjs with OPENAI_BRIDGE_URL and OPENAI_API_KEY config
- Add openai-bridge PM2 service configuration (port 3251)
- Support both claude-bridge (port 3250) and openai-bridge (port 3251) as subscription services
- Extend fallback chain: claude → openai/chatgpt → cerebras → groq → mistral → nvidia → cloudflare

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-25 12:29:55 +02:00
Rene Fichtmueller
590d3797c9 chore: update ecosystem.config.cjs with claude-bridge and fixed ollama URL
- CLAUDE_BRIDGE_URL: http://localhost:3250
- CLAUDE_BRIDGE_ENABLED: true
- OLLAMA_URL: http://192.168.178.213:11434 (direct IP instead of HTTPS tunnel)
- LLM_PROVIDERS: claude,cerebras,groq,mistral,nvidia
- Add free LLM API keys (empty, to be filled with actual keys)
2026-04-25 12:19:09 +02:00
Rene Fichtmueller
b34b835b47 feat: integrate claude-bridge as primary LLM provider with fallback chain
- Add claude-bridge provider to external-providers.ts with Claude models (opus, sonnet, haiku)
- Modify getApiKey to handle claude-bridge authentication (CLAUDE_BRIDGE_ENABLED flag)
- Update getBaseUrl to construct URL from CLAUDE_BRIDGE_URL environment variable
- Remove Authorization header for claude-bridge (uses subscription-based auth)
- claude-bridge now first in fallback chain: Claude → Cerebras → Groq → Mistral → NVIDIA → Cloudflare
2026-04-25 12:18:33 +02:00
Rene Fichtmueller
f5e2357f20 docs: Add Phase 2 delivery summary and getting started guides
- PHASE_2_DELIVERY.md: Complete delivery summary with all components
- GETTING_STARTED.md: Quick start guide (40 min end-to-end)
- scripts/verify_local_setup.sh: Local environment verification
2026-04-25 05:48:33 +02:00
Rene Fichtmueller
a04c1d67f2 feat: Complete LightRAG Sidecar Phase 2 — Hybrid Retrieval Implementation
Delivers production-ready knowledge graph sidecar with hybrid BM25+vector search.

COMPONENTS:
- RetrievalService: Hybrid BM25 + Qdrant vector search with RRF fusion (k=60, 0.4/0.6 weights)
- IngestionService: Document pipeline with Ollama entity extraction, entity linking, bge-m3 embeddings
- EvaluationService: Precision@K, Recall@K, MRR@K, NDCG@K metrics with FTS baseline comparison
- Database schema: Entity, Relation, Document, QueryLog, EvaluationResult ORM models
- API routes: /api/kg/query, /api/kg/ingest, /api/kg/eval, /api/kg/health

INFRASTRUCTURE:
- FastAPI 0.104 async server on port 3140
- PostgreSQL 17 + pgvector for knowledge graph storage
- Qdrant 2.7 vector database with COSINE distance (384-dim bge-m3)
- Ollama qwen2.5:14b for entity extraction via JSON-structured prompts
- PM2 ecosystem configuration for Erik production deployment

TESTING & DEPLOYMENT:
- TESTING.md: 5-phase local testing workflow with examples
- DEPLOYMENT_CHECKLIST.md: Step-by-step Erik deployment guide
- eval-transceiver-50qa.json: 50 Q&A evaluation pairs for transceiver domain
- populate_eval_set.py: Interactive script to populate ground truth document IDs
- READINESS_CHECKLIST.md: Pre-deployment verification checklist
- bootstrap_tip_data.py: Load TIP blog documents via API

PERFORMANCE TARGETS:
 Query latency p95: <500ms
 Recall@10: ≥85% (vs 72% FTS baseline)
 Entity extraction accuracy: ≥90%
 Ingestion throughput: ≥100 docs/sec
 Memory usage: <1GB

Ready for Phase 3: E2E testing, TypeScript client, multi-domain support.
2026-04-25 05:47:18 +02:00
Rene Fichtmueller
282403d34b feat: Implement Phase 2G.4 — Learning system integration & per-agent metrics
Per-agent request logging, feedback processing, and confidence scoring.

- Per-agent metric collection: request_id, model, latency_ms, tokens_in/out, confidence, fallback_used, success
- Agent feedback loop: outcome tracking (success/fallback/timeout/error/user_rejected)
- Confidence scoring: 50% success + 25% quality + 25% satisfaction (per-agent independent of global)
- Cost attribution: Monthly cost report per agent (tokens × model rate)
- SLO monitoring: p50/p95/p99 latencies vs per-agent targets
- Anomaly detection: σ-based latency spikes, success rate drops, confidence degradation
- Full TypeScript types, database schema initialization, comprehensive documentation
2026-04-19 22:22:17 +02:00
Rene Fichtmueller
1d327720d5 feat: Implement Phase 2G.3 — ChatGPT/OpenAI API compatibility adapter
HTTP server providing OpenAI API compatibility for LLM Gateway.

- OpenAI client SDK drop-in replacement (baseURL only change)
- POST /v1/chat/completions endpoint with streaming support
- GET /v1/models for client library discovery
- Automatic model mapping: gpt-4 → qwen2.5:32b, etc.
- Server-Sent Events (SSE) streaming implementation
- Full TypeScript types and comprehensive test suite
- Graceful shutdown handling (SIGTERM/SIGINT)
- Health check endpoint with gateway status
- Performance: Same as gateway (100-500ms with fallback to Ollama)
2026-04-19 22:05:20 +02:00
Rene Fichtmueller
63171645da feat: Implement Phase 2G.2 — Codex/Copilot LSP adapter
Language Server Protocol bridge for GitHub Copilot and Copilot-compatible editors.

- Implements LSP transport layer (vscode-languageserver)
- Completion with trigger characters: '.', ' ', '('
- Hover documentation with model/confidence metadata
- Code action placeholders for explain/refactor/test/fix
- Automatic fallback to local Ollama (192.168.178.213:11434)
- Full TypeScript types and test coverage
- CLI entry point: codex-lsp (stdio transport)
- Performance: Gateway 100-500ms, Ollama 200-2000ms
2026-04-19 22:04:15 +02:00
Rene Fichtmueller
b943bb1d59 feat: Implement Phase 2G.1 — Claude Code IDE bridge
- Create @llm-gateway/claude-code-bridge package
- Support explain, refactor, test, document, fix commands
- Automatic fallback to local Ollama when gateway unavailable
- Health monitoring and confidence tracking
- Comprehensive test suite covering all completion methods
- Follows ADR-0005 agent integration protocol
2026-04-19 22:02:06 +02:00
Rene Fichtmueller
4d7e251322 feat: Add ADR-0005 for Phase 2G agent integration protocol
- Define three-layer integration stack (transport, adapters, protocol)
- JSON-RPC 2.0 over HTTP for unified agent communication
- Support for Claude Code, Codex/Copilot, ChatGPT, Ollama fallback
- Establishes foundation for Phase 2G multi-agent integration
- Decision on authentication, rate limiting, streaming TBD in implementation
2026-04-19 22:01:17 +02:00
Rene Fichtmueller
8e83e5fa6e chore: Document Phase 2F deployment blocker — Erik unreachable (network issue) 2026-04-19 21:41:12 +02:00
Rene Fichtmueller
2ca77d0aee feat: Phase 2F — Multi-Agent Integration (ADRs + Client Fallback + Tests)
- ADR-0001: Multi-Agent Coworking Architecture with LLM Gateway Orchestrator
- ADR-0002: Tier Assignment Strategy for Model Selection (cost-first escalation)
- ADR-0003: Confidence Gate Thresholds & Learning Cycle Intervals (6h/12h/24h cycles)
- ADR-0004: External Provider Fallback Chain Ordering (Cerebras → Groq → Mistral)
- Enhanced client SDK: Offline Ollama fallback, health checks, exponential backoff retry
- Integration tests: claude-code-integration.test.ts (14 test cases)
- PHASE_2F_DEPLOYMENT.md: Pre-deployment checklist, automated deploy, rollback plan
- Post-deployment verification procedures for health, client fallback, metrics
2026-04-19 21:39:44 +02:00
Rene Fichtmueller
c3ab87b167 feat: add fo-blog-v8 training pipeline (Qwen2.5-14B, SFT+DPO)
Full v8 training pipeline for the optical networking blog model:
- train_blog_v8.py: SFT (LoRA r=64, 5 epochs) + DPO (2 epochs) on Qwen2.5-14B-Instruct
  Fixed for trl 1.2.x: SFTConfig instead of TrainingArguments, processing_class= instead
  of tokenizer=, eval_strategy= instead of deprecated evaluation_strategy=
- consolidate_v8_dataset.py: weighted merge of all data sources (820 effective SFT / 235 DPO)
- crawl_v8_sources.py: APNIC/RIPE Labs/potaroo/Cloudflare crawler with balanced div extraction
- process_v6_blogs.py: converts 101 real v6 TIP blog outputs into SFT + DPO pairs
- label_v7_quality.py: Claude-judged quality labels → v8 quality DPO pairs
- parse_real_posts.py: parses blog.fichtmueller.org Ghost CMS HTML → gold SFT records
- run_v8_pipeline.sh: autopilot (consolidate → SFT → DPO → GGUF → Ollama)
- blog-v8-training.yaml: training config reference

Dataset breakdown: 19 real posts ×3 + 196 v7-gen + 28 v6blogs ×2 + 135 external ×1.5
2026-04-19 11:44:09 +02:00
Rene Fichtmueller
79d434434f chore: MAGATAMA deployment state — download in progress, Pi-hole bypassed 2026-04-16 16:35:54 +02:00
Rene Fichtmueller
2fb0992c71 feat: add MAGATAMA まがたま security intelligence model to LLM Gateway
- Add magatama:32b to models.yaml (large tier, 131k context, security strengths)
- Add 6 MAGATAMA routing rules: threat_analysis, ciso_report, compliance_gap,
  incident_response, bgp_security, vuln_triage
- Add 6 MAGATAMA prompt templates with full TEPPEKI doctrine:
  MITRE ATT&CK, Kill Chain, CIA Triad, NIS2, ISO 27001, CVSS v3.1
- Fine-tuned on Qwen2.5-32B-Instruct with 22831 MAGATAMA security samples
  LoRA adapter: r=8, alpha=16
2026-04-16 14:31:17 +02:00
Rene Fichtmueller
132cf835b0 fix: add fixes 049-050 (OPNsense WireGuard FritzBox NAT + VLAN WAN) 2026-04-13 08:20:06 +02:00
Rene Fichtmueller
c50af63389 feat(ctx-health): add proxmox-pvestatd + opnsense-disk health checks
- Add SSH-based health check for pvestatd D-state detection on Proxmox host
  (heal via cgroup move + lock file removal + reset-failed)
- Add SSH-based disk check for OPNsense VM (threshold 75%, auto-cleanup)
- knowledge/fixes.json: add 48 training fixes including post-reboot DNS
  recovery (fix-046), cloudflared DNS-wait boot fix (fix-047), and
  vzdump load-crash scenario with recovery steps (fix-048)
2026-04-13 05:42:24 +02:00