llm-gateway/sync/CURRENT.md
2026-05-12 23:31:02 +02:00

15 KiB

Claude Code Context — 2026-04-29

Last Updated: 2026-04-29 ~20:30 (Session ongoing) Session Type: LLM Gateway / Codex Bridge Handoff Working Directory: /Users/renefichtmueller/Desktop/Claude Code Model: Haiku 4.5 (default), Opus for deep reasoning Context Window: Using lean-ctx MCP for compression


Session Status

Latest Verified State — 2026-05-12 23:30 Europe/Berlin

  • Live hardening and verification completed:

    • GitHub Copilot bridge now binds to loopback by default (127.0.0.1) and reports stable diagnostic health instead of hiding startup/auth failures behind PM2 restarts.
    • The Copilot bridge health now exposes auth_required, host, package, last startup/output, and an explicit warning while COPILOT_API_PACKAGE is still copilot-api@latest.
    • Dashboard Client Coverage now shows bridge provider/runtime state per desktop client, not only local process/install detection.
    • Live /api/dashboard/clients?hours=24 verifies:
      • Codex Desktop / CLI: live, bridge codex ready, callers include codex-cli, codex-live-gateway-check, codex-secure-tunnel-smoke, tokensSaved=4067.
      • Claude Desktop / Claude Code: live, bridge claude-code ready, callers include claude-code-companion, requestCount=28.
      • Microsoft Copilot: local process detected, bridge m365-copilot-bridge remains auth_required until Microsoft Graph/device auth is configured.
      • GitHub Copilot: local process/bridge detected, bridge copilot-bridge remains auth_required until GitHub Copilot device login is completed.
    • Fresh compression proof after deploy:
      • Caller final-repeat-compression-smoke, model qwen2.5:14b.
      • Compression mode ctxlean:verbatim_compact.
      • Tokens 8882 -> 106, saved 8776, savings 98.81%.
    • Gateway public health remains green: /api/dashboard/health returns status=ok, database connected.
  • Operational note:

    • Cloudflare SSH fallback needed explicit Go DNS mode from Codex sandbox: GODEBUG=netdns=go+1 cloudflared access ssh --hostname ssh.context-x.org.
    • Direct SSH to Erik was intermittent/refused during deploy, but Cloudflare SSH with the DNS override completed restart and verification.
  • Companion tool-use adapter added and verified:

    • Anthropic tools are summarized into a strict tool-use adapter instruction for the text backend.
    • OpenAI-style tool_calls or compact JSON tool decisions are converted back to Anthropic tool_use content blocks.
    • Forced tool_choice: {type:"tool"} now returns a valid tool_use block even if the text backend returns an empty response.
    • Streaming tool use emits content_block_start, input_json_delta, content_block_stop, message_delta, and message_stop.
  • Synthetic proof:

    • Non-stream request with read_file returned content[0].type=tool_use, name=read_file, input.path=/tmp/hello.txt.
    • Streaming request returned valid Anthropic SSE tool-use events with partial_json={"path":"/tmp/stream.txt"}.
  • Claude Code text path still works through Companion/Gateway after the tool adapter; latest CLI smoke reached Gateway and dashboard logged claude-code-companion.

  • Remaining quality boundary:

    • Erik /opt/claude-bridge/server.js is text-only (claude --print --output-format text), so native model-driven Anthropic tool parity is still not the same as the hosted Anthropic API.
    • The adapter now supports tool block transport and forced tool calls, but auto tool selection depends on the text backend following the tool JSON instruction.
    • Short exact-answer prompts may still be answered creatively by the subscription bridge; this is provider behavior, not Companion protocol failure.
  • Claude Code full CLI smoke now reaches the local Gateway Companion and public Gateway reliably:

    • Local Companion: 127.0.0.1:11435.
    • Claude env: ANTHROPIC_BASE_URL=http://127.0.0.1:11435, ANTHROPIC_API_KEY=gateway, default Sonnet claude-sonnet-4-6.
    • Verified command returned exact clean result claude-debug10-ok.
    • Dashboard rows show caller claude-code-companion, models claude-sonnet-4-6 and claude-haiku-3, tokens/cost/latency tracked.
  • Fixes applied during verification:

    • Companion clamps Anthropic max_tokens to Gateway limit 16384.
    • Companion emits Anthropic-compatible SSE without double-writing headers.
    • Companion sanitizes OpenAI-style assistant markers and prompt echo before returning to Claude Code.
    • Companion message IDs now include a random suffix to avoid concurrent generate_session_title vs main-request collisions.
    • Gateway live route bypasses response-cache for agentic callers containing claude-code, codex, or copilot; these are still tracked and compression metadata is still recorded.
  • Important boundary:

    • Claude Code text/CLI path is now usable through Gateway and tracked.
    • Full Anthropic tool-use fidelity is still adapter-level, not native Anthropic API parity; current bridge flattens tool requests to text for Gateway routing.
    • Small Claude Code smoke prompts often show compression_mode=none:none because there is no useful token reduction on tiny inputs; larger Codex test already proved ctxlean-rtk savings.
  • Secure bridge architecture is now in place for Gateway-routed subscription access:

    • MacStudio Codex bridge listens on 127.0.0.1:3253.
    • Local M365 bridge listens on 127.0.0.1:3257 but remains auth-required.
    • Cloudflare-Access SSH reverse tunnel exposes only Erik loopback listeners 127.0.0.1:3353 and 127.0.0.1:3357.
    • Gateway live env points CODEX_BRIDGE_URL / OPENAI_CODEX_URL to http://127.0.0.1:3353.
  • End-to-end Codex via Gateway works and is tracked:

    • Caller codex-secure-tunnel-smoke.
    • Model gpt-5.1-codex-mini.
    • Dashboard request row recorded tokens, latency, cost, and compression metadata.
  • New local Codex starts are configured for Gateway:

    • ~/.codex/config.toml default provider llm-gateway, wire_api = "responses", env_key = "LLM_GATEWAY_API_KEY".
    • ~/.zshrc sets OpenAI-compatible Gateway env vars and aliases codex to the Gateway profile.
  • Local Gateway Companion is running on 127.0.0.1:11435 for desktop/CLI clients that need a local endpoint.

    • It forwards OpenAI-compatible calls to https://llm-gateway.context-x.org.
    • It translates Claude/Anthropic /v1/messages text calls to Gateway /v1/chat/completions.
    • Claude Companion smoke with model claude-sonnet-4-6 returned content and was tracked.
  • Claude model alias warning:

    • claude-sonnet-4-1 was stale for current Claude Code bridge behavior and produced empty/failing output.
    • Live Gateway provider metadata was corrected to expose claude-sonnet-4-6.
    • claude-sonnet-4-6, sonnet, or default bridge model works.
  • Remaining auth blockers:

    • GitHub Copilot bridge remains auth_required.
    • M365 Copilot bridge remains auth_required until real Microsoft Graph delegated auth/client config exists.
  • Truth boundary:

    • Gateway can track/compress only requests that enter it before provider execution.
    • Existing native app sessions must be restarted or explicitly configured to use Gateway/Companion.
    • Full Claude Code tool-call translation through Anthropic /v1/messages is not finished; current Companion support is text-compatible and enough for tracking text calls.

Previous Verified State — 2026-05-12

  • Public gateway is reachable:
    • /api/dashboard/health returns ok, database connected.
    • /v1/models returns the configured model list.
    • /v1/chat/completions accepted a live smoke request from caller codex-live-gateway-check and returned gateway-check-ok.
  • Tracking works for requests that actually enter the gateway:
    • Smoke request was recorded in /api/dashboard/requests.
    • 24h metrics showed 8 tracked requests, all routed to qwen2.5:14b.
    • Compression metrics are recorded, but current 24h savings were low: 25 tokens saved across 3 compression operations.
  • Not everything is currently going through the gateway:
    • codex-desktop is marked live because of tracked gateway callers, but the configured MacStudio bridge http://192.168.178.213:3253 was unreachable from Erik during the check.
    • microsoft-copilot is running locally but has 0 gateway requests and the configured MacStudio bridge http://192.168.178.213:3257 was unreachable from Erik.
    • GitHub Copilot bridge is running but returns auth_required.
    • M365 Copilot bridge is running but returns auth_required / missing Microsoft Graph auth.
    • Claude bridge is healthy and ready.
  • Security note:
    • Starting local Codex Bridge on 0.0.0.0 via PM2 was blocked by policy because it would expose local Codex access on the LAN. Use explicit approval plus a narrow network rule or a safer tunnel approach before enabling this persistently.

Active Work

  • Scope: Sync all chat history + context into sync/ handoff folder for Codex integration
  • Repos Modified: llm-gateway (sync/* only, no code changes)
  • Branch: main (no branching, sync/* only)

Current Tasks

  1. Create sync/README.md — handoff format documentation
  2. Create sync/CURRENT.md — this file, current status
  3. Create sync/history/2026-04-29-sync-handoff.md — session snapshot
  4. Git commit sync/* files
  5. Git push to Gitea (origin main)
  6. Notify Codex via handoff mechanism

Blockers

  • None — proceed with autonomous execution (per Memory: no confirmations needed)

Key Context

Projects Active

  1. LLM Gateway (/llm-gateway/)

    • Stack: Fastify TypeScript monorepo (gateway + learning + client + fine-tuner)
    • Live: https://llm-gateway.context-x.org (Port 3103 on Erik)
    • DB: PostgreSQL llm_gateway on Erik (user: llm, pw: llm_secure_2026)
    • Last Deploy: 2026-04-09 (23-dimension request scoring + free LLM fallback chain)
    • Status: Running (PM2 id 19+20)
    • Codex Bridge: New in this session — /copilot-bridge/server.js for Codex integration
  2. Transceiver Intelligence Platform (TIP) (github-repos/transceiver-db/)

    • Live: https://transceiver-db.fichtmueller.org
    • Stack: PostgreSQL 17 + TimescaleDB + Qdrant + Cloudflare R2
    • Features: Real-time pricing, Norton-Bass Hype Cycle, FAQ/KB, MCP Server
    • Blog LLM: claude-bridge provider (switched from Ollama 2026-04-09)
    • Status: Functional
  3. MAGATAMA Security Platform (in planning)

    • Status: S6 SHIN (ShieldX) + S2 TEN (ShieldY) functional
    • Next: S1/S3/S4/S5/S7 planning
    • Obsidian Docs: /Users/renefichtmueller/Documents/ObsidianBrain/projects/magatama/wiki/

Erik / Infrastructure Status

SSH Access

  • Primary: Port 22 (via UFW ALLOW from Rene home IP 83.135.64.79)
  • Backup: Port 2222 (systemd drop-in)
  • WireGuard: jumphost for remote access
  • Serial Console: sossh-rhr.online-server.cloud (IONOS OOB)

Running Services (Erik .82)

  • PostgreSQL 17 (llm_gateway, ctxmeet, others)
  • Proxmox (infrastructure, .10)
  • Ollama (via https://ollama.fichtmueller.org)
  • PM2 Services:
    • id 19+20: LLM Gateway (port 3103)
    • id 41: claude-bridge (port 3250)
    • peercortex (port 3101)
    • ctxevent/nognet (port 3001)
  • ⚠️ ShieldY: Unknown status — 846 restarts on Mac Studio (blocked until fixed)

Security Notes

  • SSH UFW rules: home IP whitelisted (Rule #1, #2 before LIMIT)
  • Backups: Daily to Fearghas (12h, /opt/scripts/daily-backup-fearghas.sh)
  • ⚠️ SFTP: Disabled on Synology (workaround: scp -O legacy mode in backup script)

Changed Files (Uncommitted)

From git status in llm-gateway:

Modified (code changes — NOT STAGED for sync commit):

  • Dockerfile, docker-compose.yaml
  • copilot-bridge/server.js
  • deploy/ecosystem.config.cjs, package-lock.json
  • packages/gateway/package.json, public/dashboard.html
  • packages/gateway/src/config/models.yaml
  • packages/gateway/src/modules/request-logger.ts
  • packages/gateway/src/pipeline/* (3 files)
  • packages/gateway/src/routes/* (3 files)
  • packages/gateway/src/security/tls-config.ts
  • packages/gateway/src/server.ts
  • packages/gateway/src/utils/tokenvault-hooks.ts

Untracked Dirs (NEW):

  • codex-bridge/
  • m365-copilot-bridge/
  • packages/browser-extension/
  • packages/companion/
  • packages/mcp-router/, packages/mcp-server/, packages/mcp-tools/

Untracked Files (DB migrations + modules):

  • 004-semantic-cache.sql, 005-fuzzy-cache.sql, 006-mcp-tool-calls.sql
  • admin-auth.ts, bridge-spawner.ts, caller-detection.ts, caller-stats.ts
  • context-compressor.ts, embedding-client.ts, gamification.ts
  • knowledge-memory.ts, memory-graph.ts, race-leaderboard.ts, race-mode.ts
  • report-generator.ts, response-cache.ts, savings-calculator.ts
  • settings-store.ts, share-card.ts, subscription-discovery.ts
  • subscription-wallet.ts

⚠️ POLICY: Only sync/* files committed/pushed in this session. Code changes staged separately (AFTER code review).


Next Safe Steps (for Codex / Next Claude Session)

Immediate (Safe to Execute)

  1. git add sync/* — stage handoff files only
  2. git commit -m "sync: add chat handoff for Codex integration (2026-04-29)" — commit
  3. git push origin main — push to Gitea

Code Review (After Handoff)

  1. Review copilot-bridge/server.js + new packages/* (code-reviewer agent)
  2. Security scan all new modules (security-reviewer agent)
  3. Stage + commit code changes in separate PR (per development-workflow.md)
  4. Deploy to Erik after approval

Codex Integration

  1. Codex reads this CURRENT.md on session start
  2. Codex continues with code review workflow (not skipping security)
  3. Codex pushes new history entry at session end

Warnings / Blockers

🔴 CRITICAL

  • ShieldY Mac Studio: 846 restarts — MUST FIX before production deployment
    • Issue: Unknown crash pattern
    • Next: Use debug skill to diagnose, then build-fix agent
    • Blocked: MAGATAMA deployment until resolved

🟡 MEDIUM

  • Codex Bridge: New component, needs security scan + testing
  • m365-copilot-bridge: New (untracked), purpose unknown — document + review
  • UFW SSH Rate Limiting: Rene home IP whitelisted, but new IPs could get blocked
    • Workaround: ufw insert 1 allow from <ip> to any port 22

🟢 LOW

  • SFTP disabled on Synology — currently using scp -O workaround (acceptable)
  • Ollama tunnel via Cloudflare (no direct IP) — acceptable for current load

Instructions for Codex / Next Session

On Session Start:

  1. cd /Users/renefichtmueller/Desktop/Claude\ Code/llm-gateway
  2. Read sync/CURRENT.md (this file) — has all context
  3. git status — should show only modifications (code) + untracked (code)
  4. Proceed with code review workflow (DON'T skip security)

On Session End:

  1. Create new sync/history/YYYY-MM-DD-topic.md entry (copy template below)
  2. Update sync/CURRENT.md with new status
  3. git add sync/* && git commit ... && git push (sync/* only)
  4. Code commits handled separately (per development-workflow.md)

History Entry Template:

# Session: [Topic] — 2026-04-DD

**Duration:** HH:MM
**Agent:** Codex / Claude Code Opus
**Status:** ✅ Complete / ⏳ Ongoing / ❌ Blocked

## Achievements
- [ ] Task 1
- [ ] Task 2

## Remaining
- [ ] Task 3 (blockers: X)
- [ ] Task 4 (next: Y)

## Files Changed
- code/* — staged for review
- sync/* — handoff updated

## Context Used
- ~XXX tokens (Haiku / Opus)
- Lean-ctx compression: Y% savings

End of CURRENT.md