llm-gateway/sync/CURRENT.md
2026-05-12 22:20:04 +02:00

# Claude Code Context — 2026-04-29
**Last Updated:** 2026-04-29 ~20:30 (Session ongoing)
**Session Type:** LLM Gateway / Codex Bridge Handoff
**Working Directory:** `/Users/renefichtmueller/Desktop/Claude Code`
**Model:** Haiku 4.5 (default), Opus for deep reasoning
**Context Window:** Using lean-ctx MCP for compression

---
## Session Status
### Latest Verified State — 2026-05-12 22:15 Europe/Berlin
- Secure bridge architecture is now in place for Gateway-routed subscription access:
  - MacStudio Codex bridge listens on `127.0.0.1:3253`.
  - Local M365 bridge listens on `127.0.0.1:3257` but remains auth-required.
  - Cloudflare-Access SSH reverse tunnel exposes only Erik loopback listeners `127.0.0.1:3353` and `127.0.0.1:3357`.
  - Gateway live env points `CODEX_BRIDGE_URL` / `OPENAI_CODEX_URL` to `http://127.0.0.1:3353`.
- End-to-end Codex via Gateway works and is tracked:
  - Caller `codex-secure-tunnel-smoke`.
  - Model `gpt-5.1-codex-mini`.
  - Dashboard request row recorded tokens, latency, cost, and compression metadata.
- New local Codex starts are configured for Gateway:
  - `~/.codex/config.toml` default provider `llm-gateway`, `wire_api = "responses"`, `env_key = "LLM_GATEWAY_API_KEY"`.
  - `~/.zshrc` sets OpenAI-compatible Gateway env vars and aliases `codex` to the Gateway profile.
- Local Gateway Companion is running on `127.0.0.1:11435` for desktop/CLI clients that need a local endpoint.
  - It forwards OpenAI-compatible calls to `https://llm-gateway.context-x.org`.
  - It translates Claude/Anthropic `/v1/messages` text calls to Gateway `/v1/chat/completions`.
  - Claude Companion smoke with model `claude-sonnet-4-6` returned content and was tracked.
- Claude model alias warning:
  - `claude-sonnet-4-1` was stale for current Claude Code bridge behavior and produced empty/failing output.
  - Live Gateway provider metadata was corrected to expose `claude-sonnet-4-6`.
  - `claude-sonnet-4-6`, `sonnet`, or default bridge model works.
- Remaining auth blockers:
  - GitHub Copilot bridge remains `auth_required`.
  - M365 Copilot bridge remains `auth_required` until real Microsoft Graph delegated auth/client config exists.
- Truth boundary:
  - Gateway can track/compress only requests that enter it before provider execution.
  - Existing native app sessions must be restarted or explicitly configured to use Gateway/Companion.
  - Full Claude Code tool-call translation through Anthropic `/v1/messages` is not finished; current Companion support is text-compatible and enough for tracking text calls.
### Previous Verified State — 2026-05-12
- Public gateway is reachable:
  - `/api/dashboard/health` returns `ok`, database `connected`.
  - `/v1/models` returns the configured model list.
  - `/v1/chat/completions` accepted a live smoke request from caller `codex-live-gateway-check` and returned `gateway-check-ok`.
- Tracking works for requests that actually enter the gateway:
  - Smoke request was recorded in `/api/dashboard/requests`.
  - 24h metrics showed `8` tracked requests, all routed to `qwen2.5:14b`.
  - Compression metrics are recorded, but current 24h savings were low: `25` tokens saved across `3` compression operations.
- Not everything is currently going through the gateway:
  - `codex-desktop` is marked `live` because of tracked gateway callers, but the configured MacStudio bridge `http://192.168.178.213:3253` was unreachable from Erik during the check.
  - `microsoft-copilot` is running locally but has `0` gateway requests and the configured MacStudio bridge `http://192.168.178.213:3257` was unreachable from Erik.
  - GitHub Copilot bridge is running but returns `auth_required`.
  - M365 Copilot bridge is running but returns `auth_required` / missing Microsoft Graph auth.
  - Claude bridge is healthy and ready.
- Security note:
  - Starting local Codex Bridge on `0.0.0.0` via PM2 was blocked by policy because it would expose local Codex access on the LAN. Use explicit approval plus a narrow network rule or a safer tunnel approach before enabling this persistently.
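
To make the security note concrete, the difference between a loopback bind and `0.0.0.0` can be checked with the standard `ipaddress` module. This is an illustrative sketch only; the helper name `bind_exposure` and its three labels are hypothetical and not part of the gateway or the policy tooling.

```python
import ipaddress


def bind_exposure(host: str) -> str:
    """Classify a bind address as 'loopback', 'all-interfaces', or 'specific'."""
    addr = ipaddress.ip_address(host)
    if addr.is_loopback:
        # Reachable only from the same machine (for example 127.0.0.1).
        return "loopback"
    if addr.is_unspecified:
        # 0.0.0.0 (or ::) listens on every interface, including the LAN.
        return "all-interfaces"
    # A concrete interface address, such as a LAN IP.
    return "specific"


for host in ("127.0.0.1", "0.0.0.0", "192.168.178.213"):
    print(host, bind_exposure(host))
```

A bridge bound to a loopback address can still be reached remotely through a tunnel, which is the approach the later verified state adopts.
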
### Active Work
- **Scope:** Sync all chat history + context into `sync/` handoff folder for Codex integration
- **Repos Modified:** llm-gateway (sync/* only, no code changes)
- **Branch:** main (no branching, sync/* only)
### Current Tasks
1. ✅ Create `sync/README.md` — handoff format documentation
2. ⏳ Create `sync/CURRENT.md` — this file, current status
3. ⏳ Create `sync/history/2026-04-29-sync-handoff.md` — session snapshot
4. ⏳ Git commit sync/* files
5. ⏳ Git push to Gitea (origin main)
6. ⏳ Notify Codex via handoff mechanism
### Blockers
- None — proceed with autonomous execution (per Memory: no confirmations needed)
---
## Key Context
### Projects Active
1. **LLM Gateway** (`/llm-gateway/`)
   - Stack: Fastify TypeScript monorepo (gateway + learning + client + fine-tuner)
   - Live: https://llm-gateway.context-x.org (Port 3103 on Erik)
   - DB: PostgreSQL llm_gateway on Erik (user: llm, pw: llm_secure_2026)
   - Last Deploy: 2026-04-09 (23-dimension request scoring + free LLM fallback chain)
   - Status: ✅ Running (PM2 id 19+20)
   - **Codex Bridge:** New in this session: `/copilot-bridge/server.js` for Codex integration
2. **Transceiver Intelligence Platform (TIP)** (`github-repos/transceiver-db/`)
   - Live: https://transceiver-db.fichtmueller.org
   - Stack: PostgreSQL 17 + TimescaleDB + Qdrant + Cloudflare R2
   - Features: Real-time pricing, Norton-Bass Hype Cycle, FAQ/KB, MCP Server
   - Blog LLM: claude-bridge provider (switched from Ollama 2026-04-09)
   - Status: ✅ Functional
3. **MAGATAMA Security Platform** (in planning)
   - Status: S6 SHIN (ShieldX) + S2 TEN (ShieldY) functional
   - Next: S1/S3/S4/S5/S7 planning
   - Obsidian Docs: `/Users/renefichtmueller/Documents/ObsidianBrain/projects/magatama/wiki/`
---
## Erik / Infrastructure Status
### SSH Access
- **Primary:** Port 22 (via UFW ALLOW from Rene home IP 83.135.64.79)
- **Backup:** Port 2222 (systemd drop-in)
- **WireGuard:** jumphost for remote access
- **Serial Console:** sossh-rhr.online-server.cloud (IONOS OOB)
### Running Services (Erik .82)
- ✅ PostgreSQL 17 (llm_gateway, ctxmeet, others)
- ✅ Proxmox (infrastructure, .10)
- ✅ Ollama (via https://ollama.fichtmueller.org)
- ✅ PM2 Services:
  - id 19+20: LLM Gateway (port 3103)
  - id 41: claude-bridge (port 3250)
  - peercortex (port 3101)
  - ctxevent/nognet (port 3001)
- ⚠️ ShieldY: **Unknown status** — 846 restarts on Mac Studio (blocked until fixed)
### Security Notes
- ✅ SSH UFW rules: home IP whitelisted (Rule #1, #2 before LIMIT)
- ✅ Backups: Daily to Fearghas (12h, `/opt/scripts/daily-backup-fearghas.sh`)
- ⚠️ SFTP: Disabled on Synology (workaround: `scp -O` legacy mode in backup script)
---
## Changed Files (Uncommitted)
From `git status` in llm-gateway:
**Modified (code changes — NOT STAGED for sync commit):**
- Dockerfile, docker-compose.yaml
- copilot-bridge/server.js
- deploy/ecosystem.config.cjs, package-lock.json
- packages/gateway/package.json, public/dashboard.html
- packages/gateway/src/config/models.yaml
- packages/gateway/src/modules/request-logger.ts
- packages/gateway/src/pipeline/* (3 files)
- packages/gateway/src/routes/* (3 files)
- packages/gateway/src/security/tls-config.ts
- packages/gateway/src/server.ts
- packages/gateway/src/utils/tokenvault-hooks.ts
**Untracked Dirs (NEW):**
- codex-bridge/
- m365-copilot-bridge/
- packages/browser-extension/
- packages/companion/
- packages/mcp-router/, packages/mcp-server/, packages/mcp-tools/
**Untracked Files (DB migrations + modules):**
- 004-semantic-cache.sql, 005-fuzzy-cache.sql, 006-mcp-tool-calls.sql
- admin-auth.ts, bridge-spawner.ts, caller-detection.ts, caller-stats.ts
- context-compressor.ts, embedding-client.ts, gamification.ts
- knowledge-memory.ts, memory-graph.ts, race-leaderboard.ts, race-mode.ts
- report-generator.ts, response-cache.ts, savings-calculator.ts
- settings-store.ts, share-card.ts, subscription-discovery.ts
- subscription-wallet.ts
**⚠️ POLICY:** Only `sync/*` files are committed and pushed in this session. Code changes are staged separately, only after code review.

---
## Next Safe Steps (for Codex / Next Claude Session)
### Immediate (Safe to Execute)
1. `git add sync/*`: stage handoff files only
2. `git commit -m "sync: add chat handoff for Codex integration (2026-04-29)"`: commit
3. `git push origin main`: push to Gitea
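
Since the policy is that only `sync/*` may be committed in this session, a small guard run between staging and committing can catch accidental code changes in the index. The following is a sketch under that assumption; the helper names are hypothetical, and it relies only on the standard `git diff --cached --name-only` command, whose paths are relative to the repository root.

```python
import subprocess


def paths_outside_sync(paths: list[str]) -> list[str]:
    """Return the staged paths that are not under the sync/ directory."""
    return [p for p in paths if not p.startswith("sync/")]


def staged_paths() -> list[str]:
    """List the paths currently staged in the git index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. when run outside a repository
    )
    return [line for line in result.stdout.splitlines() if line]


def assert_only_sync_staged() -> None:
    """Abort with a message if anything outside sync/ is staged."""
    offenders = paths_outside_sync(staged_paths())
    if offenders:
        raise SystemExit("Staged outside sync/: " + ", ".join(offenders))
```

Calling `assert_only_sync_staged()` from the repository root after `git add` and before `git commit` would stop the handoff commit if, for example, `copilot-bridge/server.js` had been staged by mistake.
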
### Code Review (After Handoff)
1. Review copilot-bridge/server.js + new packages/* (code-reviewer agent)
2. Security scan all new modules (security-reviewer agent)
3. Stage + commit code changes in separate PR (per development-workflow.md)
4. Deploy to Erik after approval
### Codex Integration
1. Codex reads this CURRENT.md on session start
2. Codex continues with code review workflow (not skipping security)
3. Codex pushes new history entry at session end
---
## Warnings / Blockers
### 🔴 CRITICAL
- **ShieldY Mac Studio:** 846 restarts, MUST FIX before production deployment
  - Issue: Unknown crash pattern
  - Next: Use **debug** skill to diagnose, then **build-fix** agent
  - Blocked: MAGATAMA deployment until resolved
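
As a first diagnostic step, restart counts can be read from the JSON that `pm2 jlist` prints. This sketch assumes ShieldY runs under PM2 on the Mac Studio (the service list suggests it, but this file does not confirm it) and that the restart count sits in `pm2_env.restart_time`; the function name and the threshold of 100 are hypothetical.

```python
import json


def high_restart_processes(jlist_output: str, threshold: int = 100) -> list[tuple[str, int]]:
    """Return (name, restart_count) for processes at or above the threshold."""
    flagged: list[tuple[str, int]] = []
    for proc in json.loads(jlist_output):
        # Assumption: PM2 stores the restart counter under pm2_env.restart_time.
        restarts = proc.get("pm2_env", {}).get("restart_time", 0)
        if restarts >= threshold:
            flagged.append((proc.get("name", "?"), restarts))
    return flagged
```

Feeding it the output of `pm2 jlist` would list the processes whose restart counter warrants a closer look at their logs.
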
### 🟡 MEDIUM
- **Codex Bridge:** New component, needs security scan + testing
- **m365-copilot-bridge:** New (untracked), purpose unknown — document + review
- **UFW SSH Rate Limiting:** Rene home IP whitelisted, but new IPs could get blocked
  - Workaround: `ufw insert 1 allow from <ip> to any port 22`
### 🟢 LOW
- SFTP disabled on Synology — currently using scp -O workaround (acceptable)
- Ollama tunnel via Cloudflare (no direct IP) — acceptable for current load
---
## Instructions for Codex / Next Session
**On Session Start:**
1. `cd /Users/renefichtmueller/Desktop/Claude\ Code/llm-gateway`
2. Read `sync/CURRENT.md` (this file) — has all context
3. `git status` — should show only modifications (code) + untracked (code)
4. Proceed with code review workflow (DON'T skip security)
**On Session End:**
1. Create new `sync/history/YYYY-MM-DD-topic.md` entry (copy template below)
2. Update `sync/CURRENT.md` with new status
3. `git add sync/* && git commit ... && git push` (sync/* only)
4. Code commits handled separately (per development-workflow.md)
**History Entry Template:**
```markdown
# Session: [Topic] - YYYY-MM-DD
**Duration:** HH:MM
**Agent:** Codex / Claude Code Opus
**Status:** ✅ Complete / ⏳ Ongoing / ❌ Blocked
## Achievements
- [ ] Task 1
- [ ] Task 2
## Remaining
- [ ] Task 3 (blockers: X)
- [ ] Task 4 (next: Y)
## Files Changed
- code/* — staged for review
- sync/* — handoff updated
## Context Used
- ~XXX tokens (Haiku / Opus)
- Lean-ctx compression: Y% savings
```
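
The `sync/history/YYYY-MM-DD-topic.md` naming used for history entries can be generated rather than typed by hand. A minimal sketch, assuming topics are reduced to lowercase ASCII slugs as in `2026-04-29-sync-handoff.md`; the function name `history_entry_path` is hypothetical.

```python
import re
from datetime import date


def history_entry_path(day: date, topic: str) -> str:
    """Build sync/history/YYYY-MM-DD-topic.md from a date and a free-form topic."""
    # Lowercase the topic and collapse every run of other characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    if not slug:
        raise ValueError("topic must contain at least one letter or digit")
    return f"sync/history/{day.isoformat()}-{slug}.md"


print(history_entry_path(date(2026, 4, 29), "Sync Handoff"))
```
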
---
**End of CURRENT.md**