sync: note claude code gateway fix

Rene Fichtmueller 2026-05-12 22:56:24 +02:00
parent ebafb99645
commit ee9c1715ae
2 changed files with 56 additions and 1 deletions


@@ -10,7 +10,23 @@
## Session Status
### Latest Verified State — 2026-05-12 22:15 Europe/Berlin
### Latest Verified State — 2026-05-12 22:55 Europe/Berlin
- The full Claude Code CLI smoke test now reliably reaches the local Gateway Companion and the public Gateway:
- Local Companion: `127.0.0.1:11435`.
- Claude env: `ANTHROPIC_BASE_URL=http://127.0.0.1:11435`, `ANTHROPIC_API_KEY=gateway`, default Sonnet `claude-sonnet-4-6`.
- The verified command returned the exact, clean result `claude-debug10-ok`.
- Dashboard rows show caller `claude-code-companion`, models `claude-sonnet-4-6` and `claude-haiku-3`, tokens/cost/latency tracked.
- Fixes applied during verification:
- Companion clamps Anthropic `max_tokens` to Gateway limit `16384`.
- Companion emits Anthropic-compatible SSE without double-writing headers.
- Companion sanitizes OpenAI-style assistant markers and prompt echo before returning to Claude Code.
- Companion message IDs now include a random suffix to avoid concurrent `generate_session_title` vs main-request collisions.
- Gateway live route bypasses response-cache for agentic callers containing `claude-code`, `codex`, or `copilot`; these are still tracked and compression metadata is still recorded.
- Important boundary:
- Claude Code text/CLI path is now usable through Gateway and tracked.
- Full Anthropic tool-use fidelity is still adapter-level, not native Anthropic API parity; current bridge flattens tool requests to text for Gateway routing.
- Small Claude Code smoke prompts often show `compression_mode=none:none` because tiny inputs offer no useful token reduction; a larger Codex test has already demonstrated `ctxlean-rtk` savings.
- Secure bridge architecture is now in place for Gateway-routed subscription access:
- MacStudio Codex bridge listens on `127.0.0.1:3253`.


@@ -0,0 +1,39 @@
# 2026-05-12 — Claude Code Gateway Fix
## Summary
Claude Code CLI now reaches the local Gateway Companion and the public LLM Gateway.
Verified smoke test:
- Local endpoint: `ANTHROPIC_BASE_URL=http://127.0.0.1:11435`
- Model: `claude-sonnet-4-6`
- Result: `claude-debug10-ok`
- Gateway dashboard caller: `claude-code-companion`
- Dashboard tracked Sonnet and Haiku rows with tokens, cost, latency, and compression metadata.
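The smoke setup above can be reproduced by pointing a child process at the local Companion. The following is a minimal sketch, assuming the placeholder key `gateway` recorded in the session status is all the Companion needs; the helper name `companion_env` is hypothetical and not part of the Companion.

```python
import os

# Endpoint taken from the smoke-test notes.
COMPANION_URL = "http://127.0.0.1:11435"


def companion_env(base=None):
    """Return a copy of the environment that routes Claude Code to the Companion.

    `base` defaults to the current process environment; passing a dict
    makes the helper easy to test in isolation.
    """
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_BASE_URL"] = COMPANION_URL
    # Assumption: the Companion accepts this placeholder key, as noted
    # in the session status.
    env["ANTHROPIC_API_KEY"] = "gateway"
    return env
```

The returned dict can be handed to `subprocess.run(..., env=companion_env())` when launching the CLI, so the caller's own shell environment stays untouched.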
## Fixes Applied
- Companion:
- Anthropic `/v1/messages` translation clamps `max_tokens` to Gateway limit `16384`.
- Streaming Anthropic responses no longer double-write HTTP headers.
- OpenAI-style assistant markers and prompt echo are sanitized before returning to Claude Code.
- Message IDs now include a random suffix to prevent concurrent Claude Code internal requests from colliding.
- Gateway:
- Response-cache bypass is enabled for agentic callers containing `claude-code`, `codex`, or `copilot`.
- These callers are still logged and compression metadata is still recorded.
- This avoids stale semantic-cache answers for coding agents.
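To make three of the fixes above concrete, here is a minimal sketch of the clamp, the random-suffix message ID, and the cache-bypass check. It is not the actual Companion or Gateway code: all function names are hypothetical, the ID format is invented for illustration, and the case-insensitive caller match is an assumption.

```python
import secrets

GATEWAY_MAX_TOKENS = 16384  # Gateway limit stated in the notes
AGENTIC_MARKERS = ("claude-code", "codex", "copilot")  # substrings from the notes


def clamp_max_tokens(requested):
    """Clamp an Anthropic-style max_tokens value to the Gateway limit."""
    return min(int(requested), GATEWAY_MAX_TOKENS)


def new_message_id(prefix="msg"):
    """Build a message ID with a random suffix.

    The suffix keeps concurrent requests (for example a title request
    and the main request) from sharing an ID. The format is illustrative.
    """
    return f"{prefix}_{secrets.token_hex(8)}"


def bypass_response_cache(caller):
    """Return True when the caller name contains an agentic marker.

    Assumption: matching is case-insensitive.
    """
    name = caller.lower()
    return any(marker in name for marker in AGENTIC_MARKERS)
```

With these definitions, a request asking for `32000` tokens would be forwarded with `16384`, and the caller `claude-code-companion` would skip the response cache while still being logged by whatever tracking runs around the check.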
## Verification Evidence
- Public health: `/api/dashboard/health` returned `ok`, database `connected`.
- Latest dashboard rows after the fix:
- `claude-code-companion`, `claude-sonnet-4-6`, `tokens_in=138`, `tokens_out=19`, latency about `441ms`.
- `claude-code-companion`, `claude-haiku-3`, title/internal request tracked separately.
## Boundaries
- Claude Code text/CLI path is usable through Gateway and tracked.
- Full native Anthropic tool-use parity is not complete; the Companion still flattens tool-related content into text for Gateway routing.
- Small smoke prompts often show `compression_mode=none:none`; this is expected when there are too few tokens to compress usefully.
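The tool-use boundary is easier to see with an example of what "flattening" can mean. The sketch below is an illustration only, assuming Anthropic-style content blocks (`text`, `tool_use`, `tool_result`); the bracketed text format and the function name `flatten_content` are hypothetical and do not describe the Companion's real output.

```python
import json


def flatten_content(content):
    """Collapse Anthropic-style message content into plain text.

    Structured tool blocks become bracketed text lines, so the structure
    the model would need for native tool use is lost in the process.
    """
    if isinstance(content, str):
        return content
    parts = []
    for block in content:
        kind = block.get("type")
        if kind == "text":
            parts.append(block.get("text", ""))
        elif kind == "tool_use":
            # Serialize the tool arguments deterministically.
            args = json.dumps(block.get("input", {}), sort_keys=True)
            parts.append(f"[tool_use {block.get('name')} {args}]")
        elif kind == "tool_result":
            # A tool result may itself hold a string or nested blocks.
            inner = flatten_content(block.get("content", ""))
            parts.append(f"[tool_result {inner}]")
    return "\n".join(parts)
```

Because the result is a single string, the upstream model sees tool calls as ordinary prose, which is why this path is adapter-level rather than native parity.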