diff --git a/sync/CURRENT.md b/sync/CURRENT.md
index 09a8722..75a1f71 100644
--- a/sync/CURRENT.md
+++ b/sync/CURRENT.md
@@ -10,7 +10,23 @@
 
 ## Session Status
 
-### Latest Verified State — 2026-05-12 22:15 Europe/Berlin
+### Latest Verified State — 2026-05-12 22:55 Europe/Berlin
+
+- Claude Code full CLI smoke now reaches the local Gateway Companion and public Gateway reliably:
+  - Local Companion: `127.0.0.1:11435`.
+  - Claude env: `ANTHROPIC_BASE_URL=http://127.0.0.1:11435`, `ANTHROPIC_API_KEY=gateway`, default Sonnet `claude-sonnet-4-6`.
+  - Verified command returned exact clean result `claude-debug10-ok`.
+  - Dashboard rows show caller `claude-code-companion`, models `claude-sonnet-4-6` and `claude-haiku-3`, tokens/cost/latency tracked.
+- Fixes applied during verification:
+  - Companion clamps Anthropic `max_tokens` to Gateway limit `16384`.
+  - Companion emits Anthropic-compatible SSE without double-writing headers.
+  - Companion sanitizes OpenAI-style assistant markers and prompt echo before returning to Claude Code.
+  - Companion message IDs now include a random suffix to avoid concurrent `generate_session_title` vs main-request collisions.
+  - Gateway live route bypasses response-cache for agentic callers containing `claude-code`, `codex`, or `copilot`; these are still tracked and compression metadata is still recorded.
+- Important boundary:
+  - Claude Code text/CLI path is now usable through Gateway and tracked.
+  - Full Anthropic tool-use fidelity is still adapter-level, not native Anthropic API parity; current bridge flattens tool requests to text for Gateway routing.
+  - Small Claude Code smoke prompts often show `compression_mode=none:none` because there is no useful token reduction on tiny inputs; larger Codex test already proved `ctxlean-rtk` savings.
 
 - Secure bridge architecture is now in place for Gateway-routed subscription access:
   - MacStudio Codex bridge listens on `127.0.0.1:3253`.
diff --git a/sync/history/2026-05-12-claude-code-gateway-fix.md b/sync/history/2026-05-12-claude-code-gateway-fix.md
new file mode 100644
index 0000000..282ea8d
--- /dev/null
+++ b/sync/history/2026-05-12-claude-code-gateway-fix.md
@@ -0,0 +1,39 @@
+# 2026-05-12 — Claude Code Gateway Fix
+
+## Summary
+
+Claude Code CLI now reaches the local Gateway Companion and the public LLM Gateway.
+
+Verified smoke:
+
+- Local endpoint: `ANTHROPIC_BASE_URL=http://127.0.0.1:11435`
+- Model: `claude-sonnet-4-6`
+- Result: `claude-debug10-ok`
+- Gateway dashboard caller: `claude-code-companion`
+- Dashboard tracked Sonnet and Haiku rows with tokens, cost, latency, and compression metadata.
+
+## Fixes Applied
+
+- Companion:
+  - Anthropic `/v1/messages` translation clamps `max_tokens` to Gateway limit `16384`.
+  - Streaming Anthropic responses no longer double-write HTTP headers.
+  - OpenAI-style assistant markers and prompt echo are sanitized before returning to Claude Code.
+  - Message IDs now include a random suffix to prevent concurrent Claude Code internal requests from colliding.
+
+- Gateway:
+  - Response-cache bypass is enabled for agentic callers containing `claude-code`, `codex`, or `copilot`.
+  - These callers are still logged and compression metadata is still recorded.
+  - This avoids stale semantic-cache answers for coding agents.
+
+## Verification Evidence
+
+- Public health: `/api/dashboard/health` returned `ok`, database `connected`.
+- Latest dashboard rows after the fix:
+  - `claude-code-companion`, `claude-sonnet-4-6`, `tokens_in=138`, `tokens_out=19`, latency about `441ms`.
+  - `claude-code-companion`, `claude-haiku-3`, title/internal request tracked separately.
+
+## Boundaries
+
+- Claude Code text/CLI path is usable through Gateway and tracked.
+- Full native Anthropic tool-use parity is not complete; the Companion still flattens tool-related content into text for Gateway routing.
+- Small smoke prompts often show `compression_mode=none:none`; this is expected when there are too few tokens to compress usefully.
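
Reviewer note: three of the fixes recorded in this change (the `max_tokens` clamp, the random message-ID suffix, and the response-cache bypass for agentic callers) are small enough to sketch in a few lines. The Python below is only an illustration of the described behavior, not the Companion or Gateway source. The function names, the `msg_` ID prefix, the 8-byte suffix length, and the case-insensitive caller match are all assumptions; only the limit `16384` and the three caller markers come from the notes.

```python
import secrets

# Values taken from the notes above.
GATEWAY_MAX_TOKENS = 16384
AGENTIC_CALLER_MARKERS = ("claude-code", "codex", "copilot")


def clamp_max_tokens(requested: int, limit: int = GATEWAY_MAX_TOKENS) -> int:
    """Cap an Anthropic-style max_tokens value at the Gateway limit."""
    return min(requested, limit)


def new_message_id(prefix: str = "msg") -> str:
    """Return an ID with a random hex suffix.

    The "msg_<hex>" shape is hypothetical; the notes only say that a
    random suffix was added so concurrent requests do not collide.
    """
    return f"{prefix}_{secrets.token_hex(8)}"


def bypasses_response_cache(caller: str) -> bool:
    """True when the caller name contains one of the agentic markers.

    Lowercasing before the substring check is an assumption.
    """
    name = caller.lower()
    return any(marker in name for marker in AGENTIC_CALLER_MARKERS)
```

With these helpers, a request from `claude-code-companion` asking for 32000 output tokens would be forwarded with `max_tokens=16384` and would skip the response cache, while still being eligible for logging as the notes describe.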