sync: note claude code gateway fix
parent ebafb99645
commit ee9c1715ae
@ -10,7 +10,23 @@

## Session Status

-### Latest Verified State — 2026-05-12 22:15 Europe/Berlin
+### Latest Verified State — 2026-05-12 22:55 Europe/Berlin

- Claude Code full CLI smoke now reaches the local Gateway Companion and public Gateway reliably:
  - Local Companion: `127.0.0.1:11435`.
  - Claude env: `ANTHROPIC_BASE_URL=http://127.0.0.1:11435`, `ANTHROPIC_API_KEY=gateway`, default Sonnet `claude-sonnet-4-6`.
  - Verified command returned exact clean result `claude-debug10-ok`.
  - Dashboard rows show caller `claude-code-companion`, models `claude-sonnet-4-6` and `claude-haiku-3`, tokens/cost/latency tracked.
- Fixes applied during verification:
  - Companion clamps Anthropic `max_tokens` to Gateway limit `16384`.
  - Companion emits Anthropic-compatible SSE without double-writing headers.
  - Companion sanitizes OpenAI-style assistant markers and prompt echo before returning to Claude Code.
  - Companion message IDs now include a random suffix to avoid concurrent `generate_session_title` vs main-request collisions.
  - Gateway live route bypasses response-cache for agentic callers containing `claude-code`, `codex`, or `copilot`; these are still tracked and compression metadata is still recorded.
- Important boundary:
  - Claude Code text/CLI path is now usable through Gateway and tracked.
  - Full Anthropic tool-use fidelity is still adapter-level, not native Anthropic API parity; current bridge flattens tool requests to text for Gateway routing.
  - Small Claude Code smoke prompts often show `compression_mode=none:none` because there is no useful token reduction on tiny inputs; larger Codex test already proved `ctxlean-rtk` savings.

- Secure bridge architecture is now in place for Gateway-routed subscription access:
  - MacStudio Codex bridge listens on `127.0.0.1:3253`.

sync/history/2026-05-12-claude-code-gateway-fix.md (normal file, 39 lines added)
@ -0,0 +1,39 @@
# 2026-05-12 — Claude Code Gateway Fix

## Summary

Claude Code CLI now reaches the local Gateway Companion and the public LLM Gateway.

Verified smoke:

- Local endpoint: `ANTHROPIC_BASE_URL=http://127.0.0.1:11435`
- Model: `claude-sonnet-4-6`
- Result: `claude-debug10-ok`
- Gateway dashboard caller: `claude-code-companion`
- Dashboard tracked Sonnet and Haiku rows with tokens, cost, latency, and compression metadata.
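
The smoke setup above could be reproduced with a small wrapper like the one below. This is only a sketch: the notes do not record the exact command or prompt, so `run_smoke`, the `claude -p` / `--model` invocation, and the prompt passed by the caller are assumptions. Only the endpoint, API key placeholder, model name, and expected result come from the notes.

```python
import os
import subprocess

# Values taken from the notes.
COMPANION_URL = "http://127.0.0.1:11435"
MODEL = "claude-sonnet-4-6"
EXPECTED_RESULT = "claude-debug10-ok"


def smoke_env(base=None):
    """Return a copy of the environment that points Claude Code at the Companion."""
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_BASE_URL"] = COMPANION_URL
    env["ANTHROPIC_API_KEY"] = "gateway"
    return env


def run_smoke(prompt, timeout=120):
    """Run one non-interactive prompt and compare the output to the expected string.

    Assumption: the CLI is on PATH as `claude` and is invoked in print mode;
    the actual command used during verification is not recorded in the notes.
    """
    proc = subprocess.run(
        ["claude", "-p", prompt, "--model", MODEL],
        env=smoke_env(),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode == 0 and proc.stdout.strip() == EXPECTED_RESULT
```

Calling `run_smoke` with a prompt that asks for the exact string `claude-debug10-ok` should return `True` only when the Companion on `127.0.0.1:11435` is running and returns clean output.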

## Fixes Applied

- Companion:
  - Anthropic `/v1/messages` translation clamps `max_tokens` to Gateway limit `16384`.
  - Streaming Anthropic responses no longer double-write HTTP headers.
  - OpenAI-style assistant markers and prompt echo are sanitized before returning to Claude Code.
  - Message IDs now include a random suffix to prevent concurrent Claude Code internal requests from colliding.

- Gateway:
  - Response-cache bypass is enabled for agentic callers containing `claude-code`, `codex`, or `copilot`.
  - These callers are still logged and compression metadata is still recorded.
  - This avoids stale semantic-cache answers for coding agents.
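
Four of the fixes above are small enough to sketch in a few lines. The code below is illustrative and is not the Companion or Gateway source: every function and class name is hypothetical, the `msg_` ID layout and the case-insensitive caller match are assumptions, and only the `16384` limit and the three caller substrings come from the notes.

```python
import json
import secrets
import time

GATEWAY_MAX_TOKENS = 16384                            # limit stated in the notes
AGENTIC_MARKERS = ("claude-code", "codex", "copilot")  # substrings stated in the notes


def clamp_max_tokens(body):
    """Return a copy of the request body with max_tokens capped at the Gateway limit."""
    out = dict(body)
    requested = out.get("max_tokens")
    if isinstance(requested, int) and requested > GATEWAY_MAX_TOKENS:
        out["max_tokens"] = GATEWAY_MAX_TOKENS
    return out


def new_message_id():
    """Build a message ID with a random suffix so concurrent requests do not collide.

    Assumption: the real ID format is not given in the notes.
    """
    return f"msg_{int(time.time() * 1000)}_{secrets.token_hex(4)}"


def bypass_response_cache(caller):
    """True when the caller name contains one of the agentic markers.

    Assumption: matching is case-insensitive.
    """
    name = caller.lower()
    return any(marker in name for marker in AGENTIC_MARKERS)


class SseWriter:
    """Write SSE frames to a text stream, emitting the HTTP headers at most once."""

    def __init__(self, out):
        self._out = out
        self._headers_sent = False

    def _write_headers(self):
        if self._headers_sent:
            return  # guard against the double-write described above
        self._out.write(
            "HTTP/1.1 200 OK\r\n"
            "Content-Type: text/event-stream\r\n"
            "Cache-Control: no-cache\r\n\r\n"
        )
        self._headers_sent = True

    def send(self, event, data):
        """Emit one `event:` / `data:` frame terminated by a blank line."""
        self._write_headers()
        self._out.write(f"event: {event}\ndata: {json.dumps(data)}\n\n")
```

Keeping the header guard inside the writer means the translation code can call `send` for each event without tracking whether the response has already started.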

## Verification Evidence

- Public health: `/api/dashboard/health` returned `ok`, database `connected`.
- Latest dashboard rows after the fix:
  - `claude-code-companion`, `claude-sonnet-4-6`, `tokens_in=138`, `tokens_out=19`, latency about `441ms`.
  - `claude-code-companion`, `claude-haiku-3`, title/internal request tracked separately.

## Boundaries

- Claude Code text/CLI path is usable through Gateway and tracked.
- Full native Anthropic tool-use parity is not complete; the Companion still flattens tool-related content into text for Gateway routing.
- Small smoke prompts often show `compression_mode=none:none`; this is expected when there are too few tokens to compress usefully.
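
To make the tool-use boundary concrete, here is one way that flattening Anthropic content blocks into text might look. It is a sketch under assumptions: `flatten_content` and the bracketed tag format are hypothetical and are not taken from the Companion. Only the block types (`text`, `tool_use`, `tool_result`) follow the Anthropic Messages format.

```python
import json


def flatten_content(content):
    """Collapse Anthropic message content (a string or a list of blocks) into plain text."""
    if isinstance(content, str):
        return content
    parts = []
    for block in content:
        kind = block.get("type")
        if kind == "text":
            parts.append(block.get("text", ""))
        elif kind == "tool_use":
            # The structured call becomes a single text line; the tool_use id is dropped.
            args = json.dumps(block.get("input", {}), sort_keys=True)
            parts.append(f"[tool_use {block.get('name')}] {args}")
        elif kind == "tool_result":
            # tool_result content may itself be a string or a list of blocks.
            inner = flatten_content(block.get("content", ""))
            parts.append(f"[tool_result {block.get('tool_use_id')}] {inner}")
        else:
            parts.append(f"[{kind} omitted]")
    return "\n".join(parts)
```

Once a `tool_use` block has been turned into a line of text, the model behind the Gateway no longer sees it as a structured call, which is why this path is adapter-level rather than native parity.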