1.7 KiB
1.7 KiB
2026-05-12 — Claude Code Gateway Fix
Summary
Claude Code CLI now reaches the local Gateway Companion and the public LLM Gateway.
Verified smoke:
- Local endpoint:
ANTHROPIC_BASE_URL=http://127.0.0.1:11435 - Model:
claude-sonnet-4-6 - Result:
claude-debug10-ok - Gateway dashboard caller:
claude-code-companion - Dashboard tracked Sonnet and Haiku rows with tokens, cost, latency, and compression metadata.
Fixes Applied
-
Companion:
- Anthropic
/v1/messagestranslation clampsmax_tokensto Gateway limit16384. - Streaming Anthropic responses no longer double-write HTTP headers.
- OpenAI-style assistant markers and prompt echo are sanitized before returning to Claude Code.
- Message IDs now include a random suffix to prevent concurrent Claude Code internal requests from colliding.
- Anthropic
-
Gateway:
- Response-cache bypass is enabled for agentic callers containing
claude-code,codex, orcopilot. - These callers are still logged and compression metadata is still recorded.
- This avoids stale semantic-cache answers for coding agents.
- Response-cache bypass is enabled for agentic callers containing
Verification Evidence
- Public health:
/api/dashboard/healthreturnedok, databaseconnected. - Latest dashboard rows after the fix:
claude-code-companion,claude-sonnet-4-6,tokens_in=138,tokens_out=19, latency about441ms.claude-code-companion,claude-haiku-3, title/internal request tracked separately.
Boundaries
- Claude Code text/CLI path is usable through Gateway and tracked.
- Full native Anthropic tool-use parity is not complete; the Companion still flattens tool-related content into text for Gateway routing.
- Small smoke prompts often show
compression_mode=none:none; this is expected when there are too few tokens to compress usefully.