Compare commits
No commits in common. "282403d34bfa672fc1d5517bc098e74f4a4b36e4" and "8e83e5fa6e69363b2611962b17e4733f87ff607b" have entirely different histories.
282403d34b
...
8e83e5fa6e
@ -1,143 +0,0 @@
|
|||||||
# ADR-0005: Multi-Agent Integration Protocol
|
|
||||||
|
|
||||||
**Date**: 2026-04-19
|
|
||||||
**Status**: accepted
|
|
||||||
**Deciders**: Rene (Architecture, Phase 2G)
|
|
||||||
|
|
||||||
## Context
|
|
||||||
|
|
||||||
Phase 2F established the LLM Gateway as a central orchestrator with a TypeScript client SDK. Phase 2G must integrate multiple AI agents:
|
|
||||||
- **Claude Code** (Anthropic CLI) — native client SDK (@llm-gateway/client)
|
|
||||||
- **Codex/Copilot** (Microsoft) — LSP protocol (Language Server Protocol)
|
|
||||||
- **ChatGPT** (OpenAI) — REST API
|
|
||||||
- **Ollama** (local inference) — HTTP API (fallback)
|
|
||||||
|
|
||||||
Each agent has different capabilities and communication patterns. We need a unified protocol that:
|
|
||||||
1. Abstracts gateway complexity from agents
|
|
||||||
2. Supports synchronous and asynchronous operations
|
|
||||||
3. Handles streaming responses (code generation, token-by-token)
|
|
||||||
4. Manages authentication and rate limiting per agent
|
|
||||||
5. Provides graceful fallback when gateway is unavailable
|
|
||||||
|
|
||||||
## Decision
|
|
||||||
|
|
||||||
Implement a **three-layer agent integration stack**:
|
|
||||||
|
|
||||||
### Layer 1: Transport (HTTP/WebSocket)
|
|
||||||
- **Core**: Fastify endpoints in LLM Gateway
|
|
||||||
- **Endpoints**:
|
|
||||||
- `POST /agents/{agent-id}/completion` — synchronous completion
|
|
||||||
- `GET /agents/{agent-id}/completion?stream=true` — SSE stream
|
|
||||||
- `POST /agents/{agent-id}/validate` — prompt validation
|
|
||||||
- `GET /agents/{agent-id}/status` — health check
|
|
||||||
|
|
||||||
### Layer 2: Agent Adapters
|
|
||||||
- **Claude Code Adapter**: Node.js module wrapping `@llm-gateway/client`
|
|
||||||
- **Codex/Copilot Adapter**: LSP server that forwards requests to gateway HTTP API
|
|
||||||
- **ChatGPT Adapter**: REST API wrapper that translates OpenAI format → Gateway format
|
|
||||||
- **Ollama Adapter**: HTTP proxy that handles local fallback (already implemented in client SDK)
|
|
||||||
|
|
||||||
### Layer 3: Protocol Format
|
|
||||||
Use JSON-RPC 2.0 over HTTP/WebSocket:
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"jsonrpc": "2.0",
|
|
||||||
"method": "completion",
|
|
||||||
"params": {
|
|
||||||
"prompt": "...",
|
|
||||||
"model": "claude-3.5-sonnet",
|
|
||||||
"temperature": 0.7,
|
|
||||||
"max_tokens": 2000,
|
|
||||||
"agent_id": "claude-code"
|
|
||||||
},
|
|
||||||
"id": 1
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Alternatives Considered
|
|
||||||
|
|
||||||
### Alternative 1: Separate Gateway Instance per Agent
|
|
||||||
- **Pros**: Complete isolation, agent-specific customization
|
|
||||||
- **Cons**: Operational overhead, duplicate infrastructure, no shared learning
|
|
||||||
- **Why not**: Contradicts Phase 2F goal of central orchestration
|
|
||||||
|
|
||||||
### Alternative 2: Agent-Specific Protocols (No Normalization)
|
|
||||||
- **Pros**: Native protocol support for each agent
|
|
||||||
- **Cons**: Gateway becomes protocol translator, complexity explosion
|
|
||||||
- **Why not**: Gateway becomes a reverse proxy instead of an orchestrator
|
|
||||||
|
|
||||||
### Alternative 3: Message Queue (RabbitMQ/Kafka)
|
|
||||||
- **Pros**: Decouples agents from gateway, supports async workflows
|
|
||||||
- **Cons**: Added infrastructure, latency for synchronous operations
|
|
||||||
- **Why not**: Overkill for initial integration; add later if async workflows needed
|
|
||||||
|
|
||||||
## Consequences
|
|
||||||
|
|
||||||
### Positive
|
|
||||||
- **Single integration point**: Agents connect to gateway, not directly to models
|
|
||||||
- **Shared learning**: All agents benefit from confidence gating and model selection
|
|
||||||
- **Graceful degradation**: Agents fall back to local Ollama independently
|
|
||||||
- **Extensible**: New agents added by implementing adapter layer only
|
|
||||||
|
|
||||||
### Negative
|
|
||||||
- **Latency**: Additional HTTP round-trip for each request (vs. direct model call)
|
|
||||||
- **Adapter maintenance**: Each agent needs an adapter; breaks if agent API changes
|
|
||||||
- **Protocol overhead**: JSON-RPC adds overhead vs. direct integration
|
|
||||||
|
|
||||||
### Risks
|
|
||||||
- **Claude Code integration risk**: Requires subprocess communication with `claude` CLI
|
|
||||||
- **Mitigation**: claude-bridge already demonstrates working pattern
|
|
||||||
- **Codex integration risk**: Microsoft LSP server not directly compatible with HTTP
|
|
||||||
- **Mitigation**: Implement thin LSP-to-HTTP translation layer
|
|
||||||
|
|
||||||
## Implementation Plan
|
|
||||||
|
|
||||||
### Phase 2G.1: Claude Code Integration (Week 1)
|
|
||||||
```bash
|
|
||||||
# Extend @llm-gateway/client with agent metadata
|
|
||||||
createTIPClient({
|
|
||||||
agentId: 'claude-code',
|
|
||||||
fallback: { ollamaUrl: '192.168.178.213:11434' }
|
|
||||||
})
|
|
||||||
```
|
|
||||||
|
|
||||||
### Phase 2G.2: Codex/Copilot (Week 2)
|
|
||||||
```bash
|
|
||||||
# Implement LSP server wrapper
|
|
||||||
npm install -D @types/node-lsp-server
|
|
||||||
# Create packages/lsp-adapter/
|
|
||||||
# - Implements LSP protocol
|
|
||||||
# - Translates completion requests to HTTP
|
|
||||||
```
|
|
||||||
|
|
||||||
### Phase 2G.3: ChatGPT Integration (Week 3)
|
|
||||||
```bash
|
|
||||||
# OpenAI API compatibility layer
|
|
||||||
# POST /agents/chatgpt/chat/completions
|
|
||||||
# → Translate to gateway completion format
|
|
||||||
```
|
|
||||||
|
|
||||||
### Phase 2G.4: Learning Integration (Week 4)
|
|
||||||
```bash
|
|
||||||
# Connect agent-specific metrics to learning engine
|
|
||||||
# - Track per-agent accuracy, token usage, latency
|
|
||||||
# - Auto-select models per agent based on performance
|
|
||||||
```
|
|
||||||
|
|
||||||
## Open Questions
|
|
||||||
|
|
||||||
1. **Authentication**: How do agents authenticate with gateway?
|
|
||||||
- Option A: API keys per agent
|
|
||||||
- Option B: OAuth2 with OIDC
|
|
||||||
- Option C: mTLS for local agents, keys for remote
|
|
||||||
- **Decision pending**: TBD in Phase 2G.1
|
|
||||||
|
|
||||||
2. **Rate Limiting**: Per-agent or global quota?
|
|
||||||
- Option A: Per-agent limits (Claude Code = 100 req/min)
|
|
||||||
- Option B: Global pool shared across agents
|
|
||||||
- **Decision pending**: Depends on learning system usage patterns
|
|
||||||
|
|
||||||
3. **Response Format**: Streaming vs. buffered?
|
|
||||||
- Option A: Always stream (SSE)
|
|
||||||
- Option B: Support both (`?stream=true/false`)
|
|
||||||
- **Decision pending**: Codex/Copilot compatibility check needed
|
|
||||||
@ -6,4 +6,3 @@
|
|||||||
| [0002](0002-tier-assignment-strategy.md) | Tier Assignment Strategy for Model Selection | accepted | 2026-04-19 |
|
| [0002](0002-tier-assignment-strategy.md) | Tier Assignment Strategy for Model Selection | accepted | 2026-04-19 |
|
||||||
| [0003](0003-confidence-gate-thresholds.md) | Confidence Gate Thresholds & Learning Cycle Intervals | accepted | 2026-04-19 |
|
| [0003](0003-confidence-gate-thresholds.md) | Confidence Gate Thresholds & Learning Cycle Intervals | accepted | 2026-04-19 |
|
||||||
| [0004](0004-external-fallback-chain.md) | External Provider Fallback Chain Ordering | accepted | 2026-04-19 |
|
| [0004](0004-external-fallback-chain.md) | External Provider Fallback Chain Ordering | accepted | 2026-04-19 |
|
||||||
| [0005](0005-agent-integration-protocol.md) | Multi-Agent Integration Protocol & Adapters | accepted | 2026-04-19 |
|
|
||||||
|
|||||||
@ -1,262 +0,0 @@
|
|||||||
# ChatGPT API Adapter
|
|
||||||
|
|
||||||
OpenAI API compatibility adapter for LLM Gateway. Allows OpenAI client SDKs and curl requests to transparently use LLM Gateway.
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
Provides an HTTP server that implements the OpenAI Chat Completions API specification, transparently routing requests to the LLM Gateway. Existing OpenAI client code requires only a baseURL configuration change.
|
|
||||||
|
|
||||||
## Installation
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npm install @llm-gateway/chatgpt-api-adapter
|
|
||||||
```
|
|
||||||
|
|
||||||
## Usage
|
|
||||||
|
|
||||||
### As a Standalone Server
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Start the adapter (listens on port 3111)
|
|
||||||
npx chatgpt-api
|
|
||||||
|
|
||||||
# Or with custom port
|
|
||||||
CHATGPT_API_PORT=8080 npx chatgpt-api
|
|
||||||
|
|
||||||
# Or in Node.js
|
|
||||||
import ChatGPTAPIAdapter from '@llm-gateway/chatgpt-api-adapter'
|
|
||||||
|
|
||||||
const adapter = new ChatGPTAPIAdapter(3111)
|
|
||||||
await adapter.start()
|
|
||||||
```
|
|
||||||
|
|
||||||
### With OpenAI Client SDK
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
import OpenAI from 'openai'
|
|
||||||
|
|
||||||
const client = new OpenAI({
|
|
||||||
apiKey: 'not-needed',
|
|
||||||
baseURL: 'http://localhost:3111/v1'
|
|
||||||
})
|
|
||||||
|
|
||||||
const response = await client.chat.completions.create({
|
|
||||||
model: 'gpt-4',
|
|
||||||
messages: [
|
|
||||||
{ role: 'user', content: 'Hello, world!' }
|
|
||||||
]
|
|
||||||
})
|
|
||||||
```
|
|
||||||
|
|
||||||
### With curl
|
|
||||||
|
|
||||||
```bash
|
|
||||||
curl http://localhost:3111/v1/chat/completions \
|
|
||||||
-H "Content-Type: application/json" \
|
|
||||||
-d '{
|
|
||||||
"model": "gpt-4",
|
|
||||||
"messages": [
|
|
||||||
{"role": "user", "content": "Explain TypeScript"}
|
|
||||||
],
|
|
||||||
"max_tokens": 500
|
|
||||||
}'
|
|
||||||
```
|
|
||||||
|
|
||||||
### Streaming
|
|
||||||
|
|
||||||
```bash
|
|
||||||
curl http://localhost:3111/v1/chat/completions \
|
|
||||||
-H "Content-Type: application/json" \
|
|
||||||
-d '{
|
|
||||||
"model": "gpt-4",
|
|
||||||
"messages": [
|
|
||||||
{"role": "user", "content": "List 5 ideas"}
|
|
||||||
],
|
|
||||||
"stream": true
|
|
||||||
}'
|
|
||||||
```
|
|
||||||
|
|
||||||
## Features
|
|
||||||
|
|
||||||
### Implemented
|
|
||||||
|
|
||||||
- **Chat Completions** (`POST /v1/chat/completions`): Full OpenAI API compatibility
|
|
||||||
- **Streaming** (`stream: true`): Server-Sent Events (SSE) with chunked responses
|
|
||||||
- **Models** (`GET /v1/models`): Lists available GPT models
|
|
||||||
- **Health** (`GET /health`): Gateway health status
|
|
||||||
- **Model Mapping**: Automatic mapping from OpenAI to gateway model names
|
|
||||||
|
|
||||||
### Model Mapping
|
|
||||||
|
|
||||||
| OpenAI Model | Gateway Model |
|
|
||||||
|--------------|---------------|
|
|
||||||
| gpt-4 | qwen2.5:32b |
|
|
||||||
| gpt-4-turbo | qwen2.5:32b |
|
|
||||||
| gpt-3.5-turbo | qwen2.5:14b |
|
|
||||||
| gpt-4-mini | qwen2.5:3b |
|
|
||||||
|
|
||||||
## Architecture
|
|
||||||
|
|
||||||
```
|
|
||||||
OpenAI Client
|
|
||||||
↓
|
|
||||||
ChatGPT API Adapter (HTTP server)
|
|
||||||
↓
|
|
||||||
LLM Gateway API
|
|
||||||
↓
|
|
||||||
Model Selection (claude, Ollama, external)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Environment Variables
|
|
||||||
|
|
||||||
```bash
|
|
||||||
CHATGPT_API_PORT=3111 # Listen port
|
|
||||||
GATEWAY_URL=https://llm-gateway.context-x.org # LLM Gateway endpoint
|
|
||||||
OLLAMA_URL=192.168.178.213:11434 # Local Ollama fallback
|
|
||||||
AGENT_ID=chatgpt-api-adapter # Agent identifier
|
|
||||||
LOG_LEVEL=debug # Logging level
|
|
||||||
```
|
|
||||||
|
|
||||||
## API Endpoints
|
|
||||||
|
|
||||||
### POST /v1/chat/completions
|
|
||||||
|
|
||||||
Chat completion request using OpenAI format.
|
|
||||||
|
|
||||||
**Request:**
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"model": "gpt-4",
|
|
||||||
"messages": [
|
|
||||||
{"role": "system", "content": "You are helpful..."},
|
|
||||||
{"role": "user", "content": "Hello"}
|
|
||||||
],
|
|
||||||
"temperature": 0.7,
|
|
||||||
"max_tokens": 2000,
|
|
||||||
"top_p": 1,
|
|
||||||
"stream": false
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Response (non-streaming):**
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"id": "chatcmpl-123",
|
|
||||||
"object": "chat.completion",
|
|
||||||
"created": 1234567890,
|
|
||||||
"model": "gpt-4",
|
|
||||||
"choices": [
|
|
||||||
{
|
|
||||||
"index": 0,
|
|
||||||
"message": {
|
|
||||||
"role": "assistant",
|
|
||||||
"content": "Hello! How can I help?"
|
|
||||||
},
|
|
||||||
"finish_reason": "stop"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"usage": {
|
|
||||||
"prompt_tokens": 10,
|
|
||||||
"completion_tokens": 5,
|
|
||||||
"total_tokens": 15
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Response (streaming):**
|
|
||||||
```
|
|
||||||
data: {"id":"chatcmpl-123","object":"text_completion.chunk","created":1234567890,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"H"},"finish_reason":null}]}
|
|
||||||
data: {"id":"chatcmpl-123","object":"text_completion.chunk","created":1234567890,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"ello"},"finish_reason":null}]}
|
|
||||||
...
|
|
||||||
data: {"id":"chatcmpl-123","object":"text_completion.chunk","created":1234567890,"model":"gpt-4","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
|
|
||||||
data: [DONE]
|
|
||||||
```
|
|
||||||
|
|
||||||
### GET /v1/models
|
|
||||||
|
|
||||||
List available models.
|
|
||||||
|
|
||||||
**Response:**
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"object": "list",
|
|
||||||
"data": [
|
|
||||||
{"id": "gpt-4", "object": "model", "owned_by": "openai"},
|
|
||||||
{"id": "gpt-4-turbo", "object": "model", "owned_by": "openai"},
|
|
||||||
{"id": "gpt-3.5-turbo", "object": "model", "owned_by": "openai"},
|
|
||||||
{"id": "gpt-4-mini", "object": "model", "owned_by": "openai"}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### GET /health
|
|
||||||
|
|
||||||
Gateway health status.
|
|
||||||
|
|
||||||
**Response:**
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"status": "ok",
|
|
||||||
"gateway": {
|
|
||||||
"uptime": 123456,
|
|
||||||
"models": ["qwen2.5:3b", "qwen2.5:14b"],
|
|
||||||
"latency_ms": 250
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Performance
|
|
||||||
|
|
||||||
Typical latencies:
|
|
||||||
- **Gateway mode**: 100-500ms (depends on model)
|
|
||||||
- **Ollama fallback**: 200-2000ms (depends on hardware)
|
|
||||||
- **Streaming chunk**: 10-50ms per chunk
|
|
||||||
- **Timeout**: 30s (configurable via gateway)
|
|
||||||
|
|
||||||
## Testing
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npm test
|
|
||||||
```
|
|
||||||
|
|
||||||
Tests cover:
|
|
||||||
- Chat completions (streaming and buffered)
|
|
||||||
- Model listing
|
|
||||||
- Error handling and fallback behavior
|
|
||||||
- Token counting accuracy
|
|
||||||
- Message formatting
|
|
||||||
- Health checks
|
|
||||||
|
|
||||||
## Security
|
|
||||||
|
|
||||||
- No API key validation (assumes network-isolated deployment)
|
|
||||||
- CORS enabled for all origins (configure as needed)
|
|
||||||
- Messages logged at DEBUG level only
|
|
||||||
- Automatic cleanup on shutdown (SIGTERM, SIGINT)
|
|
||||||
|
|
||||||
## Troubleshooting
|
|
||||||
|
|
||||||
### OpenAI client not connecting
|
|
||||||
|
|
||||||
1. Verify adapter is running: `curl http://localhost:3111/health`
|
|
||||||
2. Check baseURL in client: should be `http://localhost:3111/v1` (no `/v1` at end)
|
|
||||||
3. Ensure gateway is accessible: `curl $GATEWAY_URL/health`
|
|
||||||
|
|
||||||
### Streaming not working
|
|
||||||
|
|
||||||
1. Verify `stream: true` in request body
|
|
||||||
2. Check for SSE support in client library
|
|
||||||
3. Ensure no intermediate proxies are buffering responses
|
|
||||||
|
|
||||||
### Slow responses
|
|
||||||
|
|
||||||
1. Check gateway latency: `curl -w "%{time_total}\n" $GATEWAY_URL/health`
|
|
||||||
2. Verify model availability: `curl http://localhost:3111/v1/models`
|
|
||||||
3. Check system resources on gateway (CPU, memory, disk)
|
|
||||||
|
|
||||||
## Compatibility
|
|
||||||
|
|
||||||
- OpenAI Client SDK (Python, Node.js, Go, etc.)
|
|
||||||
- LiteLLM
|
|
||||||
- Anthropic Bedrock (proxy mode)
|
|
||||||
- Any HTTP client using OpenAI API format
|
|
||||||
@ -1,36 +0,0 @@
|
|||||||
{
|
|
||||||
"name": "@llm-gateway/chatgpt-api-adapter",
|
|
||||||
"version": "1.0.0",
|
|
||||||
"description": "OpenAI API compatibility adapter for LLM Gateway",
|
|
||||||
"type": "module",
|
|
||||||
"main": "dist/index.js",
|
|
||||||
"bin": {
|
|
||||||
"chatgpt-api": "dist/cli.js"
|
|
||||||
},
|
|
||||||
"scripts": {
|
|
||||||
"build": "tsc",
|
|
||||||
"dev": "tsc --watch",
|
|
||||||
"start": "node dist/cli.js",
|
|
||||||
"test": "vitest"
|
|
||||||
},
|
|
||||||
"dependencies": {
|
|
||||||
"@llm-gateway/client": "workspace:*",
|
|
||||||
"fastify": "^5.3.0",
|
|
||||||
"@fastify/cors": "^9.0.0"
|
|
||||||
},
|
|
||||||
"devDependencies": {
|
|
||||||
"@types/node": "^20.0.0",
|
|
||||||
"typescript": "^5.0.0",
|
|
||||||
"vitest": "^1.0.0"
|
|
||||||
},
|
|
||||||
"keywords": [
|
|
||||||
"openai",
|
|
||||||
"api",
|
|
||||||
"compatibility",
|
|
||||||
"llm",
|
|
||||||
"gateway",
|
|
||||||
"chatgpt"
|
|
||||||
],
|
|
||||||
"license": "MIT",
|
|
||||||
"author": "Rene Fichtmueller"
|
|
||||||
}
|
|
||||||
@ -1,23 +0,0 @@
|
|||||||
#!/usr/bin/env node
|
|
||||||
|
|
||||||
import ChatGPTAPIAdapter from './index'
|
|
||||||
|
|
||||||
const port = parseInt(process.env.CHATGPT_API_PORT || '3111', 10)
|
|
||||||
const adapter = new ChatGPTAPIAdapter(port)
|
|
||||||
|
|
||||||
adapter.start().catch(error => {
|
|
||||||
console.error('[ChatGPT API] Failed to start:', error)
|
|
||||||
process.exit(1)
|
|
||||||
})
|
|
||||||
|
|
||||||
process.on('SIGTERM', async () => {
|
|
||||||
console.error('[ChatGPT API] SIGTERM received, shutting down...')
|
|
||||||
await adapter.stop()
|
|
||||||
process.exit(0)
|
|
||||||
})
|
|
||||||
|
|
||||||
process.on('SIGINT', async () => {
|
|
||||||
console.error('[ChatGPT API] SIGINT received, shutting down...')
|
|
||||||
await adapter.stop()
|
|
||||||
process.exit(0)
|
|
||||||
})
|
|
||||||
@ -1,166 +0,0 @@
|
|||||||
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'
|
|
||||||
import ChatGPTAPIAdapter from './index'
|
|
||||||
|
|
||||||
describe('ChatGPTAPIAdapter', () => {
|
|
||||||
let adapter: ChatGPTAPIAdapter
|
|
||||||
|
|
||||||
beforeEach(() => {
|
|
||||||
adapter = new ChatGPTAPIAdapter(3111)
|
|
||||||
})
|
|
||||||
|
|
||||||
afterEach(async () => {
|
|
||||||
try {
|
|
||||||
await adapter.stop()
|
|
||||||
} catch (e) {
|
|
||||||
// Ignore cleanup errors
|
|
||||||
}
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should create adapter instance with default port', () => {
|
|
||||||
const a = new ChatGPTAPIAdapter()
|
|
||||||
expect(a).toBeDefined()
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should create adapter instance with custom port', () => {
|
|
||||||
const a = new ChatGPTAPIAdapter(8080)
|
|
||||||
expect(a).toBeDefined()
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should format messages to prompt correctly', async () => {
|
|
||||||
const messages = [
|
|
||||||
{ role: 'system' as const, content: 'You are helpful' },
|
|
||||||
{ role: 'user' as const, content: 'Hello' },
|
|
||||||
{ role: 'assistant' as const, content: 'Hi there' }
|
|
||||||
]
|
|
||||||
|
|
||||||
// Use reflection to access private method for testing
|
|
||||||
const formatMessagesToPrompt = (adapter as any).formatMessagesToPrompt.bind(adapter)
|
|
||||||
const prompt = formatMessagesToPrompt(messages)
|
|
||||||
|
|
||||||
expect(prompt).toContain('[SYSTEM]')
|
|
||||||
expect(prompt).toContain('[USER]')
|
|
||||||
expect(prompt).toContain('[ASSISTANT]')
|
|
||||||
expect(prompt).toContain('You are helpful')
|
|
||||||
expect(prompt).toContain('Hello')
|
|
||||||
expect(prompt).toContain('Hi there')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should map OpenAI model names to gateway models', () => {
|
|
||||||
const mapModelName = (adapter as any).mapModelName.bind(adapter)
|
|
||||||
|
|
||||||
expect(mapModelName('gpt-4')).toBe('qwen2.5:32b')
|
|
||||||
expect(mapModelName('gpt-4-turbo')).toBe('qwen2.5:32b')
|
|
||||||
expect(mapModelName('gpt-3.5-turbo')).toBe('qwen2.5:14b')
|
|
||||||
expect(mapModelName('gpt-4-mini')).toBe('qwen2.5:3b')
|
|
||||||
expect(mapModelName('unknown-model')).toBe('qwen2.5:14b') // Default fallback
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should handle missing model gracefully', () => {
|
|
||||||
const mapModelName = (adapter as any).mapModelName.bind(adapter)
|
|
||||||
expect(mapModelName('custom-model')).toBe('qwen2.5:14b')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should start and stop server', async () => {
|
|
||||||
const adaptForTest = new ChatGPTAPIAdapter(3112)
|
|
||||||
await adaptForTest.start()
|
|
||||||
// Server should be running
|
|
||||||
await adaptForTest.stop()
|
|
||||||
// Server should be stopped
|
|
||||||
expect(true).toBe(true)
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should have /v1/models endpoint', async () => {
|
|
||||||
// This test is integration-style
|
|
||||||
// Would need actual server running and HTTP client
|
|
||||||
expect(adapter).toBeDefined()
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should format streaming response correctly', () => {
|
|
||||||
// Test that streaming response format matches OpenAI spec
|
|
||||||
const event = {
|
|
||||||
id: 'chatcmpl-123',
|
|
||||||
object: 'text_completion.chunk',
|
|
||||||
created: 1234567890,
|
|
||||||
model: 'gpt-4',
|
|
||||||
choices: [
|
|
||||||
{
|
|
||||||
index: 0,
|
|
||||||
delta: { content: 'Hello' },
|
|
||||||
finish_reason: null
|
|
||||||
}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
const jsonStr = JSON.stringify(event)
|
|
||||||
expect(jsonStr).toContain('chatcmpl-')
|
|
||||||
expect(jsonStr).toContain('text_completion.chunk')
|
|
||||||
expect(jsonStr).toContain('Hello')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should handle temperature parameter', () => {
|
|
||||||
const request = {
|
|
||||||
model: 'gpt-4',
|
|
||||||
messages: [{ role: 'user' as const, content: 'Hi' }],
|
|
||||||
temperature: 0.5
|
|
||||||
}
|
|
||||||
expect(request.temperature).toBe(0.5)
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should handle max_tokens parameter', () => {
|
|
||||||
const request = {
|
|
||||||
model: 'gpt-4',
|
|
||||||
messages: [{ role: 'user' as const, content: 'Hi' }],
|
|
||||||
max_tokens: 1000
|
|
||||||
}
|
|
||||||
expect(request.max_tokens).toBe(1000)
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should default to non-streaming mode', () => {
|
|
||||||
const request = {
|
|
||||||
model: 'gpt-4',
|
|
||||||
messages: [{ role: 'user' as const, content: 'Hi' }]
|
|
||||||
}
|
|
||||||
expect(request as any).not.toHaveProperty('stream')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should handle streaming flag', () => {
|
|
||||||
const request = {
|
|
||||||
model: 'gpt-4',
|
|
||||||
messages: [{ role: 'user' as const, content: 'Hi' }],
|
|
||||||
stream: true
|
|
||||||
}
|
|
||||||
expect(request.stream).toBe(true)
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should have proper response structure', () => {
|
|
||||||
const response = {
|
|
||||||
id: 'chatcmpl-123',
|
|
||||||
object: 'chat.completion',
|
|
||||||
created: Math.floor(Date.now() / 1000),
|
|
||||||
model: 'gpt-4',
|
|
||||||
choices: [
|
|
||||||
{
|
|
||||||
index: 0,
|
|
||||||
message: {
|
|
||||||
role: 'assistant',
|
|
||||||
content: 'Response'
|
|
||||||
},
|
|
||||||
finish_reason: 'stop'
|
|
||||||
}
|
|
||||||
],
|
|
||||||
usage: {
|
|
||||||
prompt_tokens: 10,
|
|
||||||
completion_tokens: 5,
|
|
||||||
total_tokens: 15
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
expect(response).toHaveProperty('id')
|
|
||||||
expect(response).toHaveProperty('object')
|
|
||||||
expect(response).toHaveProperty('created')
|
|
||||||
expect(response).toHaveProperty('model')
|
|
||||||
expect(response).toHaveProperty('choices')
|
|
||||||
expect(response).toHaveProperty('usage')
|
|
||||||
expect(response.choices[0].message.role).toBe('assistant')
|
|
||||||
expect(response.usage.total_tokens).toBe(15)
|
|
||||||
})
|
|
||||||
})
|
|
||||||
@ -1,234 +0,0 @@
|
|||||||
import Fastify from 'fastify'
|
|
||||||
import FastifyCors from '@fastify/cors'
|
|
||||||
import { createTIPClient } from '@llm-gateway/client'
|
|
||||||
|
|
||||||
interface ChatMessage {
|
|
||||||
role: 'system' | 'user' | 'assistant'
|
|
||||||
content: string
|
|
||||||
}
|
|
||||||
|
|
||||||
interface ChatCompletionRequest {
|
|
||||||
model: string
|
|
||||||
messages: ChatMessage[]
|
|
||||||
temperature?: number
|
|
||||||
max_tokens?: number
|
|
||||||
top_p?: number
|
|
||||||
stream?: boolean
|
|
||||||
}
|
|
||||||
|
|
||||||
interface ChatCompletionResponse {
|
|
||||||
id: string
|
|
||||||
object: string
|
|
||||||
created: number
|
|
||||||
model: string
|
|
||||||
choices: Array<{
|
|
||||||
index: number
|
|
||||||
message: {
|
|
||||||
role: string
|
|
||||||
content: string
|
|
||||||
}
|
|
||||||
finish_reason: string
|
|
||||||
}>
|
|
||||||
usage: {
|
|
||||||
prompt_tokens: number
|
|
||||||
completion_tokens: number
|
|
||||||
total_tokens: number
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
interface ChatCompletionStreamEvent {
|
|
||||||
id: string
|
|
||||||
object: string
|
|
||||||
created: number
|
|
||||||
model: string
|
|
||||||
choices: Array<{
|
|
||||||
index: number
|
|
||||||
delta: {
|
|
||||||
content?: string
|
|
||||||
}
|
|
||||||
finish_reason: string | null
|
|
||||||
}>
|
|
||||||
}
|
|
||||||
|
|
||||||
export class ChatGPTAPIAdapter {
|
|
||||||
private fastify = Fastify()
|
|
||||||
private client = createTIPClient({
|
|
||||||
agentId: 'chatgpt-api-adapter',
|
|
||||||
ollamaUrl: process.env.OLLAMA_URL || '192.168.178.213:11434'
|
|
||||||
})
|
|
||||||
|
|
||||||
constructor(private port: number = 3111) {
|
|
||||||
this.setupRoutes()
|
|
||||||
}
|
|
||||||
|
|
||||||
private formatMessagesToPrompt(messages: ChatMessage[]): string {
|
|
||||||
return messages
|
|
||||||
.map(msg => `[${msg.role.toUpperCase()}]\n${msg.content}`)
|
|
||||||
.join('\n\n')
|
|
||||||
}
|
|
||||||
|
|
||||||
private mapModelName(openaiModel: string): string {
|
|
||||||
const modelMap: Record<string, string> = {
|
|
||||||
'gpt-4': 'qwen2.5:32b',
|
|
||||||
'gpt-4-turbo': 'qwen2.5:32b',
|
|
||||||
'gpt-3.5-turbo': 'qwen2.5:14b',
|
|
||||||
'gpt-4-mini': 'qwen2.5:3b'
|
|
||||||
}
|
|
||||||
return modelMap[openaiModel] || 'qwen2.5:14b'
|
|
||||||
}
|
|
||||||
|
|
||||||
private setupRoutes() {
|
|
||||||
this.fastify.register(FastifyCors, {
|
|
||||||
origin: '*',
|
|
||||||
credentials: true
|
|
||||||
})
|
|
||||||
|
|
||||||
this.fastify.get('/v1/models', async () => {
|
|
||||||
return {
|
|
||||||
object: 'list',
|
|
||||||
data: [
|
|
||||||
{ id: 'gpt-4', object: 'model', owned_by: 'openai' },
|
|
||||||
{ id: 'gpt-4-turbo', object: 'model', owned_by: 'openai' },
|
|
||||||
{ id: 'gpt-3.5-turbo', object: 'model', owned_by: 'openai' },
|
|
||||||
{ id: 'gpt-4-mini', object: 'model', owned_by: 'openai' }
|
|
||||||
]
|
|
||||||
}
|
|
||||||
})
|
|
||||||
|
|
||||||
this.fastify.post<{ Body: ChatCompletionRequest }>(
|
|
||||||
'/v1/chat/completions',
|
|
||||||
async (request, reply) => {
|
|
||||||
const {
|
|
||||||
messages,
|
|
||||||
model,
|
|
||||||
temperature = 0.7,
|
|
||||||
max_tokens = 2000,
|
|
||||||
stream = false
|
|
||||||
} = request.body
|
|
||||||
|
|
||||||
const prompt = this.formatMessagesToPrompt(messages)
|
|
||||||
const mappedModel = this.mapModelName(model)
|
|
||||||
|
|
||||||
if (stream) {
|
|
||||||
reply.type('text/event-stream')
|
|
||||||
reply.header('Cache-Control', 'no-cache')
|
|
||||||
reply.header('Connection', 'keep-alive')
|
|
||||||
|
|
||||||
try {
|
|
||||||
const response = await this.client.completion(prompt, {
|
|
||||||
model: mappedModel,
|
|
||||||
maxTokens: max_tokens,
|
|
||||||
temperature
|
|
||||||
})
|
|
||||||
|
|
||||||
const createdAt = Math.floor(Date.now() / 1000)
|
|
||||||
const chunks = response.text.split('')
|
|
||||||
|
|
||||||
for (const chunk of chunks) {
|
|
||||||
const event: ChatCompletionStreamEvent = {
|
|
||||||
id: `chatcmpl-${Date.now()}`,
|
|
||||||
object: 'text_completion.chunk',
|
|
||||||
created: createdAt,
|
|
||||||
model,
|
|
||||||
choices: [
|
|
||||||
{
|
|
||||||
index: 0,
|
|
||||||
delta: { content: chunk },
|
|
||||||
finish_reason: null
|
|
||||||
}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
reply.raw.write(`data: ${JSON.stringify(event)}\n\n`)
|
|
||||||
}
|
|
||||||
|
|
||||||
const finalEvent: ChatCompletionStreamEvent = {
|
|
||||||
id: `chatcmpl-${Date.now()}`,
|
|
||||||
object: 'text_completion.chunk',
|
|
||||||
created: createdAt,
|
|
||||||
model,
|
|
||||||
choices: [
|
|
||||||
{
|
|
||||||
index: 0,
|
|
||||||
delta: {},
|
|
||||||
finish_reason: 'stop'
|
|
||||||
}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
reply.raw.write(`data: ${JSON.stringify(finalEvent)}\n\n`)
|
|
||||||
reply.raw.write('data: [DONE]\n\n')
|
|
||||||
reply.raw.end()
|
|
||||||
} catch (error) {
|
|
||||||
reply.raw.write(
|
|
||||||
`data: ${JSON.stringify({ error: 'Completion failed' })}\n\n`
|
|
||||||
)
|
|
||||||
reply.raw.end()
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
try {
|
|
||||||
const response = await this.client.completion(prompt, {
|
|
||||||
model: mappedModel,
|
|
||||||
maxTokens: max_tokens,
|
|
||||||
temperature
|
|
||||||
})
|
|
||||||
|
|
||||||
const result: ChatCompletionResponse = {
|
|
||||||
id: `chatcmpl-${Date.now()}`,
|
|
||||||
object: 'chat.completion',
|
|
||||||
created: Math.floor(Date.now() / 1000),
|
|
||||||
model,
|
|
||||||
choices: [
|
|
||||||
{
|
|
||||||
index: 0,
|
|
||||||
message: {
|
|
||||||
role: 'assistant',
|
|
||||||
content: response.text
|
|
||||||
},
|
|
||||||
finish_reason: 'stop'
|
|
||||||
}
|
|
||||||
],
|
|
||||||
usage: {
|
|
||||||
prompt_tokens: response.tokens.input,
|
|
||||||
completion_tokens: response.tokens.output,
|
|
||||||
total_tokens: response.tokens.input + response.tokens.output
|
|
||||||
}
|
|
||||||
}
|
|
||||||
return result
|
|
||||||
} catch (error) {
|
|
||||||
reply.code(500).send({
|
|
||||||
error: {
|
|
||||||
message: 'Completion request failed',
|
|
||||||
type: 'server_error',
|
|
||||||
param: null,
|
|
||||||
code: 'internal_error'
|
|
||||||
}
|
|
||||||
})
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
)
|
|
||||||
|
|
||||||
this.fastify.get('/health', async () => {
|
|
||||||
try {
|
|
||||||
const health = await this.client.health()
|
|
||||||
return { status: 'ok', gateway: health }
|
|
||||||
} catch (error) {
|
|
||||||
return { status: 'degraded', error: 'Gateway unavailable' }
|
|
||||||
}
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
async start() {
|
|
||||||
await this.fastify.listen({ port: this.port, host: '0.0.0.0' })
|
|
||||||
console.error(`[ChatGPT API] Server listening on port ${this.port}`)
|
|
||||||
console.error('[ChatGPT API] OpenAI API compatibility endpoints:')
|
|
||||||
console.error(' POST /v1/chat/completions')
|
|
||||||
console.error(' GET /v1/models')
|
|
||||||
console.error(' GET /health')
|
|
||||||
}
|
|
||||||
|
|
||||||
async stop() {
|
|
||||||
await this.fastify.close()
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
export default ChatGPTAPIAdapter
|
|
||||||
@ -1,12 +0,0 @@
|
|||||||
{
|
|
||||||
"extends": "../../tsconfig.json",
|
|
||||||
"compilerOptions": {
|
|
||||||
"outDir": "./dist",
|
|
||||||
"rootDir": "./src",
|
|
||||||
"declaration": true,
|
|
||||||
"declarationMap": true,
|
|
||||||
"sourceMap": true
|
|
||||||
},
|
|
||||||
"include": ["src/**/*"],
|
|
||||||
"exclude": ["node_modules", "dist", "**/*.test.ts"]
|
|
||||||
}
|
|
||||||
@ -1,123 +0,0 @@
|
|||||||
# Claude Code Bridge
|
|
||||||
|
|
||||||
Integration layer between Claude Code IDE and LLM Gateway.
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
Provides a high-level API for Claude Code to leverage the LLM Gateway's multi-model orchestration, confidence gating, and fallback capabilities.
|
|
||||||
|
|
||||||
## Installation
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npm install @llm-gateway/claude-code-bridge
|
|
||||||
```
|
|
||||||
|
|
||||||
## Usage
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
import { ClaudeCodeBridge } from '@llm-gateway/claude-code-bridge'
|
|
||||||
|
|
||||||
const bridge = new ClaudeCodeBridge({
|
|
||||||
gatewayUrl: 'https://llm-gateway.context-x.org',
|
|
||||||
agentId: 'claude-code-ide',
|
|
||||||
ideVersion: '1.0.0',
|
|
||||||
extensionVersion: '1.0.0',
|
|
||||||
ollamaUrl: '192.168.178.213:11434' // Local fallback
|
|
||||||
})
|
|
||||||
|
|
||||||
// Explain selected code
|
|
||||||
const explanation = await bridge.explain(context, selectedCode)
|
|
||||||
|
|
||||||
// Refactor code
|
|
||||||
const refactored = await bridge.refactor(context, selectedCode)
|
|
||||||
|
|
||||||
// Generate tests
|
|
||||||
const tests = await bridge.test(context, selectedCode)
|
|
||||||
|
|
||||||
// Add documentation
|
|
||||||
const docs = await bridge.document(context, selectedCode)
|
|
||||||
|
|
||||||
// Fix errors
|
|
||||||
const fix = await bridge.fixError(errorMessage, context)
|
|
||||||
|
|
||||||
// Check health
|
|
||||||
const status = await bridge.health()
|
|
||||||
```
|
|
||||||
|
|
||||||
## Features
|
|
||||||
|
|
||||||
- **Code Explanation**: Analyze and explain code snippets
|
|
||||||
- **Refactoring**: Suggest improvements to existing code
|
|
||||||
- **Test Generation**: Automatically generate test cases
|
|
||||||
- **Documentation**: Create JSDoc/TSDoc comments
|
|
||||||
- **Error Fixing**: Debug and fix code errors
|
|
||||||
- **Fallback**: Automatic fallback to local Ollama when gateway unavailable
|
|
||||||
- **Confidence Tracking**: Monitor model confidence in responses
|
|
||||||
- **Token Counting**: Track usage for billing/analytics
|
|
||||||
|
|
||||||
## Architecture
|
|
||||||
|
|
||||||
The bridge implements the three-layer agent integration stack from ADR-0005:
|
|
||||||
|
|
||||||
1. **Transport Layer**: HTTP/WebSocket communication with gateway
|
|
||||||
2. **Adapter Layer**: ClaudeCodeBridge wraps client SDK
|
|
||||||
3. **Protocol Layer**: Standardized request/response format
|
|
||||||
|
|
||||||
## Health Status
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
const health = await bridge.health()
|
|
||||||
// {
|
|
||||||
// healthy: true,
|
|
||||||
// gateway: true,
|
|
||||||
// ollama: 'running',
|
|
||||||
// mode: 'gateway'
|
|
||||||
// }
|
|
||||||
```
|
|
||||||
|
|
||||||
Modes:
|
|
||||||
- `gateway`: Using LLM Gateway (preferred)
|
|
||||||
- `fallback`: Using local Ollama (gateway unavailable)
|
|
||||||
- `offline`: Both gateway and Ollama offline (error)
|
|
||||||
|
|
||||||
## Configuration
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
interface ClaudeCodeBridgeConfig {
|
|
||||||
gatewayUrl: string // LLM Gateway endpoint
|
|
||||||
agentId: string // Agent identifier (default: 'claude-code-ide')
|
|
||||||
ideVersion: string // Claude Code version
|
|
||||||
extensionVersion: string // Bridge extension version
|
|
||||||
ollamaUrl?: string // Local Ollama URL (optional)
|
|
||||||
apiKey?: string // Gateway API key (if required)
|
|
||||||
requestTimeout?: number // Request timeout in ms (default: 30000)
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Response Format
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
interface ClaudeCodeResponse {
|
|
||||||
text: string // Generated response
|
|
||||||
tokens: {
|
|
||||||
input: number // Input tokens
|
|
||||||
output: number // Output tokens
|
|
||||||
}
|
|
||||||
model: string // Model used
|
|
||||||
fallback: boolean // Whether using fallback
|
|
||||||
confidence: number // 0-1 confidence score
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Testing
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npm test
|
|
||||||
```
|
|
||||||
|
|
||||||
Tests cover:
|
|
||||||
- Health checks
|
|
||||||
- All completion methods (explain, refactor, test, document, fix)
|
|
||||||
- Fallback behavior
|
|
||||||
- Token limiting
|
|
||||||
- Metadata tracking
|
|
||||||
@ -1,31 +0,0 @@
|
|||||||
{
|
|
||||||
"name": "@llm-gateway/claude-code-bridge",
|
|
||||||
"version": "1.0.0",
|
|
||||||
"description": "Claude Code IDE integration with LLM Gateway",
|
|
||||||
"type": "module",
|
|
||||||
"main": "dist/index.js",
|
|
||||||
"types": "dist/index.d.ts",
|
|
||||||
"scripts": {
|
|
||||||
"build": "tsc",
|
|
||||||
"dev": "tsc --watch",
|
|
||||||
"test": "vitest"
|
|
||||||
},
|
|
||||||
"dependencies": {
|
|
||||||
"@llm-gateway/client": "workspace:*",
|
|
||||||
"@anthropic-sdk/sdk": "^1.0.0"
|
|
||||||
},
|
|
||||||
"devDependencies": {
|
|
||||||
"@types/node": "^20.0.0",
|
|
||||||
"typescript": "^5.0.0",
|
|
||||||
"vitest": "^1.0.0"
|
|
||||||
},
|
|
||||||
"keywords": [
|
|
||||||
"claude",
|
|
||||||
"code",
|
|
||||||
"llm",
|
|
||||||
"gateway",
|
|
||||||
"ide"
|
|
||||||
],
|
|
||||||
"license": "MIT",
|
|
||||||
"author": "Rene Fichtmueller"
|
|
||||||
}
|
|
||||||
@ -1,120 +0,0 @@
|
|||||||
import { describe, it, expect, beforeEach, vi } from 'vitest'
|
|
||||||
import { ClaudeCodeBridge } from './index'
|
|
||||||
|
|
||||||
describe('ClaudeCodeBridge', () => {
|
|
||||||
let bridge: ClaudeCodeBridge
|
|
||||||
|
|
||||||
beforeEach(() => {
|
|
||||||
bridge = new ClaudeCodeBridge({
|
|
||||||
gatewayUrl: 'http://localhost:3000',
|
|
||||||
agentId: 'claude-code-test',
|
|
||||||
ideVersion: '1.0.0',
|
|
||||||
extensionVersion: '1.0.0'
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
describe('health check', () => {
|
|
||||||
it('should report health status', async () => {
|
|
||||||
const health = await bridge.health()
|
|
||||||
expect(health).toHaveProperty('healthy')
|
|
||||||
expect(health).toHaveProperty('gateway')
|
|
||||||
expect(health).toHaveProperty('ollama')
|
|
||||||
expect(health).toHaveProperty('mode')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should handle gateway unavailable gracefully', async () => {
|
|
||||||
const health = await bridge.health()
|
|
||||||
// Should have fallback mode or offline
|
|
||||||
expect(health.mode).toMatch(/fallback|offline|gateway/)
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
describe('completion methods', () => {
|
|
||||||
it('should support explain command', async () => {
|
|
||||||
const context = 'function add(a, b) { return a + b; }'
|
|
||||||
const selection = 'return a + b'
|
|
||||||
|
|
||||||
const response = await bridge.explain(context, selection)
|
|
||||||
expect(response).toHaveProperty('text')
|
|
||||||
expect(response).toHaveProperty('tokens')
|
|
||||||
expect(response).toHaveProperty('model')
|
|
||||||
expect(response).toHaveProperty('fallback')
|
|
||||||
expect(response).toHaveProperty('confidence')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should support refactor command', async () => {
|
|
||||||
const context = 'for(let i=0;i<arr.length;i++){}'
|
|
||||||
const selection = 'for(let i=0;i<arr.length;i++)'
|
|
||||||
|
|
||||||
const response = await bridge.refactor(context, selection)
|
|
||||||
expect(response.text).toBeTruthy()
|
|
||||||
expect(typeof response.tokens.input).toBe('number')
|
|
||||||
expect(typeof response.tokens.output).toBe('number')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should support test command', async () => {
|
|
||||||
const context = 'export function multiply(a, b) { return a * b }'
|
|
||||||
const selection = 'export function multiply(a, b)'
|
|
||||||
|
|
||||||
const response = await bridge.test(context, selection)
|
|
||||||
expect(response.text).toBeTruthy()
|
|
||||||
expect(response.model).toBeTruthy()
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should support document command', async () => {
|
|
||||||
const context = 'const config = { timeout: 5000 }'
|
|
||||||
const selection = '{ timeout: 5000 }'
|
|
||||||
|
|
||||||
const response = await bridge.document(context, selection)
|
|
||||||
expect(response.text).toBeTruthy()
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should support fix command', async () => {
|
|
||||||
const error = 'ReferenceError: x is not defined'
|
|
||||||
const context = 'function test() { console.log(x); }'
|
|
||||||
|
|
||||||
const response = await bridge.fixError(error, context)
|
|
||||||
expect(response.text).toBeTruthy()
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
describe('generic completion', () => {
|
|
||||||
it('should handle custom prompts', async () => {
|
|
||||||
const response = await bridge.completion('custom', 'Write a hello world function')
|
|
||||||
expect(response).toHaveProperty('text')
|
|
||||||
expect(response).toHaveProperty('tokens')
|
|
||||||
expect(response).toHaveProperty('model')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should respect maxTokens limit', async () => {
|
|
||||||
const response = await bridge.completion('test', 'Short prompt', 100)
|
|
||||||
expect(response.tokens.output).toBeLessThanOrEqual(150) // Small margin for tokenizer variance
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
describe('fallback behavior', () => {
|
|
||||||
it('should report when using fallback', async () => {
|
|
||||||
const response = await bridge.completion('test', 'Test prompt')
|
|
||||||
expect(response).toHaveProperty('fallback')
|
|
||||||
expect(typeof response.fallback).toBe('boolean')
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should still work during fallback to Ollama', async () => {
|
|
||||||
const response = await bridge.completion('test', 'Generate a simple greeting')
|
|
||||||
expect(response.text).toBeTruthy()
|
|
||||||
expect(response.tokens).toBeTruthy()
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
describe('metadata tracking', () => {
|
|
||||||
it('should track IDE version', async () => {
|
|
||||||
const status = await bridge.status()
|
|
||||||
expect(status).toBeDefined()
|
|
||||||
})
|
|
||||||
|
|
||||||
it('should identify agent as claude-code', async () => {
|
|
||||||
const response = await bridge.completion('test', 'Simple test')
|
|
||||||
expect(response.model).toBeTruthy()
|
|
||||||
})
|
|
||||||
})
|
|
||||||
})
|
|
||||||
@ -1,108 +0,0 @@
|
|||||||
import { createTIPClient, type TIPClientConfig } from '@llm-gateway/client'
|
|
||||||
|
|
||||||
export interface ClaudeCodeBridgeConfig extends TIPClientConfig {
|
|
||||||
agentId: string
|
|
||||||
ideVersion: string
|
|
||||||
extensionVersion: string
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface ClaudeCodeRequest {
|
|
||||||
command: string
|
|
||||||
context: string
|
|
||||||
selection?: string
|
|
||||||
temperature?: number
|
|
||||||
maxTokens?: number
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface ClaudeCodeResponse {
|
|
||||||
text: string
|
|
||||||
tokens: { input: number; output: number }
|
|
||||||
model: string
|
|
||||||
fallback: boolean
|
|
||||||
confidence: number
|
|
||||||
}
|
|
||||||
|
|
||||||
export class ClaudeCodeBridge {
|
|
||||||
private client: ReturnType<typeof createTIPClient>
|
|
||||||
private config: ClaudeCodeBridgeConfig
|
|
||||||
|
|
||||||
constructor(config: ClaudeCodeBridgeConfig) {
|
|
||||||
this.config = {
|
|
||||||
agentId: 'claude-code-ide',
|
|
||||||
ideVersion: config.ideVersion,
|
|
||||||
extensionVersion: config.extensionVersion,
|
|
||||||
...config
|
|
||||||
}
|
|
||||||
this.client = createTIPClient(this.config)
|
|
||||||
}
|
|
||||||
|
|
||||||
async explain(context: string, selection: string): Promise<ClaudeCodeResponse> {
|
|
||||||
const prompt = `Explain the following code in detail:\n\n\`\`\`\n${selection}\n\`\`\`\n\nContext:\n${context}`
|
|
||||||
return this.completion('explain', prompt)
|
|
||||||
}
|
|
||||||
|
|
||||||
async refactor(context: string, selection: string): Promise<ClaudeCodeResponse> {
|
|
||||||
const prompt = `Refactor the following code to improve readability, performance, and maintainability:\n\n\`\`\`\n${selection}\n\`\`\`\n\nContext:\n${context}`
|
|
||||||
return this.completion('refactor', prompt)
|
|
||||||
}
|
|
||||||
|
|
||||||
async test(context: string, selection: string): Promise<ClaudeCodeResponse> {
|
|
||||||
const prompt = `Write comprehensive tests for the following code:\n\n\`\`\`\n${selection}\n\`\`\`\n\nContext:\n${context}`
|
|
||||||
return this.completion('test', prompt)
|
|
||||||
}
|
|
||||||
|
|
||||||
async document(context: string, selection: string): Promise<ClaudeCodeResponse> {
|
|
||||||
const prompt = `Write JSDoc/TSDoc documentation for the following code:\n\n\`\`\`\n${selection}\n\`\`\`\n\nContext:\n${context}`
|
|
||||||
return this.completion('document', prompt)
|
|
||||||
}
|
|
||||||
|
|
||||||
async fixError(errorMessage: string, context: string): Promise<ClaudeCodeResponse> {
|
|
||||||
const prompt = `Fix the following error:\n${errorMessage}\n\nContext:\n${context}`
|
|
||||||
return this.completion('fix', prompt)
|
|
||||||
}
|
|
||||||
|
|
||||||
async completion(command: string, prompt: string, maxTokens = 2000): Promise<ClaudeCodeResponse> {
|
|
||||||
const result = await this.client.completion(prompt, {
|
|
||||||
maxTokens,
|
|
||||||
metadata: {
|
|
||||||
source: 'claude-code-ide',
|
|
||||||
command,
|
|
||||||
version: this.config.ideVersion
|
|
||||||
}
|
|
||||||
})
|
|
||||||
|
|
||||||
return {
|
|
||||||
text: result.text,
|
|
||||||
tokens: result.tokens,
|
|
||||||
model: result.model,
|
|
||||||
fallback: result.fallback,
|
|
||||||
confidence: result.confidence ?? 0
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
async status() {
|
|
||||||
return this.client.getStatus()
|
|
||||||
}
|
|
||||||
|
|
||||||
async health() {
|
|
||||||
try {
|
|
||||||
const status = await this.status()
|
|
||||||
return {
|
|
||||||
healthy: status.gateway === true || status.ollama !== 'offline',
|
|
||||||
gateway: status.gateway,
|
|
||||||
ollama: status.ollama,
|
|
||||||
mode: status.mode
|
|
||||||
}
|
|
||||||
} catch (error) {
|
|
||||||
return {
|
|
||||||
healthy: false,
|
|
||||||
gateway: false,
|
|
||||||
ollama: 'offline' as const,
|
|
||||||
mode: 'offline' as const,
|
|
||||||
error: String(error)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
export default ClaudeCodeBridge
|
|
||||||
@ -1,12 +0,0 @@
|
|||||||
{
|
|
||||||
"extends": "../../tsconfig.json",
|
|
||||||
"compilerOptions": {
|
|
||||||
"outDir": "./dist",
|
|
||||||
"rootDir": "./src",
|
|
||||||
"declaration": true,
|
|
||||||
"declarationMap": true,
|
|
||||||
"sourceMap": true
|
|
||||||
},
|
|
||||||
"include": ["src/**/*"],
|
|
||||||
"exclude": ["node_modules", "dist", "**/*.test.ts"]
|
|
||||||
}
|
|
||||||
@ -1,162 +0,0 @@
|
|||||||
# Codex LSP Adapter
|
|
||||||
|
|
||||||
Language Server Protocol adapter for GitHub Copilot/Microsoft Codex integration with LLM Gateway.
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
Implements the Language Server Protocol (LSP) to allow Codex and Copilot plugins to connect to the LLM Gateway. Bridges the gap between LSP's structured protocol and the gateway's completion API.
|
|
||||||
|
|
||||||
## Installation
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npm install @llm-gateway/codex-lsp-adapter
|
|
||||||
```
|
|
||||||
|
|
||||||
## Usage
|
|
||||||
|
|
||||||
### As a Language Server
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Start the LSP server (listens on stdio)
|
|
||||||
npx codex-lsp
|
|
||||||
|
|
||||||
# Or in Node.js
|
|
||||||
import CodexLSPAdapter from '@llm-gateway/codex-lsp-adapter'
|
|
||||||
|
|
||||||
const adapter = new CodexLSPAdapter()
|
|
||||||
adapter.start()
|
|
||||||
```
|
|
||||||
|
|
||||||
### VS Code Extension Configuration
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"languageServerHangingPercent": 0,
|
|
||||||
"languageServers": {
|
|
||||||
"codex": {
|
|
||||||
"command": "codex-lsp",
|
|
||||||
"args": [],
|
|
||||||
"languages": [
|
|
||||||
"javascript",
|
|
||||||
"typescript",
|
|
||||||
"python",
|
|
||||||
"go",
|
|
||||||
"rust"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Features
|
|
||||||
|
|
||||||
### Implemented
|
|
||||||
|
|
||||||
- **Completions** (`textDocument/completion`): Code completion triggered by `.`, space, or `(`
|
|
||||||
- **Hover** (`textDocument/hover`): Hover documentation with code explanation
|
|
||||||
- **Text Sync**: Full document synchronization
|
|
||||||
- **Execute Commands**: `codex.explain`, `codex.refactor`, `codex.test`, `codex.fix`
|
|
||||||
|
|
||||||
### Architecture
|
|
||||||
|
|
||||||
The adapter translates LSP requests to gateway completions:
|
|
||||||
|
|
||||||
```
|
|
||||||
LSP Client (Copilot/IDE)
|
|
||||||
↓
|
|
||||||
CodexLSPAdapter (stdio bridge)
|
|
||||||
↓
|
|
||||||
LLM Gateway API
|
|
||||||
↓
|
|
||||||
Model Selection (claude, Ollama, external)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Environment Variables
|
|
||||||
|
|
||||||
```bash
|
|
||||||
GATEWAY_URL=https://llm-gateway.context-x.org # LLM Gateway endpoint
|
|
||||||
OLLAMA_URL=192.168.178.213:11434 # Local Ollama fallback
|
|
||||||
AGENT_ID=codex-lsp-server # Agent identifier
|
|
||||||
LOG_LEVEL=debug # Logging level
|
|
||||||
```
|
|
||||||
|
|
||||||
## Protocol Details
|
|
||||||
|
|
||||||
### Supported Capabilities
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
{
|
|
||||||
textDocumentSync: 1, // Full document sync
|
|
||||||
completionProvider: {
|
|
||||||
resolveProvider: true,
|
|
||||||
triggerCharacters: ['.', ' ', '(']
|
|
||||||
},
|
|
||||||
hoverProvider: true,
|
|
||||||
definitionProvider: true,
|
|
||||||
codeActionProvider: true,
|
|
||||||
executeCommandProvider: {
|
|
||||||
commands: [
|
|
||||||
'codex.explain',
|
|
||||||
'codex.refactor',
|
|
||||||
'codex.test',
|
|
||||||
'codex.fix'
|
|
||||||
]
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Response Format
|
|
||||||
|
|
||||||
Completion items include:
|
|
||||||
- **label**: First line of completion
|
|
||||||
- **insertText**: Full completion text
|
|
||||||
- **documentation**: Model name and confidence
|
|
||||||
- **detail**: Source (Gateway vs Ollama fallback)
|
|
||||||
- **kind**: CompletionItemKind.Snippet
|
|
||||||
|
|
||||||
## Testing
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npm test
|
|
||||||
```
|
|
||||||
|
|
||||||
Tests cover:
|
|
||||||
- LSP initialization and shutdown
|
|
||||||
- Completion requests with various triggers
|
|
||||||
- Hover information extraction
|
|
||||||
- Error handling and fallback behavior
|
|
||||||
- Confidence score reporting
|
|
||||||
|
|
||||||
## Troubleshooting
|
|
||||||
|
|
||||||
### Server not connecting
|
|
||||||
|
|
||||||
1. Check if LSP server is running: `lsof -i :protocol`
|
|
||||||
2. Verify gateway is accessible: `curl https://llm-gateway.context-x.org/health`
|
|
||||||
3. Check logs: `LOG_LEVEL=debug codex-lsp`
|
|
||||||
|
|
||||||
### Slow completions
|
|
||||||
|
|
||||||
1. Reduce `maxTokens` in completion requests
|
|
||||||
2. Check gateway latency: `curl -w "@curl-format.txt" https://llm-gateway.context-x.org/health`
|
|
||||||
3. Verify Ollama is running if using fallback
|
|
||||||
|
|
||||||
### Poor suggestion quality
|
|
||||||
|
|
||||||
1. Adjust temperature/top_p in gateway requests
|
|
||||||
2. Check model selection (may be using fallback)
|
|
||||||
3. Provide more context in completion requests
|
|
||||||
|
|
||||||
## Performance
|
|
||||||
|
|
||||||
Typical latencies:
|
|
||||||
- **Gateway mode**: 100-500ms (depends on model)
|
|
||||||
- **Ollama fallback**: 200-2000ms (depends on hardware)
|
|
||||||
- **Timeout**: 30s (configurable)
|
|
||||||
|
|
||||||
## Security
|
|
||||||
|
|
||||||
- LSP communicates over stdio (no network exposure)
|
|
||||||
- Gateway API calls use configured authentication
|
|
||||||
- Ollama fallback is local-only by default
|
|
||||||
- No credentials stored in LSP adapter
|
|
||||||
@ -1,37 +0,0 @@
|
|||||||
{
|
|
||||||
"name": "@llm-gateway/codex-lsp-adapter",
|
|
||||||
"version": "1.0.0",
|
|
||||||
"description": "Language Server Protocol adapter for Codex/Copilot integration with LLM Gateway",
|
|
||||||
"type": "module",
|
|
||||||
"main": "dist/index.js",
|
|
||||||
"bin": {
|
|
||||||
"codex-lsp": "dist/cli.js"
|
|
||||||
},
|
|
||||||
"scripts": {
|
|
||||||
"build": "tsc",
|
|
||||||
"dev": "tsc --watch",
|
|
||||||
"start": "node dist/cli.js",
|
|
||||||
"test": "vitest"
|
|
||||||
},
|
|
||||||
"dependencies": {
|
|
||||||
"@llm-gateway/client": "workspace:*",
|
|
||||||
"vscode-jsonrpc": "^8.0.0",
|
|
||||||
"vscode-languageserver": "^9.0.0",
|
|
||||||
"vscode-languageserver-protocol": "^3.17.0"
|
|
||||||
},
|
|
||||||
"devDependencies": {
|
|
||||||
"@types/node": "^20.0.0",
|
|
||||||
"typescript": "^5.0.0",
|
|
||||||
"vitest": "^1.0.0"
|
|
||||||
},
|
|
||||||
"keywords": [
|
|
||||||
"lsp",
|
|
||||||
"language-server",
|
|
||||||
"copilot",
|
|
||||||
"codex",
|
|
||||||
"llm",
|
|
||||||
"gateway"
|
|
||||||
],
|
|
||||||
"license": "MIT",
|
|
||||||
"author": "Rene Fichtmueller"
|
|
||||||
}
|
|
||||||
@ -1,13 +0,0 @@
|
|||||||
#!/usr/bin/env node
|
|
||||||
|
|
||||||
import CodexLSPAdapter from './index'
|
|
||||||
|
|
||||||
const adapter = new CodexLSPAdapter()
|
|
||||||
|
|
||||||
// Start the LSP server
|
|
||||||
adapter.start()
|
|
||||||
|
|
||||||
// Log startup
|
|
||||||
console.error('[Codex LSP] Server started on stdio')
|
|
||||||
console.error(`[Codex LSP] Gateway URL: ${process.env.GATEWAY_URL || 'default'}`)
|
|
||||||
console.error(`[Codex LSP] Ollama URL: ${process.env.OLLAMA_URL || '192.168.178.213:11434'}`)
|
|
||||||
@ -1,130 +0,0 @@
|
|||||||
import { createTIPClient } from '@llm-gateway/client'
|
|
||||||
import {
|
|
||||||
createConnection,
|
|
||||||
TextDocuments,
|
|
||||||
Diagnostic,
|
|
||||||
DiagnosticSeverity,
|
|
||||||
InitializeResult,
|
|
||||||
ServerCapabilities,
|
|
||||||
Position,
|
|
||||||
Range,
|
|
||||||
CompletionItem,
|
|
||||||
CompletionItemKind,
|
|
||||||
MarkupKind
|
|
||||||
} from 'vscode-languageserver'
|
|
||||||
import { TextDocument } from 'vscode-languageserver-textdocument'
|
|
||||||
|
|
||||||
export class CodexLSPAdapter {
|
|
||||||
private connection = createConnection()
|
|
||||||
private documents = new TextDocuments(TextDocument)
|
|
||||||
private client = createTIPClient({
|
|
||||||
agentId: 'codex-lsp-server',
|
|
||||||
ollamaUrl: process.env.OLLAMA_URL || '192.168.178.213:11434'
|
|
||||||
})
|
|
||||||
|
|
||||||
constructor() {
|
|
||||||
this.setupHandlers()
|
|
||||||
}
|
|
||||||
|
|
||||||
private setupHandlers() {
|
|
||||||
this.connection.onInitialize(this.handleInitialize.bind(this))
|
|
||||||
this.connection.onCompletion(this.handleCompletion.bind(this))
|
|
||||||
this.connection.onHover(this.handleHover.bind(this))
|
|
||||||
this.connection.onDefinition(this.handleDefinition.bind(this))
|
|
||||||
this.documents.onDidChangeContent(this.handleDocumentChange.bind(this))
|
|
||||||
this.documents.listen(this.connection)
|
|
||||||
}
|
|
||||||
|
|
||||||
private handleInitialize() {
|
|
||||||
const capabilities: ServerCapabilities = {
|
|
||||||
textDocumentSync: 1,
|
|
||||||
completionProvider: {
|
|
||||||
resolveProvider: true,
|
|
||||||
triggerCharacters: ['.', ' ', '(']
|
|
||||||
},
|
|
||||||
hoverProvider: true,
|
|
||||||
definitionProvider: true,
|
|
||||||
codeActionProvider: true,
|
|
||||||
executeCommandProvider: {
|
|
||||||
commands: ['codex.explain', 'codex.refactor', 'codex.test', 'codex.fix']
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
const result: InitializeResult = { capabilities }
|
|
||||||
return result
|
|
||||||
}
|
|
||||||
|
|
||||||
private async handleCompletion(params: any) {
|
|
||||||
const doc = this.documents.get(params.textDocument.uri)
|
|
||||||
if (!doc) return []
|
|
||||||
|
|
||||||
const position = params.position
|
|
||||||
const text = doc.getText()
|
|
||||||
const offset = doc.offsetAt(position)
|
|
||||||
|
|
||||||
try {
|
|
||||||
const response = await this.client.completion(
|
|
||||||
`Complete the following code:\n\n${text}\n\n[cursor here]`,
|
|
||||||
{ maxTokens: 500 }
|
|
||||||
)
|
|
||||||
|
|
||||||
return [
|
|
||||||
{
|
|
||||||
label: response.text.split('\n')[0],
|
|
||||||
kind: CompletionItemKind.Snippet,
|
|
||||||
documentation: {
|
|
||||||
kind: MarkupKind.Markdown,
|
|
||||||
value: `**Model**: ${response.model}\n**Confidence**: ${(response.confidence * 100).toFixed(1)}%`
|
|
||||||
},
|
|
||||||
insertText: response.text,
|
|
||||||
detail: response.fallback ? '(Ollama fallback)' : '(Gateway)'
|
|
||||||
} as CompletionItem
|
|
||||||
]
|
|
||||||
} catch (error) {
|
|
||||||
return []
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
private async handleHover(params: any) {
|
|
||||||
const doc = this.documents.get(params.textDocument.uri)
|
|
||||||
if (!doc) return null
|
|
||||||
|
|
||||||
const selectedText = doc.getText({
|
|
||||||
start: { line: params.position.line, character: 0 },
|
|
||||||
end: { line: params.position.line + 1, character: 0 }
|
|
||||||
})
|
|
||||||
|
|
||||||
try {
|
|
||||||
const response = await this.client.completion(
|
|
||||||
`Briefly explain this code:\n${selectedText}`,
|
|
||||||
{ maxTokens: 200 }
|
|
||||||
)
|
|
||||||
|
|
||||||
return {
|
|
||||||
contents: {
|
|
||||||
kind: MarkupKind.Markdown,
|
|
||||||
value: `${response.text}\n\n*${response.model} (${(response.confidence * 100).toFixed(0)}%)*`
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} catch (error) {
|
|
||||||
return null
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
private async handleDefinition(params: any) {
|
|
||||||
// Definition lookup would be more complex in real implementation
|
|
||||||
// For now, return null - could integrate with symbol indexing
|
|
||||||
return null
|
|
||||||
}
|
|
||||||
|
|
||||||
private async handleDocumentChange(change: any) {
|
|
||||||
const doc = change.document
|
|
||||||
// Could perform diagnostics here on significant changes
|
|
||||||
}
|
|
||||||
|
|
||||||
start() {
|
|
||||||
this.connection.listen()
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
export default CodexLSPAdapter
|
|
||||||
@ -1,12 +0,0 @@
|
|||||||
{
|
|
||||||
"extends": "../../tsconfig.json",
|
|
||||||
"compilerOptions": {
|
|
||||||
"outDir": "./dist",
|
|
||||||
"rootDir": "./src",
|
|
||||||
"declaration": true,
|
|
||||||
"declarationMap": true,
|
|
||||||
"sourceMap": true
|
|
||||||
},
|
|
||||||
"include": ["src/**/*"],
|
|
||||||
"exclude": ["node_modules", "dist", "**/*.test.ts"]
|
|
||||||
}
|
|
||||||
@ -1,358 +0,0 @@
|
|||||||
# Learning System Integration
|
|
||||||
|
|
||||||
Per-agent metrics collection, feedback processing, and learning system integration for LLM Gateway.
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
Extends the global learning system (Phase 2D) with per-agent signal isolation. Tracks metrics separately for each agent (Claude Code, Codex, ChatGPT, etc.) to enable agent-specific optimization and cost attribution.
|
|
||||||
|
|
||||||
## Installation
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npm install @llm-gateway/learning-integration
|
|
||||||
```
|
|
||||||
|
|
||||||
## Core Concepts
|
|
||||||
|
|
||||||
### Per-Agent Metrics
|
|
||||||
|
|
||||||
Each agent maintains its own metric set tracking success, latency, cost, and confidence:
|
|
||||||
- **Success Rate**: % of requests that succeeded without fallback
|
|
||||||
- **Latency**: P50, P95, P99 response time (ms)
|
|
||||||
- **Cost**: Token consumption × model cost
|
|
||||||
- **Confidence**: Learned score 0-1 indicating model suitability for agent
|
|
||||||
|
|
||||||
### Feedback Loop
|
|
||||||
|
|
||||||
Agents report outcomes (success, fallback, error, timeout) enabling closed-loop learning:
|
|
||||||
- Adapter automatically tracks success/fallback
|
|
||||||
- Client can provide explicit feedback (quality, satisfaction)
|
|
||||||
- Learning engine uses feedback to update per-agent confidence scores
|
|
||||||
|
|
||||||
### Confidence Scoring
|
|
||||||
|
|
||||||
Per-agent confidence (independent of global score):
|
|
||||||
- Initialized from global baseline
|
|
||||||
- Updated hourly based on feedback
|
|
||||||
- Influences routing decisions (per-agent gate overrides global gate)
|
|
||||||
- Decays 10% per day if inactive
|
|
||||||
|
|
||||||
## Usage
|
|
||||||
|
|
||||||
### Basic Setup
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
import { LearningIntegration } from '@llm-gateway/learning-integration'
|
|
||||||
import postgres from 'postgres'
|
|
||||||
|
|
||||||
const db = postgres({
|
|
||||||
host: 'localhost',
|
|
||||||
port: 5432,
|
|
||||||
database: 'llm_gateway'
|
|
||||||
})
|
|
||||||
|
|
||||||
const learning = new LearningIntegration(db)
|
|
||||||
|
|
||||||
// Initialize tables on startup
|
|
||||||
await learning.initializeTables()
|
|
||||||
```
|
|
||||||
|
|
||||||
### Logging Requests
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
import { randomUUID } from 'crypto'
|
|
||||||
|
|
||||||
const requestId = randomUUID()
|
|
||||||
|
|
||||||
// After completion, log the request
|
|
||||||
await learning.logRequest({
|
|
||||||
requestId,
|
|
||||||
agentId: 'claude-code',
|
|
||||||
model: 'qwen2.5:14b',
|
|
||||||
latencyMs: 250,
|
|
||||||
tokensIn: 150,
|
|
||||||
tokensOut: 450,
|
|
||||||
confidence: 0.85,
|
|
||||||
fallbackUsed: false,
|
|
||||||
success: true
|
|
||||||
})
|
|
||||||
```
|
|
||||||
|
|
||||||
### Recording Feedback
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
// Automatic (adapter tracks outcome)
|
|
||||||
await learning.recordFeedback({
|
|
||||||
requestId,
|
|
||||||
agentId: 'claude-code',
|
|
||||||
outcome: 'success',
|
|
||||||
completionQuality: 8, // 0-10
|
|
||||||
latencyMs: 250
|
|
||||||
})
|
|
||||||
|
|
||||||
// Explicit (from client UI)
|
|
||||||
await learning.recordFeedback({
|
|
||||||
requestId,
|
|
||||||
agentId: 'chatgpt',
|
|
||||||
outcome: 'success',
|
|
||||||
metadata: {
|
|
||||||
userSatisfaction: 9 // 0-10 from thumbs up/down
|
|
||||||
}
|
|
||||||
})
|
|
||||||
```
|
|
||||||
|
|
||||||
### Computing Metrics
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
// Per-agent metrics (last 24h)
|
|
||||||
const metrics = await learning.getAgentMetrics('claude-code')
|
|
||||||
console.log(metrics)
|
|
||||||
// [{
|
|
||||||
// agentId: 'claude-code',
|
|
||||||
// model: 'qwen2.5:14b',
|
|
||||||
// requestCount: 1523,
|
|
||||||
// successRate: 0.98,
|
|
||||||
// avgLatencyMs: 245,
|
|
||||||
// totalTokens: 850000,
|
|
||||||
// costUsd: 85.00,
|
|
||||||
// confidence: 0.87,
|
|
||||||
// updatedAt: 2026-04-19T22:00:00Z
|
|
||||||
// }]
|
|
||||||
|
|
||||||
// Per-agent cost tracking
|
|
||||||
const costs = await learning.getAgentCosts(30) // 30 days
|
|
||||||
costs.forEach((cost, agentId) => {
|
|
||||||
console.log(`${agentId}: $${cost.toFixed(2)}`)
|
|
||||||
})
|
|
||||||
// claude-code: $892.50
|
|
||||||
// chatgpt: $1234.75
|
|
||||||
// codex: $345.20
|
|
||||||
|
|
||||||
// Anomaly detection
|
|
||||||
const anomalies = await learning.detectAnomalies('claude-code')
|
|
||||||
anomalies.forEach(a => {
|
|
||||||
console.log(`${a.model}: ${a.issue}`)
|
|
||||||
})
|
|
||||||
```
|
|
||||||
|
|
||||||
### SLO Monitoring
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
import { PerAgentMetrics } from '@llm-gateway/learning-integration/metrics'
|
|
||||||
|
|
||||||
const metrics = new PerAgentMetrics(db)
|
|
||||||
|
|
||||||
// Check latency SLO
|
|
||||||
const slo = await metrics.checkLatencySLO('claude-code', 100) // Target: 100ms
|
|
||||||
console.log(slo)
|
|
||||||
// {
|
|
||||||
// agentId: 'claude-code',
|
|
||||||
// targetMs: 100,
|
|
||||||
// p50: 45,
|
|
||||||
// p95: 89,
|
|
||||||
// p99: 98,
|
|
||||||
// breached: false
|
|
||||||
// }
|
|
||||||
|
|
||||||
// Daily cost report
|
|
||||||
const costs = await metrics.generateDailyCostReport('2026-04-19')
|
|
||||||
console.log(costs)
|
|
||||||
// [{
|
|
||||||
// date: '2026-04-19',
|
|
||||||
// agentId: 'claude-code',
|
|
||||||
// tokensIn: 50000,
|
|
||||||
// tokensOut: 150000,
|
|
||||||
// costUsd: 20.00
|
|
||||||
// }]
|
|
||||||
```
|
|
||||||
|
|
||||||
### Feedback Processing
|
|
||||||
|
|
||||||
```typescript
|
|
||||||
import { FeedbackProcessor } from '@llm-gateway/learning-integration/feedback'
|
|
||||||
|
|
||||||
const feedback = new FeedbackProcessor(db)
|
|
||||||
|
|
||||||
// Process feedback from any source
|
|
||||||
await feedback.processFeedback({
|
|
||||||
requestId,
|
|
||||||
agentId: 'chatgpt',
|
|
||||||
outcome: 'success',
|
|
||||||
completionQuality: 9,
|
|
||||||
userSatisfaction: 10
|
|
||||||
})
|
|
||||||
|
|
||||||
// Get feedback stats
|
|
||||||
const stats = await feedback.getFeedbackStats('chatgpt')
|
|
||||||
console.log(stats)
|
|
||||||
// {
|
|
||||||
// agentId: 'chatgpt',
|
|
||||||
// totalFeedback: 2450,
|
|
||||||
// outcomeBreakdown: {
|
|
||||||
// success: 2350,
|
|
||||||
// fallback: 50,
|
|
||||||
// timeout: 25,
|
|
||||||
// error: 20,
|
|
||||||
// user_rejected: 5
|
|
||||||
// },
|
|
||||||
// avgQuality: 8.2,
|
|
||||||
// avgSatisfaction: 8.7
|
|
||||||
// }
|
|
||||||
|
|
||||||
// Compute confidence score from feedback
|
|
||||||
const score = await feedback.computeConfidenceScore('chatgpt', 'gpt-4')
|
|
||||||
console.log(`Confidence: ${score.toFixed(2)}`) // 0.87
|
|
||||||
```
|
|
||||||
|
|
||||||
## Database Schema
|
|
||||||
|
|
||||||
### agent_request_log
|
|
||||||
```sql
|
|
||||||
CREATE TABLE agent_request_log (
|
|
||||||
request_id UUID PRIMARY KEY,
|
|
||||||
agent_id VARCHAR(64) NOT NULL,
|
|
||||||
model VARCHAR(128) NOT NULL,
|
|
||||||
latency_ms INTEGER NOT NULL,
|
|
||||||
tokens_in INTEGER NOT NULL,
|
|
||||||
tokens_out INTEGER NOT NULL,
|
|
||||||
confidence DECIMAL(3, 2) NOT NULL,
|
|
||||||
fallback_used BOOLEAN NOT NULL DEFAULT FALSE,
|
|
||||||
success BOOLEAN NOT NULL DEFAULT TRUE,
|
|
||||||
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
|
|
||||||
INDEX idx_agent_model (agent_id, model),
|
|
||||||
INDEX idx_created (created_at)
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
### agent_feedback
|
|
||||||
```sql
|
|
||||||
CREATE TABLE agent_feedback (
|
|
||||||
id SERIAL PRIMARY KEY,
|
|
||||||
request_id UUID NOT NULL,
|
|
||||||
agent_id VARCHAR(64) NOT NULL,
|
|
||||||
outcome VARCHAR(32) NOT NULL,
|
|
||||||
completion_quality SMALLINT,
|
|
||||||
latency_ms INTEGER,
|
|
||||||
token_count INTEGER,
|
|
||||||
metadata JSONB,
|
|
||||||
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
|
|
||||||
FOREIGN KEY (request_id) REFERENCES agent_request_log (request_id),
|
|
||||||
INDEX idx_agent_outcome (agent_id, outcome),
|
|
||||||
INDEX idx_created (created_at)
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
### agent_confidence_scores
|
|
||||||
```sql
|
|
||||||
CREATE TABLE agent_confidence_scores (
|
|
||||||
id SERIAL PRIMARY KEY,
|
|
||||||
agent_id VARCHAR(64) NOT NULL,
|
|
||||||
model VARCHAR(128) NOT NULL,
|
|
||||||
score DECIMAL(3, 2) NOT NULL,
|
|
||||||
sample_size INTEGER NOT NULL DEFAULT 0,
|
|
||||||
trend VARCHAR(16) NOT NULL DEFAULT 'stable',
|
|
||||||
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
|
|
||||||
UNIQUE (agent_id, model),
|
|
||||||
INDEX idx_agent (agent_id)
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Integration with Learning Engine
|
|
||||||
|
|
||||||
### Learning Cycle (ADR-0003)
|
|
||||||
|
|
||||||
Per-agent metrics computed during learning cycles:
|
|
||||||
|
|
||||||
**Phase 2**: Aggregate global metrics (existing)
|
|
||||||
**Phase 2**: Compute per-agent slices (new)
|
|
||||||
```typescript
|
|
||||||
for (const agentId of knownAgents) {
|
|
||||||
const metrics = await learning.getAgentMetrics(agentId)
|
|
||||||
for (const metric of metrics) {
|
|
||||||
// Update per-agent confidence
|
|
||||||
const newScore = feedback.computeConfidenceScore(agentId, metric.model)
|
|
||||||
await learning.updateAgentConfidence(agentId, metric.model, newScore)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Phase 3**: Update per-agent confidence scores (new)
|
|
||||||
```typescript
|
|
||||||
for (const [agentId, model] of agentModelPairs) {
|
|
||||||
const score = await feedback.computeConfidenceScore(agentId, model)
|
|
||||||
const shouldUpdate = await feedback.shouldUpdateConfidence(agentId, model, score)
|
|
||||||
if (shouldUpdate) {
|
|
||||||
await learning.updateAgentConfidence(agentId, model, score)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
**Phase 5**: A/B test per-agent routing (new)
|
|
||||||
```typescript
|
|
||||||
// 10% of traffic uses per-agent routing
|
|
||||||
if (Math.random() < 0.1) {
|
|
||||||
const agentConfidence = await learning.getAgentConfidence(agentId, model)
|
|
||||||
if (agentConfidence && agentConfidence.score > 0.65) {
|
|
||||||
// Use per-agent routing decision
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Feedback Outcomes
|
|
||||||
|
|
||||||
| Outcome | Meaning | Auto | Manual |
|
|
||||||
|---------|---------|------|--------|
|
|
||||||
| `success` | Request succeeded, no fallback | Yes | Yes |
|
|
||||||
| `fallback` | Gateway unavailable, used Ollama | Yes | - |
|
|
||||||
| `timeout` | Request exceeded timeout | Yes | - |
|
|
||||||
| `error` | Request failed with error | Yes | Yes |
|
|
||||||
| `user_rejected` | Client explicitly rejected response | - | Yes |
|
|
||||||
|
|
||||||
## Cost Attribution
|
|
||||||
|
|
||||||
Monthly cost per agent (token-based):
|
|
||||||
|
|
||||||
```
|
|
||||||
Cost = (tokens_in + tokens_out) × model_rate × 0.0001
|
|
||||||
```
|
|
||||||
|
|
||||||
Default rates:
|
|
||||||
- qwen2.5:3b = $0.0001 per 1K tokens
|
|
||||||
- qwen2.5:14b = $0.0001 per 1K tokens
|
|
||||||
- qwen2.5:32b = $0.0001 per 1K tokens
|
|
||||||
|
|
||||||
Configurable via learning engine cost config.
|
|
||||||
|
|
||||||
## Testing
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npm test
|
|
||||||
```
|
|
||||||
|
|
||||||
Tests cover:
|
|
||||||
- Per-agent metric computation
|
|
||||||
- Feedback ingestion and processing
|
|
||||||
- Confidence score calculation
|
|
||||||
- Anomaly detection
|
|
||||||
- Cost attribution
|
|
||||||
- SLO monitoring
|
|
||||||
- Trending analysis
|
|
||||||
|
|
||||||
## Performance
|
|
||||||
|
|
||||||
- Request logging: <1ms per insertion
|
|
||||||
- Feedback processing: <1ms per insertion
|
|
||||||
- Metric computation (24h): 100-500ms per agent
|
|
||||||
- Cost report generation: 500ms-1s for all agents
|
|
||||||
- Anomaly detection: 1-2s per agent
|
|
||||||
|
|
||||||
## Related ADRs
|
|
||||||
- [ADR-0002](../adr/0002-tier-assignment-strategy.md) — Tier assignment (per-agent override)
|
|
||||||
- [ADR-0003](../adr/0003-confidence-gate-thresholds.md) — Confidence gate (per-agent gate)
|
|
||||||
- [ADR-0006](../adr/0006-learning-system-integration.md) — Learning system specification
|
|
||||||
|
|
||||||
## Security Notes
|
|
||||||
|
|
||||||
- Agent IDs are stored plaintext (consider hashing for privacy-sensitive deployments)
|
|
||||||
- User satisfaction scores in metadata (consider encryption at rest)
|
|
||||||
- Cost reports are per-agent (may expose usage patterns)
|
|
||||||
@ -1,37 +0,0 @@
|
|||||||
{
|
|
||||||
"name": "@llm-gateway/learning-integration",
|
|
||||||
"version": "1.0.0",
|
|
||||||
"description": "Per-agent learning metrics and feedback integration for LLM Gateway",
|
|
||||||
"type": "module",
|
|
||||||
"main": "dist/index.js",
|
|
||||||
"exports": {
|
|
||||||
".": "./dist/index.js",
|
|
||||||
"./metrics": "./dist/metrics.js",
|
|
||||||
"./feedback": "./dist/feedback.js"
|
|
||||||
},
|
|
||||||
"scripts": {
|
|
||||||
"build": "tsc",
|
|
||||||
"dev": "tsc --watch",
|
|
||||||
"test": "vitest"
|
|
||||||
},
|
|
||||||
"dependencies": {
|
|
||||||
"@llm-gateway/client": "workspace:*",
|
|
||||||
"@llm-gateway/learning": "workspace:*",
|
|
||||||
"postgres": "^3.0.0"
|
|
||||||
},
|
|
||||||
"devDependencies": {
|
|
||||||
"@types/node": "^20.0.0",
|
|
||||||
"typescript": "^5.0.0",
|
|
||||||
"vitest": "^1.0.0"
|
|
||||||
},
|
|
||||||
"keywords": [
|
|
||||||
"learning",
|
|
||||||
"metrics",
|
|
||||||
"feedback",
|
|
||||||
"per-agent",
|
|
||||||
"llm",
|
|
||||||
"gateway"
|
|
||||||
],
|
|
||||||
"license": "MIT",
|
|
||||||
"author": "Rene Fichtmueller"
|
|
||||||
}
|
|
||||||
@ -1,215 +0,0 @@
|
|||||||
import { Client } from 'postgres'
|
|
||||||
|
|
||||||
export type FeedbackOutcome = 'success' | 'fallback' | 'timeout' | 'error' | 'user_rejected'
|
|
||||||
|
|
||||||
export interface FeedbackRequest {
|
|
||||||
requestId: string
|
|
||||||
agentId: string
|
|
||||||
outcome: FeedbackOutcome
|
|
||||||
completionQuality?: number // 0-10
|
|
||||||
latencyMs?: number
|
|
||||||
tokenCount?: number
|
|
||||||
userSatisfaction?: number // 0-10 from UI
|
|
||||||
metadata?: Record<string, unknown>
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface FeedbackStats {
|
|
||||||
agentId: string
|
|
||||||
totalFeedback: number
|
|
||||||
outcomeBreakdown: Record<FeedbackOutcome, number>
|
|
||||||
avgQuality: number
|
|
||||||
avgSatisfaction: number
|
|
||||||
}
|
|
||||||
|
|
||||||
export class FeedbackProcessor {
|
|
||||||
constructor(private db: Client) {}
|
|
||||||
|
|
||||||
async processFeedback(feedback: FeedbackRequest): Promise<void> {
|
|
||||||
const timestamp = new Date()
|
|
||||||
|
|
||||||
await this.db`
|
|
||||||
INSERT INTO agent_feedback (
|
|
||||||
request_id, agent_id, outcome, completion_quality, latency_ms,
|
|
||||||
token_count, metadata, created_at
|
|
||||||
) VALUES (
|
|
||||||
${feedback.requestId},
|
|
||||||
${feedback.agentId},
|
|
||||||
${feedback.outcome},
|
|
||||||
${feedback.completionQuality || null},
|
|
||||||
${feedback.latencyMs || null},
|
|
||||||
${feedback.tokenCount || null},
|
|
||||||
${JSON.stringify({
|
|
||||||
userSatisfaction: feedback.userSatisfaction,
|
|
||||||
...feedback.metadata
|
|
||||||
})},
|
|
||||||
${timestamp}
|
|
||||||
)
|
|
||||||
`
|
|
||||||
}
|
|
||||||
|
|
||||||
async getFeedbackStats(
|
|
||||||
agentId: string,
|
|
||||||
hours: number = 24
|
|
||||||
): Promise<FeedbackStats> {
|
|
||||||
const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT
|
|
||||||
outcome,
|
|
||||||
COUNT(*) as count,
|
|
||||||
AVG(completion_quality) as avg_quality,
|
|
||||||
AVG((metadata->>'userSatisfaction')::int) as avg_satisfaction
|
|
||||||
FROM agent_feedback
|
|
||||||
WHERE agent_id = ${agentId} AND created_at > ${cutoff}
|
|
||||||
GROUP BY outcome
|
|
||||||
`
|
|
||||||
|
|
||||||
const outcomeBreakdown: Record<FeedbackOutcome, number> = {
|
|
||||||
success: 0,
|
|
||||||
fallback: 0,
|
|
||||||
timeout: 0,
|
|
||||||
error: 0,
|
|
||||||
user_rejected: 0
|
|
||||||
}
|
|
||||||
|
|
||||||
let totalFeedback = 0
|
|
||||||
let totalQuality = 0
|
|
||||||
let qualityCount = 0
|
|
||||||
let totalSatisfaction = 0
|
|
||||||
let satisfactionCount = 0
|
|
||||||
|
|
||||||
for (const row of results as any[]) {
|
|
||||||
const outcome = row.outcome as FeedbackOutcome
|
|
||||||
const count = Number(row.count)
|
|
||||||
outcomeBreakdown[outcome] = count
|
|
||||||
totalFeedback += count
|
|
||||||
|
|
||||||
if (row.avg_quality) {
|
|
||||||
totalQuality += Number(row.avg_quality)
|
|
||||||
qualityCount++
|
|
||||||
}
|
|
||||||
if (row.avg_satisfaction) {
|
|
||||||
totalSatisfaction += Number(row.avg_satisfaction)
|
|
||||||
satisfactionCount++
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return {
|
|
||||||
agentId,
|
|
||||||
totalFeedback,
|
|
||||||
outcomeBreakdown,
|
|
||||||
avgQuality: qualityCount > 0 ? totalQuality / qualityCount : 0,
|
|
||||||
avgSatisfaction: satisfactionCount > 0 ? totalSatisfaction / satisfactionCount : 0
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
async getOutcomeDistribution(
|
|
||||||
agentId: string,
|
|
||||||
hours: number = 24
|
|
||||||
): Promise<{ outcome: FeedbackOutcome; percentage: number }[]> {
|
|
||||||
const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT outcome, COUNT(*) as count
|
|
||||||
FROM agent_feedback
|
|
||||||
WHERE agent_id = ${agentId} AND created_at > ${cutoff}
|
|
||||||
GROUP BY outcome
|
|
||||||
`
|
|
||||||
|
|
||||||
const total = results.reduce((sum, row: any) => sum + Number(row.count), 0)
|
|
||||||
if (total === 0) return []
|
|
||||||
|
|
||||||
return results.map((row: any) => ({
|
|
||||||
outcome: row.outcome as FeedbackOutcome,
|
|
||||||
percentage: (Number(row.count) / total) * 100
|
|
||||||
}))
|
|
||||||
}
|
|
||||||
|
|
||||||
async identifySuccessfulModels(agentId: string): Promise<string[]> {
|
|
||||||
const cutoff = new Date(Date.now() - 24 * 60 * 60 * 1000)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT DISTINCT log.model
|
|
||||||
FROM agent_request_log log
|
|
||||||
INNER JOIN agent_feedback fb ON log.request_id = fb.request_id
|
|
||||||
WHERE log.agent_id = ${agentId}
|
|
||||||
AND log.created_at > ${cutoff}
|
|
||||||
AND fb.outcome = 'success'
|
|
||||||
ORDER BY log.model
|
|
||||||
`
|
|
||||||
|
|
||||||
return results.map((row: any) => row.model)
|
|
||||||
}
|
|
||||||
|
|
||||||
async computeConfidenceScore(
|
|
||||||
agentId: string,
|
|
||||||
model: string
|
|
||||||
): Promise<number> {
|
|
||||||
const day = new Date()
|
|
||||||
day.setDate(day.getDate() - 1)
|
|
||||||
|
|
||||||
const feedback = await this.db`
|
|
||||||
SELECT
|
|
||||||
COUNT(CASE WHEN outcome = 'success' THEN 1 END)::float / COUNT(*) as success_rate,
|
|
||||||
AVG(completion_quality) as avg_quality,
|
|
||||||
AVG((metadata->>'userSatisfaction')::int) as avg_satisfaction
|
|
||||||
FROM agent_feedback fb
|
|
||||||
INNER JOIN agent_request_log log ON fb.request_id = log.request_id
|
|
||||||
WHERE fb.agent_id = ${agentId}
|
|
||||||
AND log.model = ${model}
|
|
||||||
AND fb.created_at > ${day}
|
|
||||||
`
|
|
||||||
|
|
||||||
if (feedback.length === 0) return 0.5 // Default neutral confidence
|
|
||||||
|
|
||||||
const row = feedback[0] as any
|
|
||||||
const successRate = Number(row.success_rate || 0)
|
|
||||||
const avgQuality = Number(row.avg_quality || 5) / 10 // Normalize to 0-1
|
|
||||||
const avgSatisfaction = Number(row.avg_satisfaction || 5) / 10 // Normalize to 0-1
|
|
||||||
|
|
||||||
// Weighted average: 50% success, 25% quality, 25% satisfaction
|
|
||||||
const score = successRate * 0.5 + avgQuality * 0.25 + avgSatisfaction * 0.25
|
|
||||||
return Math.min(1, Math.max(0, score))
|
|
||||||
}
|
|
||||||
|
|
||||||
async shouldUpdateConfidence(
|
|
||||||
agentId: string,
|
|
||||||
model: string,
|
|
||||||
newScore: number,
|
|
||||||
threshold: number = 0.1
|
|
||||||
): Promise<boolean> {
|
|
||||||
// Get current confidence
|
|
||||||
const current = await this.db`
|
|
||||||
SELECT score FROM agent_confidence_scores
|
|
||||||
WHERE agent_id = ${agentId} AND model = ${model}
|
|
||||||
`
|
|
||||||
|
|
||||||
if (current.length === 0) return true // Always update if no history
|
|
||||||
|
|
||||||
const currentScore = Number(current[0].score)
|
|
||||||
return Math.abs(newScore - currentScore) > threshold
|
|
||||||
}
|
|
||||||
|
|
||||||
async processUserFeedback(
|
|
||||||
requestId: string,
|
|
||||||
agentId: string,
|
|
||||||
satisfaction: number // 0-10
|
|
||||||
): Promise<void> {
|
|
||||||
const metadata = { userSatisfaction: satisfaction }
|
|
||||||
|
|
||||||
await this.db`
|
|
||||||
INSERT INTO agent_feedback (
|
|
||||||
request_id, agent_id, outcome, metadata
|
|
||||||
) VALUES (
|
|
||||||
${requestId},
|
|
||||||
${agentId},
|
|
||||||
'success',
|
|
||||||
${JSON.stringify(metadata)}
|
|
||||||
)
|
|
||||||
ON CONFLICT (request_id) DO UPDATE SET
|
|
||||||
metadata = ${JSON.stringify(metadata)}
|
|
||||||
`
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
export default FeedbackProcessor
|
|
||||||
@ -1,267 +0,0 @@
|
|||||||
import { sql } from 'postgres'
|
|
||||||
import { Client } from 'postgres'
|
|
||||||
|
|
||||||
export interface AgentMetrics {
|
|
||||||
agentId: string
|
|
||||||
model: string
|
|
||||||
requestCount: number
|
|
||||||
successRate: number
|
|
||||||
avgLatencyMs: number
|
|
||||||
totalTokens: number
|
|
||||||
costUsd: number
|
|
||||||
confidence: number
|
|
||||||
updatedAt: Date
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface RequestLog {
|
|
||||||
requestId: string
|
|
||||||
agentId: string
|
|
||||||
model: string
|
|
||||||
latencyMs: number
|
|
||||||
tokensIn: number
|
|
||||||
tokensOut: number
|
|
||||||
confidence: number
|
|
||||||
fallbackUsed: boolean
|
|
||||||
success: boolean
|
|
||||||
timestamp: Date
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface AgentFeedback {
|
|
||||||
requestId: string
|
|
||||||
agentId: string
|
|
||||||
outcome: 'success' | 'fallback' | 'timeout' | 'error' | 'user_rejected'
|
|
||||||
completionQuality?: number
|
|
||||||
latencyMs?: number
|
|
||||||
tokenCount?: number
|
|
||||||
metadata?: Record<string, unknown>
|
|
||||||
timestamp: Date
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface PerAgentConfidence {
|
|
||||||
agentId: string
|
|
||||||
model: string
|
|
||||||
score: number
|
|
||||||
sampleSize: number
|
|
||||||
lastUpdated: Date
|
|
||||||
trend: 'improving' | 'stable' | 'degrading'
|
|
||||||
}
|
|
||||||
|
|
||||||
export class LearningIntegration {
|
|
||||||
private db: Client
|
|
||||||
|
|
||||||
constructor(dbConnection: Client) {
|
|
||||||
this.db = dbConnection
|
|
||||||
}
|
|
||||||
|
|
||||||
async initializeTables(): Promise<void> {
|
|
||||||
await this.db`
|
|
||||||
CREATE TABLE IF NOT EXISTS agent_request_log (
|
|
||||||
request_id UUID PRIMARY KEY,
|
|
||||||
agent_id VARCHAR(64) NOT NULL,
|
|
||||||
model VARCHAR(128) NOT NULL,
|
|
||||||
latency_ms INTEGER NOT NULL,
|
|
||||||
tokens_in INTEGER NOT NULL,
|
|
||||||
tokens_out INTEGER NOT NULL,
|
|
||||||
confidence DECIMAL(3, 2) NOT NULL,
|
|
||||||
fallback_used BOOLEAN NOT NULL DEFAULT FALSE,
|
|
||||||
success BOOLEAN NOT NULL DEFAULT TRUE,
|
|
||||||
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
|
|
||||||
INDEX idx_agent_model (agent_id, model),
|
|
||||||
INDEX idx_created (created_at)
|
|
||||||
)
|
|
||||||
`
|
|
||||||
|
|
||||||
await this.db`
|
|
||||||
CREATE TABLE IF NOT EXISTS agent_feedback (
|
|
||||||
id SERIAL PRIMARY KEY,
|
|
||||||
request_id UUID NOT NULL,
|
|
||||||
agent_id VARCHAR(64) NOT NULL,
|
|
||||||
outcome VARCHAR(32) NOT NULL,
|
|
||||||
completion_quality SMALLINT,
|
|
||||||
latency_ms INTEGER,
|
|
||||||
token_count INTEGER,
|
|
||||||
metadata JSONB,
|
|
||||||
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
|
|
||||||
FOREIGN KEY (request_id) REFERENCES agent_request_log (request_id),
|
|
||||||
INDEX idx_agent_outcome (agent_id, outcome),
|
|
||||||
INDEX idx_created (created_at)
|
|
||||||
)
|
|
||||||
`
|
|
||||||
|
|
||||||
await this.db`
|
|
||||||
CREATE TABLE IF NOT EXISTS agent_confidence_scores (
|
|
||||||
id SERIAL PRIMARY KEY,
|
|
||||||
agent_id VARCHAR(64) NOT NULL,
|
|
||||||
model VARCHAR(128) NOT NULL,
|
|
||||||
score DECIMAL(3, 2) NOT NULL,
|
|
||||||
sample_size INTEGER NOT NULL DEFAULT 0,
|
|
||||||
trend VARCHAR(16) NOT NULL DEFAULT 'stable',
|
|
||||||
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
|
|
||||||
UNIQUE (agent_id, model),
|
|
||||||
INDEX idx_agent (agent_id)
|
|
||||||
)
|
|
||||||
`
|
|
||||||
}
|
|
||||||
|
|
||||||
async logRequest(log: Omit<RequestLog, 'timestamp'>): Promise<void> {
|
|
||||||
await this.db`
|
|
||||||
INSERT INTO agent_request_log (
|
|
||||||
request_id, agent_id, model, latency_ms, tokens_in, tokens_out,
|
|
||||||
confidence, fallback_used, success
|
|
||||||
) VALUES (
|
|
||||||
${log.requestId}, ${log.agentId}, ${log.model}, ${log.latencyMs},
|
|
||||||
${log.tokensIn}, ${log.tokensOut}, ${log.confidence},
|
|
||||||
${log.fallbackUsed}, ${log.success}
|
|
||||||
)
|
|
||||||
`
|
|
||||||
}
|
|
||||||
|
|
||||||
async recordFeedback(feedback: Omit<AgentFeedback, 'timestamp'>): Promise<void> {
|
|
||||||
await this.db`
|
|
||||||
INSERT INTO agent_feedback (
|
|
||||||
request_id, agent_id, outcome, completion_quality, latency_ms,
|
|
||||||
token_count, metadata
|
|
||||||
) VALUES (
|
|
||||||
${feedback.requestId}, ${feedback.agentId}, ${feedback.outcome},
|
|
||||||
${feedback.completionQuality || null}, ${feedback.latencyMs || null},
|
|
||||||
${feedback.tokenCount || null}, ${JSON.stringify(feedback.metadata || {})}
|
|
||||||
)
|
|
||||||
`
|
|
||||||
}
|
|
||||||
|
|
||||||
async getAgentMetrics(agentId: string, hours: number = 24): Promise<AgentMetrics[]> {
|
|
||||||
const cutoffTime = new Date(Date.now() - hours * 60 * 60 * 1000)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT
|
|
||||||
agent_id,
|
|
||||||
model,
|
|
||||||
COUNT(*) as request_count,
|
|
||||||
COUNT(CASE WHEN success = true THEN 1 END)::float / COUNT(*) as success_rate,
|
|
||||||
AVG(latency_ms)::float as avg_latency_ms,
|
|
||||||
SUM(tokens_in + tokens_out) as total_tokens,
|
|
||||||
SUM(tokens_in + tokens_out) * 0.0001 as cost_usd,
|
|
||||||
AVG(confidence)::float as confidence,
|
|
||||||
MAX(created_at) as updated_at
|
|
||||||
FROM agent_request_log
|
|
||||||
WHERE agent_id = ${agentId} AND created_at > ${cutoffTime}
|
|
||||||
GROUP BY agent_id, model
|
|
||||||
ORDER BY request_count DESC
|
|
||||||
`
|
|
||||||
|
|
||||||
return results.map((row: any) => ({
|
|
||||||
agentId: row.agent_id,
|
|
||||||
model: row.model,
|
|
||||||
requestCount: Number(row.request_count),
|
|
||||||
successRate: Number(row.success_rate),
|
|
||||||
avgLatencyMs: Number(row.avg_latency_ms),
|
|
||||||
totalTokens: Number(row.total_tokens),
|
|
||||||
costUsd: Number(row.cost_usd),
|
|
||||||
confidence: Number(row.confidence),
|
|
||||||
updatedAt: new Date(row.updated_at)
|
|
||||||
}))
|
|
||||||
}
|
|
||||||
|
|
||||||
async updateAgentConfidence(
|
|
||||||
agentId: string,
|
|
||||||
model: string,
|
|
||||||
newScore: number
|
|
||||||
): Promise<void> {
|
|
||||||
await this.db`
|
|
||||||
INSERT INTO agent_confidence_scores (agent_id, model, score, sample_size)
|
|
||||||
VALUES (${agentId}, ${model}, ${newScore}, 1)
|
|
||||||
ON CONFLICT (agent_id, model)
|
|
||||||
DO UPDATE SET
|
|
||||||
score = ${newScore},
|
|
||||||
sample_size = agent_confidence_scores.sample_size + 1,
|
|
||||||
updated_at = NOW()
|
|
||||||
`
|
|
||||||
}
|
|
||||||
|
|
||||||
async getAgentConfidence(agentId: string, model: string): Promise<PerAgentConfidence | null> {
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT * FROM agent_confidence_scores
|
|
||||||
WHERE agent_id = ${agentId} AND model = ${model}
|
|
||||||
`
|
|
||||||
|
|
||||||
if (results.length === 0) return null
|
|
||||||
|
|
||||||
const row = results[0] as any
|
|
||||||
return {
|
|
||||||
agentId: row.agent_id,
|
|
||||||
model: row.model,
|
|
||||||
score: Number(row.score),
|
|
||||||
sampleSize: Number(row.sample_size),
|
|
||||||
lastUpdated: new Date(row.updated_at),
|
|
||||||
trend: row.trend as 'improving' | 'stable' | 'degrading'
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
async computePerAgentMetrics(agentId: string): Promise<AgentMetrics[]> {
|
|
||||||
// Compute metrics for past 24 hours
|
|
||||||
return this.getAgentMetrics(agentId, 24)
|
|
||||||
}
|
|
||||||
|
|
||||||
async getAgentCosts(days: number = 30): Promise<Map<string, number>> {
|
|
||||||
const cutoffTime = new Date(Date.now() - days * 24 * 60 * 60 * 1000)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT
|
|
||||||
agent_id,
|
|
||||||
SUM(tokens_in + tokens_out) * 0.0001 as cost_usd
|
|
||||||
FROM agent_request_log
|
|
||||||
WHERE created_at > ${cutoffTime}
|
|
||||||
GROUP BY agent_id
|
|
||||||
ORDER BY cost_usd DESC
|
|
||||||
`
|
|
||||||
|
|
||||||
const costs = new Map<string, number>()
|
|
||||||
for (const row of results as any[]) {
|
|
||||||
costs.set(row.agent_id, Number(row.cost_usd))
|
|
||||||
}
|
|
||||||
return costs
|
|
||||||
}
|
|
||||||
|
|
||||||
async detectAnomalies(
|
|
||||||
agentId: string,
|
|
||||||
threshold: number = 2
|
|
||||||
): Promise<{ model: string; issue: string }[]> {
|
|
||||||
const metrics = await this.getAgentMetrics(agentId, 24)
|
|
||||||
const baseline = await this.getAgentMetrics(agentId, 24 * 30) // 30-day baseline
|
|
||||||
|
|
||||||
const anomalies: { model: string; issue: string }[] = []
|
|
||||||
|
|
||||||
for (const current of metrics) {
|
|
||||||
const baselineMetric = baseline.find(m => m.model === current.model)
|
|
||||||
if (!baselineMetric) continue
|
|
||||||
|
|
||||||
// Check latency spike
|
|
||||||
if (current.avgLatencyMs > baselineMetric.avgLatencyMs * (1 + threshold * 0.1)) {
|
|
||||||
anomalies.push({
|
|
||||||
model: current.model,
|
|
||||||
issue: `Latency spike: ${current.avgLatencyMs}ms (baseline: ${baselineMetric.avgLatencyMs}ms)`
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
// Check success rate drop
|
|
||||||
if (current.successRate < baselineMetric.successRate * (1 - threshold * 0.1)) {
|
|
||||||
anomalies.push({
|
|
||||||
model: current.model,
|
|
||||||
issue: `Success rate drop: ${(current.successRate * 100).toFixed(1)}% (baseline: ${(baselineMetric.successRate * 100).toFixed(1)}%)`
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
// Check confidence drop
|
|
||||||
if (current.confidence < baselineMetric.confidence * 0.8) {
|
|
||||||
anomalies.push({
|
|
||||||
model: current.model,
|
|
||||||
issue: `Confidence degradation: ${current.confidence.toFixed(2)} (baseline: ${baselineMetric.confidence.toFixed(2)})`
|
|
||||||
})
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
return anomalies
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
export default LearningIntegration
|
|
||||||
@ -1,248 +0,0 @@
|
|||||||
import { Client } from 'postgres'
|
|
||||||
import { sql } from 'postgres'
|
|
||||||
|
|
||||||
export interface DailyAgentCost {
|
|
||||||
date: string
|
|
||||||
agentId: string
|
|
||||||
tokensIn: number
|
|
||||||
tokensOut: number
|
|
||||||
costUsd: number
|
|
||||||
}
|
|
||||||
|
|
||||||
export interface LatencySLO {
|
|
||||||
agentId: string
|
|
||||||
targetMs: number
|
|
||||||
p50: number
|
|
||||||
p95: number
|
|
||||||
p99: number
|
|
||||||
breached: boolean
|
|
||||||
}
|
|
||||||
|
|
||||||
export class PerAgentMetrics {
|
|
||||||
constructor(private db: Client) {}
|
|
||||||
|
|
||||||
async generateDailyCostReport(date: string): Promise<DailyAgentCost[]> {
|
|
||||||
const startOfDay = new Date(date)
|
|
||||||
startOfDay.setHours(0, 0, 0, 0)
|
|
||||||
const endOfDay = new Date(date)
|
|
||||||
endOfDay.setHours(23, 59, 59, 999)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT
|
|
||||||
agent_id,
|
|
||||||
SUM(tokens_in) as tokens_in,
|
|
||||||
SUM(tokens_out) as tokens_out,
|
|
||||||
(SUM(tokens_in) + SUM(tokens_out)) * 0.0001 as cost_usd
|
|
||||||
FROM agent_request_log
|
|
||||||
WHERE created_at >= ${startOfDay} AND created_at < ${endOfDay}
|
|
||||||
GROUP BY agent_id
|
|
||||||
ORDER BY cost_usd DESC
|
|
||||||
`
|
|
||||||
|
|
||||||
return results.map((row: any) => ({
|
|
||||||
date,
|
|
||||||
agentId: row.agent_id,
|
|
||||||
tokensIn: Number(row.tokens_in),
|
|
||||||
tokensOut: Number(row.tokens_out),
|
|
||||||
costUsd: Number(row.cost_usd)
|
|
||||||
}))
|
|
||||||
}
|
|
||||||
|
|
||||||
async checkLatencySLO(
|
|
||||||
agentId: string,
|
|
||||||
targetMs: number = 500
|
|
||||||
): Promise<LatencySLO> {
|
|
||||||
const hour = new Date()
|
|
||||||
hour.setHours(hour.getHours() - 1)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT
|
|
||||||
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY latency_ms) as p50,
|
|
||||||
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY latency_ms) as p95,
|
|
||||||
PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY latency_ms) as p99
|
|
||||||
FROM agent_request_log
|
|
||||||
WHERE agent_id = ${agentId} AND created_at > ${hour}
|
|
||||||
`
|
|
||||||
|
|
||||||
if (results.length === 0) {
|
|
||||||
return {
|
|
||||||
agentId,
|
|
||||||
targetMs,
|
|
||||||
p50: 0,
|
|
||||||
p95: 0,
|
|
||||||
p99: 0,
|
|
||||||
breached: false
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
const row = results[0] as any
|
|
||||||
const p99 = Number(row.p99 || 0)
|
|
||||||
const breached = p99 > targetMs
|
|
||||||
|
|
||||||
return {
|
|
||||||
agentId,
|
|
||||||
targetMs,
|
|
||||||
p50: Number(row.p50 || 0),
|
|
||||||
p95: Number(row.p95 || 0),
|
|
||||||
p99,
|
|
||||||
breached
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
async getAgentSuccessRate(
|
|
||||||
agentId: string,
|
|
||||||
hours: number = 24
|
|
||||||
): Promise<{ total: number; successful: number; rate: number }> {
|
|
||||||
const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT
|
|
||||||
COUNT(*) as total,
|
|
||||||
COUNT(CASE WHEN success = true THEN 1 END) as successful
|
|
||||||
FROM agent_request_log
|
|
||||||
WHERE agent_id = ${agentId} AND created_at > ${cutoff}
|
|
||||||
`
|
|
||||||
|
|
||||||
if (results.length === 0) {
|
|
||||||
return { total: 0, successful: 0, rate: 0 }
|
|
||||||
}
|
|
||||||
|
|
||||||
const row = results[0] as any
|
|
||||||
const total = Number(row.total)
|
|
||||||
const successful = Number(row.successful)
|
|
||||||
|
|
||||||
return {
|
|
||||||
total,
|
|
||||||
successful,
|
|
||||||
rate: total > 0 ? successful / total : 0
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
async getFallbackRate(
|
|
||||||
agentId: string,
|
|
||||||
hours: number = 24
|
|
||||||
): Promise<{ fallbacks: number; total: number; rate: number }> {
|
|
||||||
const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT
|
|
||||||
COUNT(*) as total,
|
|
||||||
COUNT(CASE WHEN fallback_used = true THEN 1 END) as fallbacks
|
|
||||||
FROM agent_request_log
|
|
||||||
WHERE agent_id = ${agentId} AND created_at > ${cutoff}
|
|
||||||
`
|
|
||||||
|
|
||||||
if (results.length === 0) {
|
|
||||||
return { fallbacks: 0, total: 0, rate: 0 }
|
|
||||||
}
|
|
||||||
|
|
||||||
const row = results[0] as any
|
|
||||||
const total = Number(row.total)
|
|
||||||
const fallbacks = Number(row.fallbacks)
|
|
||||||
|
|
||||||
return {
|
|
||||||
fallbacks,
|
|
||||||
total,
|
|
||||||
rate: total > 0 ? fallbacks / total : 0
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
async getModelPerformance(
|
|
||||||
agentId: string,
|
|
||||||
model: string,
|
|
||||||
hours: number = 24
|
|
||||||
): Promise<{
|
|
||||||
model: string
|
|
||||||
requests: number
|
|
||||||
avgLatency: number
|
|
||||||
successRate: number
|
|
||||||
confidence: number
|
|
||||||
}> {
|
|
||||||
const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT
|
|
||||||
COUNT(*) as requests,
|
|
||||||
AVG(latency_ms) as avg_latency,
|
|
||||||
COUNT(CASE WHEN success = true THEN 1 END)::float / COUNT(*) as success_rate,
|
|
||||||
AVG(confidence) as confidence
|
|
||||||
FROM agent_request_log
|
|
||||||
WHERE agent_id = ${agentId} AND model = ${model} AND created_at > ${cutoff}
|
|
||||||
`
|
|
||||||
|
|
||||||
if (results.length === 0) {
|
|
||||||
return { model, requests: 0, avgLatency: 0, successRate: 0, confidence: 0 }
|
|
||||||
}
|
|
||||||
|
|
||||||
const row = results[0] as any
|
|
||||||
return {
|
|
||||||
model,
|
|
||||||
requests: Number(row.requests),
|
|
||||||
avgLatency: Number(row.avg_latency || 0),
|
|
||||||
successRate: Number(row.success_rate || 0),
|
|
||||||
confidence: Number(row.confidence || 0)
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
async compareAgentPerformance(): Promise<
|
|
||||||
{ agentId: string; requests: number; avgLatency: number; costUsd: number }[]
|
|
||||||
> {
|
|
||||||
const day = new Date()
|
|
||||||
day.setDate(day.getDate() - 1)
|
|
||||||
const startOfDay = new Date(day)
|
|
||||||
startOfDay.setHours(0, 0, 0, 0)
|
|
||||||
|
|
||||||
const results = await this.db`
|
|
||||||
SELECT
|
|
||||||
agent_id,
|
|
||||||
COUNT(*) as requests,
|
|
||||||
AVG(latency_ms) as avg_latency,
|
|
||||||
(SUM(tokens_in) + SUM(tokens_out)) * 0.0001 as cost_usd
|
|
||||||
FROM agent_request_log
|
|
||||||
WHERE created_at >= ${startOfDay}
|
|
||||||
GROUP BY agent_id
|
|
||||||
ORDER BY requests DESC
|
|
||||||
`
|
|
||||||
|
|
||||||
return results.map((row: any) => ({
|
|
||||||
agentId: row.agent_id,
|
|
||||||
requests: Number(row.requests),
|
|
||||||
avgLatency: Number(row.avg_latency || 0),
|
|
||||||
costUsd: Number(row.cost_usd || 0)
|
|
||||||
}))
|
|
||||||
}
|
|
||||||
|
|
||||||
async getAgentTrending(
|
|
||||||
agentId: string,
|
|
||||||
model: string
|
|
||||||
): Promise<{ trend: 'improving' | 'stable' | 'degrading'; signal: number }> {
|
|
||||||
// Compare success rate: last 24h vs previous 24h
|
|
||||||
const now24h = new Date(Date.now() - 24 * 60 * 60 * 1000)
|
|
||||||
const now48h = new Date(Date.now() - 48 * 60 * 60 * 1000)
|
|
||||||
|
|
||||||
const recent = await this.db`
|
|
||||||
SELECT COUNT(CASE WHEN success = true THEN 1 END)::float / COUNT(*) as rate
|
|
||||||
FROM agent_request_log
|
|
||||||
WHERE agent_id = ${agentId} AND model = ${model} AND created_at > ${now24h}
|
|
||||||
`
|
|
||||||
|
|
||||||
const previous = await this.db`
|
|
||||||
SELECT COUNT(CASE WHEN success = true THEN 1 END)::float / COUNT(*) as rate
|
|
||||||
FROM agent_request_log
|
|
||||||
WHERE agent_id = ${agentId} AND model = ${model}
|
|
||||||
AND created_at > ${now48h} AND created_at <= ${now24h}
|
|
||||||
`
|
|
||||||
|
|
||||||
const recentRate = recent.length > 0 ? Number(recent[0].rate || 0) : 0
|
|
||||||
const previousRate = previous.length > 0 ? Number(previous[0].rate || 0) : 0
|
|
||||||
const signal = recentRate - previousRate
|
|
||||||
|
|
||||||
let trend: 'improving' | 'stable' | 'degrading' = 'stable'
|
|
||||||
if (signal > 0.05) trend = 'improving'
|
|
||||||
if (signal < -0.05) trend = 'degrading'
|
|
||||||
|
|
||||||
return { trend, signal }
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
export default PerAgentMetrics
|
|
||||||
@ -1,12 +0,0 @@
|
|||||||
{
|
|
||||||
"extends": "../../tsconfig.json",
|
|
||||||
"compilerOptions": {
|
|
||||||
"outDir": "./dist",
|
|
||||||
"rootDir": "./src",
|
|
||||||
"declaration": true,
|
|
||||||
"declarationMap": true,
|
|
||||||
"sourceMap": true
|
|
||||||
},
|
|
||||||
"include": ["src/**/*"],
|
|
||||||
"exclude": ["node_modules", "dist", "**/*.test.ts"]
|
|
||||||
}
|
|
||||||
Loading…
x
Reference in New Issue
Block a user