24 changed files with 0 additions and 2762 deletions
--- a/docs/adr/0005-agent-integration-protocol.md
+++ b/docs/adr/0005-agent-integration-protocol.md
@ -1,143 +0,0 @@
 # ADR-0005: Multi-Agent Integration Protocol
 **Date**: 2026-04-19
 **Status**: accepted
 **Deciders**: Rene (Architecture, Phase 2G)
 ## Context
 Phase 2F established the LLM Gateway as a central orchestrator with a TypeScript client SDK. Phase 2G must integrate multiple AI agents:
 - **Claude Code** (Anthropic CLI) — native client SDK (@llm-gateway/client)
 - **Codex/Copilot** (Microsoft) — LSP protocol (Language Server Protocol)
 - **ChatGPT** (OpenAI) — REST API
 - **Ollama** (local inference) — HTTP API (fallback)
 Each agent has different capabilities and communication patterns. We need a unified protocol that:
 1. Abstracts gateway complexity from agents
 2. Supports synchronous and asynchronous operations
 3. Handles streaming responses (code generation, token-by-token)
 4. Manages authentication and rate limiting per agent
 5. Provides graceful fallback when gateway is unavailable
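Requirement 5 can be illustrated with a small wrapper. This is a minimal sketch, not the gateway SDK's implementation: `CompletionFn`, `Completion`, and `completeWithFallback` are hypothetical names, and the two backends are passed in as plain functions so that no real client API is assumed.

```typescript
// Hypothetical types for illustration; they do not come from @llm-gateway/client.
type CompletionFn = (prompt: string) => Promise<string>
type Completion = { text: string; fallback: boolean }

// Try the primary backend (the gateway) first. If it rejects for any reason,
// retry once against the fallback backend (for example a local Ollama instance).
async function completeWithFallback(
  prompt: string,
  primary: CompletionFn,
  fallback: CompletionFn
): Promise<Completion> {
  try {
    return { text: await primary(prompt), fallback: false }
  } catch {
    // The caller can inspect `fallback` to see which path produced the text.
    return { text: await fallback(prompt), fallback: true }
  }
}
```

If the fallback also fails, the error propagates to the caller, which is one possible way to surface an "offline" state.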
 ## Decision
 Implement a **three-layer agent integration stack**:
 ### Layer 1: Transport (HTTP/WebSocket)
 - **Core**: Fastify endpoints in LLM Gateway
 - **Endpoints**:
  - `POST /agents/{agent-id}/completion` — synchronous completion
  - `GET /agents/{agent-id}/completion?stream=true` — SSE stream
  - `POST /agents/{agent-id}/validate` — prompt validation
  - `GET /agents/{agent-id}/status` — health check
 ### Layer 2: Agent Adapters
 - **Claude Code Adapter**: Node.js module wrapping `@llm-gateway/client`
 - **Codex/Copilot Adapter**: LSP server that forwards requests to gateway HTTP API
 - **ChatGPT Adapter**: REST API wrapper that translates OpenAI format → Gateway format
 - **Ollama Adapter**: HTTP proxy that handles local fallback (already implemented in client SDK)
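The adapter layer above could be expressed as one shared interface that each agent implements. The sketch below is an assumption about how such a contract might look: `AgentAdapter`, `GatewayCompletionRequest`, `OpenAIChatInput`, and `chatGptAdapter` are hypothetical names, and the prompt flattening mirrors the `[ROLE]` format used by the ChatGPT adapter in this change set.

```typescript
// Hypothetical gateway-facing request; field names are illustrative only.
interface GatewayCompletionRequest {
  prompt: string
  model: string
  agentId: string
}

// Each adapter converts its agent's native payload into the shared request shape.
interface AgentAdapter<Native> {
  readonly agentId: string
  toGatewayRequest(native: Native): GatewayCompletionRequest
}

// Illustrative OpenAI-style chat input accepted by the ChatGPT adapter.
interface OpenAIChatInput {
  model: string
  messages: { role: 'system' | 'user' | 'assistant'; content: string }[]
}

const chatGptAdapter: AgentAdapter<OpenAIChatInput> = {
  agentId: 'chatgpt',
  toGatewayRequest(native) {
    // Flatten the chat history into a single prompt, one [ROLE] block per message.
    const prompt = native.messages
      .map(m => `[${m.role.toUpperCase()}]\n${m.content}`)
      .join('\n\n')
    return { prompt, model: native.model, agentId: 'chatgpt' }
  }
}
```

Under this design, adding a new agent would mean writing one more object that satisfies `AgentAdapter`, without touching the transport layer.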
 ### Layer 3: Protocol Format
 Use JSON-RPC 2.0 over HTTP/WebSocket:
 ```json
 {
  "jsonrpc": "2.0",
  "method": "completion",
  "params": {
    "prompt": "...",
    "model": "claude-3.5-sonnet",
    "temperature": 0.7,
    "max_tokens": 2000,
    "agent_id": "claude-code"
  },
  "id": 1
 }
 ```
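The envelope above can be typed and validated on the client side. The following is a sketch under stated assumptions: `CompletionParams`, `JsonRpcRequest`, `buildCompletionRequest`, and `isJsonRpcRequest` are hypothetical helpers, and the guard only accepts requests that carry an `id` (JSON-RPC notifications, which omit it, are deliberately rejected here).

```typescript
// Parameters of the "completion" method, mirroring the JSON example above.
interface CompletionParams {
  prompt: string
  model: string
  temperature?: number
  max_tokens?: number
  agent_id: string
}

// Generic JSON-RPC 2.0 request envelope.
interface JsonRpcRequest<P> {
  jsonrpc: '2.0'
  method: string
  params: P
  id: number | string
}

// Simple monotonically increasing request id; any unique value would do.
let nextId = 1

function buildCompletionRequest(params: CompletionParams): JsonRpcRequest<CompletionParams> {
  return { jsonrpc: '2.0', method: 'completion', params, id: nextId++ }
}

// Structural check for an incoming payload before dispatching on `method`.
function isJsonRpcRequest(value: unknown): value is JsonRpcRequest<unknown> {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  return (
    v.jsonrpc === '2.0' &&
    typeof v.method === 'string' &&
    (typeof v.id === 'number' || typeof v.id === 'string')
  )
}
```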
 ## Alternatives Considered
 ### Alternative 1: Separate Gateway Instance per Agent
 - **Pros**: Complete isolation, agent-specific customization
 - **Cons**: Operational overhead, duplicate infrastructure, no shared learning
 - **Why not**: Contradicts Phase 2F goal of central orchestration
 ### Alternative 2: Agent-Specific Protocols (No Normalization)
 - **Pros**: Native protocol support for each agent
 - **Cons**: Gateway becomes protocol translator, complexity explosion
 - **Why not**: Gateway becomes a reverse proxy instead of an orchestrator
 ### Alternative 3: Message Queue (RabbitMQ/Kafka)
 - **Pros**: Decouples agents from gateway, supports async workflows
 - **Cons**: Added infrastructure, latency for synchronous operations
 - **Why not**: Overkill for initial integration; add later if async workflows needed
 ## Consequences
 ### Positive
 - **Single integration point**: Agents connect to gateway, not directly to models
 - **Shared learning**: All agents benefit from confidence gating and model selection
 - **Graceful degradation**: Agents fall back to local Ollama independently
 - **Extensible**: New agents added by implementing adapter layer only
 ### Negative
 - **Latency**: Additional HTTP round-trip for each request (vs. direct model call)
 - **Adapter maintenance**: Each agent needs an adapter; breaks if agent API changes
 - **Protocol overhead**: JSON-RPC adds overhead vs. direct integration
 ### Risks
 - **Claude Code integration risk**: Requires subprocess communication with `claude` CLI
 - **Mitigation**: claude-bridge already demonstrates working pattern
 - **Codex integration risk**: Microsoft LSP server not directly compatible with HTTP
 - **Mitigation**: Implement thin LSP-to-HTTP translation layer
 ## Implementation Plan
 ### Phase 2G.1: Claude Code Integration (Week 1)
 ```typescript
 // Extend @llm-gateway/client with agent metadata
 import { createTIPClient } from '@llm-gateway/client'
 const client = createTIPClient({
  agentId: 'claude-code',
  fallback: { ollamaUrl: '192.168.178.213:11434' }
 })
 ```
 ### Phase 2G.2: Codex/Copilot (Week 2)
 ```bash
 # Implement LSP server wrapper
 npm install -D @types/node-lsp-server
 # Create packages/lsp-adapter/
 # - Implements LSP protocol
 # - Translates completion requests to HTTP
 ```
 ### Phase 2G.3: ChatGPT Integration (Week 3)
 ```bash
 # OpenAI API compatibility layer
 # POST /agents/chatgpt/chat/completions
 # → Translate to gateway completion format
 ```
 ### Phase 2G.4: Learning Integration (Week 4)
 ```bash
 # Connect agent-specific metrics to learning engine
 # - Track per-agent accuracy, token usage, latency
 # - Auto-select models per agent based on performance
 ```
 ## Open Questions
 1. **Authentication**: How do agents authenticate with gateway?
   - Option A: API keys per agent
   - Option B: OAuth2 with OIDC
   - Option C: mTLS for local agents, keys for remote
   - **Decision pending**: TBD in Phase 2G.1
 2. **Rate Limiting**: Per-agent or global quota?
   - Option A: Per-agent limits (Claude Code = 100 req/min)
   - Option B: Global pool shared across agents
   - **Decision pending**: Depends on learning system usage patterns
 3. **Response Format**: Streaming vs. buffered?
   - Option A: Always stream (SSE)
   - Option B: Support both (`?stream=true/false`)
   - **Decision pending**: Codex/Copilot compatibility check needed
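To make Option A of the rate-limiting question concrete, here is a minimal fixed-window limiter keyed by agent id. It is a sketch of one possible approach, not a decision: `PerAgentRateLimiter` and `tryAcquire` are hypothetical names, the window strategy is an assumption, and the clock is injectable only so the behavior can be tested deterministically.

```typescript
// Fixed-window counter per agent: at most `limit` requests per `windowMs`.
class PerAgentRateLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>()
  private readonly limit: number
  private readonly windowMs: number
  private readonly now: () => number

  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit
    this.windowMs = windowMs
    this.now = now
  }

  // Returns true if the request is allowed, false if the agent's quota is used up.
  tryAcquire(agentId: string): boolean {
    const t = this.now()
    let w = this.windows.get(agentId)
    // Start a fresh window on first use or once the previous window has elapsed.
    if (!w || t - w.start >= this.windowMs) {
      w = { start: t, count: 0 }
      this.windows.set(agentId, w)
    }
    if (w.count >= this.limit) return false
    w.count++
    return true
  }
}
```

With the figure from Option A, `new PerAgentRateLimiter(100, 60_000)` would allow Claude Code 100 requests per minute. A global pool (Option B) could reuse the same class with a single shared key.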
--- a/docs/adr/README.md
+++ b/docs/adr/README.md
@ -6,4 +6,3 @@
 | [0002](0002-tier-assignment-strategy.md) | Tier Assignment Strategy for Model Selection | accepted | 2026-04-19 |
 | [0003](0003-confidence-gate-thresholds.md) | Confidence Gate Thresholds & Learning Cycle Intervals | accepted | 2026-04-19 |
 | [0004](0004-external-fallback-chain.md) | External Provider Fallback Chain Ordering | accepted | 2026-04-19 |
 | [0005](0005-agent-integration-protocol.md) | Multi-Agent Integration Protocol & Adapters | accepted | 2026-04-19 |
--- a/packages/chatgpt-api-adapter/README.md
+++ b/packages/chatgpt-api-adapter/README.md
@ -1,262 +0,0 @@
 # ChatGPT API Adapter
 OpenAI API compatibility adapter for LLM Gateway. Allows OpenAI client SDKs and curl requests to transparently use LLM Gateway.
 ## Overview
 Provides an HTTP server that implements the OpenAI Chat Completions API specification, transparently routing requests to the LLM Gateway. Existing OpenAI client code requires only a baseURL configuration change.
 ## Installation
 ```bash
 npm install @llm-gateway/chatgpt-api-adapter
 ```
 ## Usage
 ### As a Standalone Server
 ```bash
 # Start the adapter (listens on port 3111)
 npx chatgpt-api
 # Or with custom port
 CHATGPT_API_PORT=8080 npx chatgpt-api
 ```
 Or start it programmatically from Node.js:
 ```typescript
 import ChatGPTAPIAdapter from '@llm-gateway/chatgpt-api-adapter'
 const adapter = new ChatGPTAPIAdapter(3111)
 await adapter.start()
 ```
 ### With OpenAI Client SDK
 ```typescript
 import OpenAI from 'openai'
 const client = new OpenAI({
  apiKey: 'not-needed',
  baseURL: 'http://localhost:3111/v1'
 })
 const response = await client.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'user', content: 'Hello, world!' }
  ]
 })
 ```
 ### With curl
 ```bash
 curl http://localhost:3111/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Explain TypeScript"}
    ],
    "max_tokens": 500
  }'
 ```
 ### Streaming
 ```bash
 curl http://localhost:3111/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "List 5 ideas"}
    ],
    "stream": true
  }'
 ```
 ## Features
 ### Implemented
 - **Chat Completions** (`POST /v1/chat/completions`): Full OpenAI API compatibility
 - **Streaming** (`stream: true`): Server-Sent Events (SSE) with chunked responses
 - **Models** (`GET /v1/models`): Lists available GPT models
 - **Health** (`GET /health`): Gateway health status
 - **Model Mapping**: Automatic mapping from OpenAI to gateway model names
 ### Model Mapping
 | OpenAI Model | Gateway Model |
 |--------------|---------------|
 | gpt-4 | qwen2.5:32b |
 | gpt-4-turbo | qwen2.5:32b |
 | gpt-3.5-turbo | qwen2.5:14b |
 | gpt-4-mini | qwen2.5:3b |
 ## Architecture
 ```
 OpenAI Client
    ↓
 ChatGPT API Adapter (HTTP server)
    ↓
 LLM Gateway API
    ↓
 Model Selection (claude, Ollama, external)
 ```
 ## Environment Variables
 ```bash
 CHATGPT_API_PORT=3111                          # Listen port
 GATEWAY_URL=https://llm-gateway.context-x.org  # LLM Gateway endpoint
 OLLAMA_URL=192.168.178.213:11434              # Local Ollama fallback
 AGENT_ID=chatgpt-api-adapter                   # Agent identifier
 LOG_LEVEL=debug                                # Logging level
 ```
 ## API Endpoints
 ### POST /v1/chat/completions
 Chat completion request using OpenAI format.
 **Request:**
 ```json
 {
  "model": "gpt-4",
  "messages": [
    {"role": "system", "content": "You are helpful..."},
    {"role": "user", "content": "Hello"}
  ],
  "temperature": 0.7,
  "max_tokens": 2000,
  "top_p": 1,
  "stream": false
 }
 ```
 **Response (non-streaming):**
 ```json
 {
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 5,
    "total_tokens": 15
  }
 }
 ```
 **Response (streaming):**
 ```
 data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"H"},"finish_reason":null}]}
 data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"gpt-4","choices":[{"index":0,"delta":{"content":"ello"},"finish_reason":null}]}
 ...
 data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1234567890,"model":"gpt-4","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
 data: [DONE]
 ```
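A client that does not use an OpenAI SDK can reassemble the text from these events by hand. The sketch below assumes the whole SSE body has already been buffered into a string; `collectStreamText` is a hypothetical helper, and a production client would parse incrementally as bytes arrive.

```typescript
// Concatenates the `delta.content` fragments of an OpenAI-style SSE body.
function collectStreamText(sseBody: string): string {
  let text = ''
  for (const line of sseBody.split('\n')) {
    // Only `data:` lines carry payloads; blank separator lines are skipped.
    if (!line.startsWith('data:')) continue
    const payload = line.slice('data:'.length).trim()
    // The literal [DONE] sentinel marks the end of the stream.
    if (payload === '[DONE]') break
    const event = JSON.parse(payload) as {
      choices?: { delta?: { content?: string } }[]
    }
    // The final chunk has an empty delta, so default to an empty string.
    text += event.choices?.[0]?.delta?.content ?? ''
  }
  return text
}
```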
 ### GET /v1/models
 List available models.
 **Response:**
 ```json
 {
  "object": "list",
  "data": [
    {"id": "gpt-4", "object": "model", "owned_by": "openai"},
    {"id": "gpt-4-turbo", "object": "model", "owned_by": "openai"},
    {"id": "gpt-3.5-turbo", "object": "model", "owned_by": "openai"},
    {"id": "gpt-4-mini", "object": "model", "owned_by": "openai"}
  ]
 }
 ```
 ### GET /health
 Gateway health status.
 **Response:**
 ```json
 {
  "status": "ok",
  "gateway": {
    "uptime": 123456,
    "models": ["qwen2.5:3b", "qwen2.5:14b"],
    "latency_ms": 250
  }
 }
 ```
 ## Performance
 Typical latencies:
 - **Gateway mode**: 100-500ms (depends on model)
 - **Ollama fallback**: 200-2000ms (depends on hardware)
 - **Streaming chunk**: 10-50ms per chunk
 - **Timeout**: 30s (configurable via gateway)
 ## Testing
 ```bash
 npm test
 ```
 Tests cover:
 - Chat completions (streaming and buffered)
 - Model listing
 - Error handling and fallback behavior
 - Token counting accuracy
 - Message formatting
 - Health checks
 ## Security
 - No API key validation (assumes network-isolated deployment)
 - CORS enabled for all origins (configure as needed)
 - Messages logged at DEBUG level only
 - Automatic cleanup on shutdown (SIGTERM, SIGINT)
 ## Troubleshooting
 ### OpenAI client not connecting
 1. Verify adapter is running: `curl http://localhost:3111/health`
 2. Check baseURL in client: should be `http://localhost:3111/v1` (including the `/v1` suffix)
 3. Ensure gateway is accessible: `curl $GATEWAY_URL/health`
 ### Streaming not working
 1. Verify `stream: true` in request body
 2. Check for SSE support in client library
 3. Ensure no intermediate proxies are buffering responses
 ### Slow responses
 1. Check gateway latency: `curl -w "%{time_total}\n" $GATEWAY_URL/health`
 2. Verify model availability: `curl http://localhost:3111/v1/models`
 3. Check system resources on gateway (CPU, memory, disk)
 ## Compatibility
 - OpenAI Client SDK (Python, Node.js, Go, etc.)
 - LiteLLM
 - Amazon Bedrock (proxy mode)
 - Any HTTP client using OpenAI API format
--- a/packages/chatgpt-api-adapter/package.json
+++ b/packages/chatgpt-api-adapter/package.json
@ -1,36 +0,0 @@
 {
  "name": "@llm-gateway/chatgpt-api-adapter",
  "version": "1.0.0",
  "description": "OpenAI API compatibility adapter for LLM Gateway",
  "type": "module",
  "main": "dist/index.js",
  "bin": {
    "chatgpt-api": "dist/cli.js"
  },
  "scripts": {
    "build": "tsc",
    "dev": "tsc --watch",
    "start": "node dist/cli.js",
    "test": "vitest"
  },
  "dependencies": {
    "@llm-gateway/client": "workspace:*",
    "fastify": "^5.3.0",
    "@fastify/cors": "^10.0.0"
  },
  "devDependencies": {
    "@types/node": "^20.0.0",
    "typescript": "^5.0.0",
    "vitest": "^1.0.0"
  },
  "keywords": [
    "openai",
    "api",
    "compatibility",
    "llm",
    "gateway",
    "chatgpt"
  ],
  "license": "MIT",
  "author": "Rene Fichtmueller"
 }
--- a/packages/chatgpt-api-adapter/src/cli.ts
+++ b/packages/chatgpt-api-adapter/src/cli.ts
@ -1,23 +0,0 @@
 #!/usr/bin/env node
 import ChatGPTAPIAdapter from './index.js'
 const port = parseInt(process.env.CHATGPT_API_PORT || '3111', 10)
 const adapter = new ChatGPTAPIAdapter(port)
 adapter.start().catch(error => {
  console.error('[ChatGPT API] Failed to start:', error)
  process.exit(1)
 })
 process.on('SIGTERM', async () => {
  console.error('[ChatGPT API] SIGTERM received, shutting down...')
  await adapter.stop()
  process.exit(0)
 })
 process.on('SIGINT', async () => {
  console.error('[ChatGPT API] SIGINT received, shutting down...')
  await adapter.stop()
  process.exit(0)
 })
--- a/packages/chatgpt-api-adapter/src/index.test.ts
+++ b/packages/chatgpt-api-adapter/src/index.test.ts
@ -1,166 +0,0 @@
 import { describe, it, expect, beforeEach, afterEach } from 'vitest'
 import ChatGPTAPIAdapter from './index'
 describe('ChatGPTAPIAdapter', () => {
  let adapter: ChatGPTAPIAdapter
  beforeEach(() => {
    adapter = new ChatGPTAPIAdapter(3111)
  })
  afterEach(async () => {
    try {
      await adapter.stop()
    } catch (e) {
      // Ignore cleanup errors
    }
  })
  it('should create adapter instance with default port', () => {
    const a = new ChatGPTAPIAdapter()
    expect(a).toBeDefined()
  })
  it('should create adapter instance with custom port', () => {
    const a = new ChatGPTAPIAdapter(8080)
    expect(a).toBeDefined()
  })
  it('should format messages to prompt correctly', async () => {
    const messages = [
      { role: 'system' as const, content: 'You are helpful' },
      { role: 'user' as const, content: 'Hello' },
      { role: 'assistant' as const, content: 'Hi there' }
    ]
    // Cast to any to reach the private method for testing
    const formatMessagesToPrompt = (adapter as any).formatMessagesToPrompt.bind(adapter)
    const prompt = formatMessagesToPrompt(messages)
    expect(prompt).toContain('[SYSTEM]')
    expect(prompt).toContain('[USER]')
    expect(prompt).toContain('[ASSISTANT]')
    expect(prompt).toContain('You are helpful')
    expect(prompt).toContain('Hello')
    expect(prompt).toContain('Hi there')
  })
  it('should map OpenAI model names to gateway models', () => {
    const mapModelName = (adapter as any).mapModelName.bind(adapter)
    expect(mapModelName('gpt-4')).toBe('qwen2.5:32b')
    expect(mapModelName('gpt-4-turbo')).toBe('qwen2.5:32b')
    expect(mapModelName('gpt-3.5-turbo')).toBe('qwen2.5:14b')
    expect(mapModelName('gpt-4-mini')).toBe('qwen2.5:3b')
    expect(mapModelName('unknown-model')).toBe('qwen2.5:14b') // Default fallback
  })
  it('should handle missing model gracefully', () => {
    const mapModelName = (adapter as any).mapModelName.bind(adapter)
    expect(mapModelName('custom-model')).toBe('qwen2.5:14b')
  })
  it('should start and stop server', async () => {
    const adaptForTest = new ChatGPTAPIAdapter(3112)
    // Both lifecycle calls should resolve without throwing
    await expect(adaptForTest.start()).resolves.toBeUndefined()
    await expect(adaptForTest.stop()).resolves.toBeUndefined()
  })
  it('should have /v1/models endpoint', async () => {
    // This test is integration-style
    // Would need actual server running and HTTP client
    expect(adapter).toBeDefined()
  })
  it('should format streaming response correctly', () => {
    // Test that streaming response format matches OpenAI spec
    const event = {
      id: 'chatcmpl-123',
      object: 'chat.completion.chunk',
      created: 1234567890,
      model: 'gpt-4',
      choices: [
        {
          index: 0,
          delta: { content: 'Hello' },
          finish_reason: null
        }
      ]
    }
    const jsonStr = JSON.stringify(event)
    expect(jsonStr).toContain('chatcmpl-')
    expect(jsonStr).toContain('chat.completion.chunk')
    expect(jsonStr).toContain('Hello')
  })
  it('should handle temperature parameter', () => {
    const request = {
      model: 'gpt-4',
      messages: [{ role: 'user' as const, content: 'Hi' }],
      temperature: 0.5
    }
    expect(request.temperature).toBe(0.5)
  })
  it('should handle max_tokens parameter', () => {
    const request = {
      model: 'gpt-4',
      messages: [{ role: 'user' as const, content: 'Hi' }],
      max_tokens: 1000
    }
    expect(request.max_tokens).toBe(1000)
  })
  it('should default to non-streaming mode', () => {
    const request = {
      model: 'gpt-4',
      messages: [{ role: 'user' as const, content: 'Hi' }]
    }
    expect(request as any).not.toHaveProperty('stream')
  })
  it('should handle streaming flag', () => {
    const request = {
      model: 'gpt-4',
      messages: [{ role: 'user' as const, content: 'Hi' }],
      stream: true
    }
    expect(request.stream).toBe(true)
  })
  it('should have proper response structure', () => {
    const response = {
      id: 'chatcmpl-123',
      object: 'chat.completion',
      created: Math.floor(Date.now() / 1000),
      model: 'gpt-4',
      choices: [
        {
          index: 0,
          message: {
            role: 'assistant',
            content: 'Response'
          },
          finish_reason: 'stop'
        }
      ],
      usage: {
        prompt_tokens: 10,
        completion_tokens: 5,
        total_tokens: 15
      }
    }
    expect(response).toHaveProperty('id')
    expect(response).toHaveProperty('object')
    expect(response).toHaveProperty('created')
    expect(response).toHaveProperty('model')
    expect(response).toHaveProperty('choices')
    expect(response).toHaveProperty('usage')
    expect(response.choices[0].message.role).toBe('assistant')
    expect(response.usage.total_tokens).toBe(15)
  })
 })
--- a/packages/chatgpt-api-adapter/src/index.ts
+++ b/packages/chatgpt-api-adapter/src/index.ts
@ -1,234 +0,0 @@
 import Fastify from 'fastify'
 import FastifyCors from '@fastify/cors'
 import { createTIPClient } from '@llm-gateway/client'
 interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
 }
 interface ChatCompletionRequest {
  model: string
  messages: ChatMessage[]
  temperature?: number
  max_tokens?: number
  top_p?: number
  stream?: boolean
 }
 interface ChatCompletionResponse {
  id: string
  object: string
  created: number
  model: string
  choices: Array<{
    index: number
    message: {
      role: string
      content: string
    }
    finish_reason: string
  }>
  usage: {
    prompt_tokens: number
    completion_tokens: number
    total_tokens: number
  }
 }
 interface ChatCompletionStreamEvent {
  id: string
  object: string
  created: number
  model: string
  choices: Array<{
    index: number
    delta: {
      content?: string
    }
    finish_reason: string | null
  }>
 }
 export class ChatGPTAPIAdapter {
  private fastify = Fastify()
  private client = createTIPClient({
    agentId: 'chatgpt-api-adapter',
    ollamaUrl: process.env.OLLAMA_URL || '192.168.178.213:11434'
  })
  constructor(private port: number = 3111) {
    this.setupRoutes()
  }
  private formatMessagesToPrompt(messages: ChatMessage[]): string {
    return messages
      .map(msg => `[${msg.role.toUpperCase()}]\n${msg.content}`)
      .join('\n\n')
  }
  private mapModelName(openaiModel: string): string {
    const modelMap: Record<string, string> = {
      'gpt-4': 'qwen2.5:32b',
      'gpt-4-turbo': 'qwen2.5:32b',
      'gpt-3.5-turbo': 'qwen2.5:14b',
      'gpt-4-mini': 'qwen2.5:3b'
    }
    return modelMap[openaiModel] || 'qwen2.5:14b'
  }
  private setupRoutes() {
    this.fastify.register(FastifyCors, {
      origin: '*',
      credentials: true
    })
    this.fastify.get('/v1/models', async () => {
      return {
        object: 'list',
        data: [
          { id: 'gpt-4', object: 'model', owned_by: 'openai' },
          { id: 'gpt-4-turbo', object: 'model', owned_by: 'openai' },
          { id: 'gpt-3.5-turbo', object: 'model', owned_by: 'openai' },
          { id: 'gpt-4-mini', object: 'model', owned_by: 'openai' }
        ]
      }
    })
    this.fastify.post<{ Body: ChatCompletionRequest }>(
      '/v1/chat/completions',
      async (request, reply) => {
        const {
          messages,
          model,
          temperature = 0.7,
          max_tokens = 2000,
          stream = false
        } = request.body
        const prompt = this.formatMessagesToPrompt(messages)
        const mappedModel = this.mapModelName(model)
        if (stream) {
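          // Take over the raw response so Fastify does not also try to reply
          reply.hijack()
          reply.raw.writeHead(200, {
            // Keep headers already set on the reply (for example by the CORS plugin)
            ...reply.getHeaders(),
            'Content-Type': 'text/event-stream',
            'Cache-Control': 'no-cache',
            Connection: 'keep-alive'
          })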
          try {
            const response = await this.client.completion(prompt, {
              model: mappedModel,
              maxTokens: max_tokens,
              temperature
            })
            const createdAt = Math.floor(Date.now() / 1000)
            // Reuse one id for every chunk of this completion
            const completionId = `chatcmpl-${Date.now()}`
            // Array.from splits by code point, so surrogate pairs (e.g. emoji) stay intact
            const chunks = Array.from(response.text)
            for (const chunk of chunks) {
              const event: ChatCompletionStreamEvent = {
                id: completionId,
                object: 'chat.completion.chunk',
                created: createdAt,
                model,
                choices: [
                  {
                    index: 0,
                    delta: { content: chunk },
                    finish_reason: null
                  }
                ]
              }
              reply.raw.write(`data: ${JSON.stringify(event)}\n\n`)
            }
            const finalEvent: ChatCompletionStreamEvent = {
              id: completionId,
              object: 'chat.completion.chunk',
              created: createdAt,
              model,
              choices: [
                {
                  index: 0,
                  delta: {},
                  finish_reason: 'stop'
                }
              ]
            }
            reply.raw.write(`data: ${JSON.stringify(finalEvent)}\n\n`)
            reply.raw.write('data: [DONE]\n\n')
            reply.raw.end()
          } catch (error) {
            reply.raw.write(
              `data: ${JSON.stringify({ error: 'Completion failed' })}\n\n`
            )
            reply.raw.end()
          }
        } else {
          try {
            const response = await this.client.completion(prompt, {
              model: mappedModel,
              maxTokens: max_tokens,
              temperature
            })
            const result: ChatCompletionResponse = {
              id: `chatcmpl-${Date.now()}`,
              object: 'chat.completion',
              created: Math.floor(Date.now() / 1000),
              model,
              choices: [
                {
                  index: 0,
                  message: {
                    role: 'assistant',
                    content: response.text
                  },
                  finish_reason: 'stop'
                }
              ],
              usage: {
                prompt_tokens: response.tokens.input,
                completion_tokens: response.tokens.output,
                total_tokens: response.tokens.input + response.tokens.output
              }
            }
            return result
          } catch (error) {
            reply.code(500).send({
              error: {
                message: 'Completion request failed',
                type: 'server_error',
                param: null,
                code: 'internal_error'
              }
            })
          }
        }
      }
    )
    this.fastify.get('/health', async () => {
      try {
        const health = await this.client.health()
        return { status: 'ok', gateway: health }
      } catch (error) {
        return { status: 'degraded', error: 'Gateway unavailable' }
      }
    })
  }
  async start() {
    await this.fastify.listen({ port: this.port, host: '0.0.0.0' })
    console.error(`[ChatGPT API] Server listening on port ${this.port}`)
    console.error('[ChatGPT API] OpenAI API compatibility endpoints:')
    console.error('  POST /v1/chat/completions')
    console.error('  GET  /v1/models')
    console.error('  GET  /health')
  }
  async stop() {
    await this.fastify.close()
  }
 }
 export default ChatGPTAPIAdapter
--- a/packages/chatgpt-api-adapter/tsconfig.json
+++ b/packages/chatgpt-api-adapter/tsconfig.json
@ -1,12 +0,0 @@
 {
  "extends": "../../tsconfig.json",
  "compilerOptions": {
    "outDir": "./dist",
    "rootDir": "./src",
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist", "**/*.test.ts"]
 }
--- a/packages/claude-code-bridge/README.md
+++ b/packages/claude-code-bridge/README.md
@ -1,123 +0,0 @@
 # Claude Code Bridge
 Integration layer between Claude Code IDE and LLM Gateway.
 ## Overview
 Provides a high-level API for Claude Code to leverage the LLM Gateway's multi-model orchestration, confidence gating, and fallback capabilities.
 ## Installation
 ```bash
 npm install @llm-gateway/claude-code-bridge
 ```
 ## Usage
 ```typescript
 import { ClaudeCodeBridge } from '@llm-gateway/claude-code-bridge'
 const bridge = new ClaudeCodeBridge({
  gatewayUrl: 'https://llm-gateway.context-x.org',
  agentId: 'claude-code-ide',
  ideVersion: '1.0.0',
  extensionVersion: '1.0.0',
  ollamaUrl: '192.168.178.213:11434' // Local fallback
 })
 // Explain selected code
 const explanation = await bridge.explain(context, selectedCode)
 // Refactor code
 const refactored = await bridge.refactor(context, selectedCode)
 // Generate tests
 const tests = await bridge.test(context, selectedCode)
 // Add documentation
 const docs = await bridge.document(context, selectedCode)
 // Fix errors
 const fix = await bridge.fixError(errorMessage, context)
 // Check health
 const status = await bridge.health()
 ```
 ## Features
 - **Code Explanation**: Analyze and explain code snippets
 - **Refactoring**: Suggest improvements to existing code
 - **Test Generation**: Automatically generate test cases
 - **Documentation**: Create JSDoc/TSDoc comments
 - **Error Fixing**: Debug and fix code errors
 - **Fallback**: Automatic fallback to local Ollama when gateway unavailable
 - **Confidence Tracking**: Monitor model confidence in responses
 - **Token Counting**: Track usage for billing/analytics
 ## Architecture
 The bridge implements the three-layer agent integration stack from ADR-0005:
 1. **Transport Layer**: HTTP/WebSocket communication with gateway
 2. **Adapter Layer**: ClaudeCodeBridge wraps client SDK
 3. **Protocol Layer**: Standardized request/response format
 ## Health Status
 ```typescript
 const health = await bridge.health()
 // {
 //   healthy: true,
 //   gateway: true,
 //   ollama: 'running',
 //   mode: 'gateway'
 // }
 ```
 Modes:
 - `gateway`: Using LLM Gateway (preferred)
 - `fallback`: Using local Ollama (gateway unavailable)
 - `offline`: Both gateway and Ollama offline (error)
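The three modes follow directly from two availability checks. The sketch below shows one way to derive them; `BridgeMode`, `BridgeHealth`, and `resolveHealth` are hypothetical names, and treating `healthy` as "any backend reachable" is an assumption rather than documented bridge behavior.

```typescript
type BridgeMode = 'gateway' | 'fallback' | 'offline'

// Reduced health shape for illustration; the real response also reports Ollama status.
interface BridgeHealth {
  healthy: boolean
  gateway: boolean
  mode: BridgeMode
}

// Prefer the gateway, fall back to Ollama, and report offline when neither responds.
function resolveHealth(gatewayUp: boolean, ollamaUp: boolean): BridgeHealth {
  const mode: BridgeMode = gatewayUp ? 'gateway' : ollamaUp ? 'fallback' : 'offline'
  return { healthy: mode !== 'offline', gateway: gatewayUp, mode }
}
```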
 ## Configuration
 ```typescript
 interface ClaudeCodeBridgeConfig {
  gatewayUrl: string                 // LLM Gateway endpoint
  agentId: string                    // Agent identifier (default: 'claude-code-ide')
  ideVersion: string                 // Claude Code version
  extensionVersion: string           // Bridge extension version
  ollamaUrl?: string                 // Local Ollama URL (optional)
  apiKey?: string                    // Gateway API key (if required)
  requestTimeout?: number            // Request timeout in ms (default: 30000)
 }
 ```
 ## Response Format
 ```typescript
 interface ClaudeCodeResponse {
  text: string           // Generated response
  tokens: {
    input: number        // Input tokens
    output: number       // Output tokens
  }
  model: string          // Model used
  fallback: boolean      // Whether using fallback
  confidence: number     // 0-1 confidence score
 }
 ```
 ## Testing
 ```bash
 npm test
 ```
 Tests cover:
 - Health checks
 - All completion methods (explain, refactor, test, document, fix)
 - Fallback behavior
 - Token limiting
 - Metadata tracking
--- a/packages/claude-code-bridge/package.json
+++ b/packages/claude-code-bridge/package.json
@ -1,31 +0,0 @@
 {
  "name": "@llm-gateway/claude-code-bridge",
  "version": "1.0.0",
  "description": "Claude Code IDE integration with LLM Gateway",
  "type": "module",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",
  "scripts": {
    "build": "tsc",
    "dev": "tsc --watch",
    "test": "vitest"
  },
  "dependencies": {
    "@llm-gateway/client": "workspace:*",
    "@anthropic-sdk/sdk": "^1.0.0"
  },
  "devDependencies": {
    "@types/node": "^20.0.0",
    "typescript": "^5.0.0",
    "vitest": "^1.0.0"
  },
  "keywords": [
    "claude",
    "code",
    "llm",
    "gateway",
    "ide"
  ],
  "license": "MIT",
  "author": "Rene Fichtmueller"
 }
--- a/packages/claude-code-bridge/src/index.test.ts
+++ b/packages/claude-code-bridge/src/index.test.ts
@ -1,120 +0,0 @@
 import { describe, it, expect, beforeEach, vi } from 'vitest'
 import { ClaudeCodeBridge } from './index'
 describe('ClaudeCodeBridge', () => {
  let bridge: ClaudeCodeBridge
  beforeEach(() => {
    bridge = new ClaudeCodeBridge({
      gatewayUrl: 'http://localhost:3000',
      agentId: 'claude-code-test',
      ideVersion: '1.0.0',
      extensionVersion: '1.0.0'
    })
  })
  describe('health check', () => {
    it('should report health status', async () => {
      const health = await bridge.health()
      expect(health).toHaveProperty('healthy')
      expect(health).toHaveProperty('gateway')
      expect(health).toHaveProperty('ollama')
      expect(health).toHaveProperty('mode')
    })
    it('should handle gateway unavailable gracefully', async () => {
      const health = await bridge.health()
      // Should have fallback mode or offline
      expect(health.mode).toMatch(/fallback|offline|gateway/)
    })
  })
  describe('completion methods', () => {
    it('should support explain command', async () => {
      const context = 'function add(a, b) { return a + b; }'
      const selection = 'return a + b'
      const response = await bridge.explain(context, selection)
      expect(response).toHaveProperty('text')
      expect(response).toHaveProperty('tokens')
      expect(response).toHaveProperty('model')
      expect(response).toHaveProperty('fallback')
      expect(response).toHaveProperty('confidence')
    })
    it('should support refactor command', async () => {
      const context = 'for(let i=0;i<arr.length;i++){}'
      const selection = 'for(let i=0;i<arr.length;i++)'
      const response = await bridge.refactor(context, selection)
      expect(response.text).toBeTruthy()
      expect(typeof response.tokens.input).toBe('number')
      expect(typeof response.tokens.output).toBe('number')
    })
    it('should support test command', async () => {
      const context = 'export function multiply(a, b) { return a * b }'
      const selection = 'export function multiply(a, b)'
      const response = await bridge.test(context, selection)
      expect(response.text).toBeTruthy()
      expect(response.model).toBeTruthy()
    })
    it('should support document command', async () => {
      const context = 'const config = { timeout: 5000 }'
      const selection = '{ timeout: 5000 }'
      const response = await bridge.document(context, selection)
      expect(response.text).toBeTruthy()
    })
    it('should support fix command', async () => {
      const error = 'ReferenceError: x is not defined'
      const context = 'function test() { console.log(x); }'
      const response = await bridge.fixError(error, context)
      expect(response.text).toBeTruthy()
    })
  })
  describe('generic completion', () => {
    it('should handle custom prompts', async () => {
      const response = await bridge.completion('custom', 'Write a hello world function')
      expect(response).toHaveProperty('text')
      expect(response).toHaveProperty('tokens')
      expect(response).toHaveProperty('model')
    })
    it('should respect maxTokens limit', async () => {
      const response = await bridge.completion('test', 'Short prompt', 100)
      expect(response.tokens.output).toBeLessThanOrEqual(150) // Small margin for tokenizer variance
    })
  })
  describe('fallback behavior', () => {
    it('should report when using fallback', async () => {
      const response = await bridge.completion('test', 'Test prompt')
      expect(response).toHaveProperty('fallback')
      expect(typeof response.fallback).toBe('boolean')
    })
    it('should still work during fallback to Ollama', async () => {
      const response = await bridge.completion('test', 'Generate a simple greeting')
      expect(response.text).toBeTruthy()
      expect(response.tokens).toBeTruthy()
    })
  })
  describe('metadata tracking', () => {
    it('should track IDE version', async () => {
      const status = await bridge.status()
      expect(status).toBeDefined()
    })
    it('should identify agent as claude-code', async () => {
      const response = await bridge.completion('test', 'Simple test')
      expect(response.model).toBeTruthy()
    })
  })
 })
--- a/packages/claude-code-bridge/src/index.ts
+++ b/packages/claude-code-bridge/src/index.ts
@ -1,108 +0,0 @@
 import { createTIPClient, type TIPClientConfig } from '@llm-gateway/client'
 export interface ClaudeCodeBridgeConfig extends TIPClientConfig {
  agentId: string
  ideVersion: string
  extensionVersion: string
 }
 export interface ClaudeCodeRequest {
  command: string
  context: string
  selection?: string
  temperature?: number
  maxTokens?: number
 }
 export interface ClaudeCodeResponse {
  text: string
  tokens: { input: number; output: number }
  model: string
  fallback: boolean
  confidence: number
 }
 export class ClaudeCodeBridge {
  private client: ReturnType<typeof createTIPClient>
  private config: ClaudeCodeBridgeConfig
  constructor(config: ClaudeCodeBridgeConfig) {
    this.config = {
      agentId: 'claude-code-ide',
      ideVersion: config.ideVersion,
      extensionVersion: config.extensionVersion,
      ...config
    }
    this.client = createTIPClient(this.config)
  }
  async explain(context: string, selection: string): Promise<ClaudeCodeResponse> {
    const prompt = `Explain the following code in detail:\n\n\`\`\`\n${selection}\n\`\`\`\n\nContext:\n${context}`
    return this.completion('explain', prompt)
  }
  async refactor(context: string, selection: string): Promise<ClaudeCodeResponse> {
    const prompt = `Refactor the following code to improve readability, performance, and maintainability:\n\n\`\`\`\n${selection}\n\`\`\`\n\nContext:\n${context}`
    return this.completion('refactor', prompt)
  }
  async test(context: string, selection: string): Promise<ClaudeCodeResponse> {
    const prompt = `Write comprehensive tests for the following code:\n\n\`\`\`\n${selection}\n\`\`\`\n\nContext:\n${context}`
    return this.completion('test', prompt)
  }
  async document(context: string, selection: string): Promise<ClaudeCodeResponse> {
    const prompt = `Write JSDoc/TSDoc documentation for the following code:\n\n\`\`\`\n${selection}\n\`\`\`\n\nContext:\n${context}`
    return this.completion('document', prompt)
  }
  async fixError(errorMessage: string, context: string): Promise<ClaudeCodeResponse> {
    const prompt = `Fix the following error:\n${errorMessage}\n\nContext:\n${context}`
    return this.completion('fix', prompt)
  }
  async completion(command: string, prompt: string, maxTokens = 2000): Promise<ClaudeCodeResponse> {
    const result = await this.client.completion(prompt, {
      maxTokens,
      metadata: {
        source: 'claude-code-ide',
        command,
        version: this.config.ideVersion
      }
    })
    return {
      text: result.text,
      tokens: result.tokens,
      model: result.model,
      fallback: result.fallback,
      confidence: result.confidence ?? 0
    }
  }
  async status() {
    return this.client.getStatus()
  }
  async health() {
    try {
      const status = await this.status()
      return {
        healthy: status.gateway === true || status.ollama !== 'offline',
        gateway: status.gateway,
        ollama: status.ollama,
        mode: status.mode
      }
    } catch (error) {
      return {
        healthy: false,
        gateway: false,
        ollama: 'offline' as const,
        mode: 'offline' as const,
        error: String(error)
      }
    }
  }
 }
 export default ClaudeCodeBridge
--- a/packages/claude-code-bridge/tsconfig.json
+++ b/packages/claude-code-bridge/tsconfig.json
@ -1,12 +0,0 @@
 {
  "extends": "../../tsconfig.json",
  "compilerOptions": {
    "outDir": "./dist",
    "rootDir": "./src",
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist", "**/*.test.ts"]
 }
--- a/packages/codex-lsp-adapter/README.md
+++ b/packages/codex-lsp-adapter/README.md
@ -1,162 +0,0 @@
 # Codex LSP Adapter
 Language Server Protocol adapter for GitHub Copilot/OpenAI Codex integration with LLM Gateway.
 ## Overview
 Implements the Language Server Protocol (LSP) to allow Codex and Copilot plugins to connect to the LLM Gateway. Bridges the gap between LSP's structured protocol and the gateway's completion API.
 ## Installation
 ```bash
 npm install @llm-gateway/codex-lsp-adapter
 ```
 ## Usage
 ### As a Language Server
 ```bash
 # Start the LSP server (listens on stdio)
 npx codex-lsp
 ```
 Or start it from Node.js:
 ```typescript
 import CodexLSPAdapter from '@llm-gateway/codex-lsp-adapter'
 const adapter = new CodexLSPAdapter()
 adapter.start()
 ```
 ### VS Code Extension Configuration
 ```json
 {
  "languageServerHangingPercent": 0,
  "languageServers": {
    "codex": {
      "command": "codex-lsp",
      "args": [],
      "languages": [
        "javascript",
        "typescript",
        "python",
        "go",
        "rust"
      ]
    }
  }
 }
 ```
 ## Features
 ### Implemented
 - **Completions** (`textDocument/completion`): Code completion triggered by `.`, space, or `(`
 - **Hover** (`textDocument/hover`): Hover documentation with code explanation
 - **Text Sync**: Full document synchronization
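 - **Execute Commands**: `codex.explain`, `codex.refactor`, `codex.test`, `codex.fix` (advertised in the server capabilities; `src/index.ts` does not yet register an execute-command handler)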
 ### Architecture
 The adapter translates LSP requests to gateway completions:
 ```
 LSP Client (Copilot/IDE)
    ↓
 CodexLSPAdapter (stdio bridge)
    ↓
 LLM Gateway API
    ↓
 Model Selection (claude, Ollama, external)
 ```
 ## Environment Variables
 ```bash
 GATEWAY_URL=https://llm-gateway.context-x.org  # LLM Gateway endpoint
 OLLAMA_URL=192.168.178.213:11434              # Local Ollama fallback
 AGENT_ID=codex-lsp-server                      # Agent identifier
 LOG_LEVEL=debug                                # Logging level
 ```
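 The variables above could be resolved in one place at startup. The following is a minimal sketch, assuming a hypothetical `loadAdapterConfig` helper that is not part of the adapter. The `OLLAMA_URL` and `AGENT_ID` defaults mirror the values in `src/index.ts`; the `info` default for `LOG_LEVEL` is an assumption.
 ```typescript
 interface AdapterConfig {
  gatewayUrl: string | undefined
  ollamaUrl: string
  agentId: string
  logLevel: string
 }
 // Hypothetical helper: takes the environment as a plain object so it is easy to test
 function loadAdapterConfig(env: Record<string, string | undefined>): AdapterConfig {
  return {
    // Left undefined when unset so the client can apply its own default
    gatewayUrl: env.GATEWAY_URL || undefined,
    // Same fallback the adapter source uses
    ollamaUrl: env.OLLAMA_URL || '192.168.178.213:11434',
    agentId: env.AGENT_ID || 'codex-lsp-server',
    // Assumed default; the adapter source does not define one
    logLevel: env.LOG_LEVEL || 'info'
  }
 }
 ```
 In a Node.js process this would be called as `loadAdapterConfig(process.env)`.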
 ## Protocol Details
 ### Supported Capabilities
 ```typescript
 {
  textDocumentSync: 1,                          // Full document sync
  completionProvider: {
    resolveProvider: true,
    triggerCharacters: ['.', ' ', '(']
  },
  hoverProvider: true,
  definitionProvider: true,
  codeActionProvider: true,
  executeCommandProvider: {
    commands: [
      'codex.explain',
      'codex.refactor',
      'codex.test',
      'codex.fix'
    ]
  }
 }
 ```
 ### Response Format
 Completion items include:
 - **label**: First line of completion
 - **insertText**: Full completion text
 - **documentation**: Model name and confidence
 - **detail**: Source (Gateway vs Ollama fallback)
 - **kind**: CompletionItemKind.Snippet
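 The field mapping above can be sketched as a pure function. This is an illustration only: `GatewayCompletion`, `CompletionItemSketch`, and `toCompletionItem` are hypothetical local names standing in for the client response and the LSP `CompletionItem` type, and the numeric value 15 is `CompletionItemKind.Snippet` in the LSP specification.
 ```typescript
 // Hypothetical stand-in for the gateway client response
 interface GatewayCompletion {
  text: string
  model: string
  fallback: boolean
  confidence?: number
 }
 // Hypothetical stand-in for the subset of CompletionItem fields used here
 interface CompletionItemSketch {
  label: string
  insertText: string
  documentation: string
  detail: string
  kind: number
 }
 const SNIPPET_KIND = 15 // CompletionItemKind.Snippet
 function toCompletionItem(response: GatewayCompletion): CompletionItemSketch {
  // Treat a missing confidence as 0 rather than rendering "NaN%"
  const confidencePct = ((response.confidence ?? 0) * 100).toFixed(1)
  return {
    label: response.text.split('\n')[0] ?? '',
    insertText: response.text,
    documentation: `**Model**: ${response.model}\n**Confidence**: ${confidencePct}%`,
    detail: response.fallback ? '(Ollama fallback)' : '(Gateway)',
    kind: SNIPPET_KIND
  }
 }
 ```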
 ## Testing
 ```bash
 npm test
 ```
 Tests cover:
 - LSP initialization and shutdown
 - Completion requests with various triggers
 - Hover information extraction
 - Error handling and fallback behavior
 - Confidence score reporting
 ## Troubleshooting
 ### Server not connecting
 1. Check that the LSP server process is running: `ps aux | grep codex-lsp` (the server communicates over stdio, so there is no port to inspect)
 2. Verify gateway is accessible: `curl https://llm-gateway.context-x.org/health`
 3. Check logs: `LOG_LEVEL=debug codex-lsp`
 ### Slow completions
 1. Reduce `maxTokens` in completion requests
 2. Check gateway latency: `curl -o /dev/null -s -w "%{time_total}s\n" https://llm-gateway.context-x.org/health`
 3. Verify Ollama is running if using fallback
 ### Poor suggestion quality
 1. Adjust temperature/top_p in gateway requests
 2. Check model selection (may be using fallback)
 3. Provide more context in completion requests
 ## Performance
 Typical latencies:
 - **Gateway mode**: 100-500ms (depends on model)
 - **Ollama fallback**: 200-2000ms (depends on hardware)
 - **Timeout**: 30s (configurable)
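 One way to enforce a request timeout like the 30s limit above is to race the request against a timer. This is a minimal sketch; `withTimeout` is a hypothetical helper, and how the adapter or client SDK actually applies its timeout is not shown in this README.
 ```typescript
 // Hypothetical helper: rejects if `work` does not settle within `timeoutMs`
 function withTimeout<T>(work: Promise<T>, timeoutMs = 30_000): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${timeoutMs}ms`)), timeoutMs)
  })
  // Whichever settles first wins; always clear the timer so the process can exit
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer)
  })
 }
 ```
 Note that the underlying request keeps running after the timeout fires; cancelling it would need separate support such as an abort signal.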
 ## Security
 - LSP communicates over stdio (no network exposure)
 - Gateway API calls use configured authentication
 - Ollama fallback is local-only by default
 - No credentials stored in LSP adapter
--- a/packages/codex-lsp-adapter/package.json
+++ b/packages/codex-lsp-adapter/package.json
@ -1,37 +0,0 @@
 {
  "name": "@llm-gateway/codex-lsp-adapter",
  "version": "1.0.0",
  "description": "Language Server Protocol adapter for Codex/Copilot integration with LLM Gateway",
  "type": "module",
  "main": "dist/index.js",
  "bin": {
    "codex-lsp": "dist/cli.js"
  },
  "scripts": {
    "build": "tsc",
    "dev": "tsc --watch",
    "start": "node dist/cli.js",
    "test": "vitest"
  },
  "dependencies": {
    "@llm-gateway/client": "workspace:*",
    "vscode-jsonrpc": "^8.0.0",
    "vscode-languageserver": "^9.0.0",
    "vscode-languageserver-protocol": "^3.17.0",
    "vscode-languageserver-textdocument": "^1.0.0"
  },
  "devDependencies": {
    "@types/node": "^20.0.0",
    "typescript": "^5.0.0",
    "vitest": "^1.0.0"
  },
  "keywords": [
    "lsp",
    "language-server",
    "copilot",
    "codex",
    "llm",
    "gateway"
  ],
  "license": "MIT",
  "author": "Rene Fichtmueller"
 }
--- a/packages/codex-lsp-adapter/src/cli.ts
+++ b/packages/codex-lsp-adapter/src/cli.ts
@ -1,13 +0,0 @@
 #!/usr/bin/env node
 import CodexLSPAdapter from './index'
 const adapter = new CodexLSPAdapter()
 // Start the LSP server
 adapter.start()
 // Log startup
 console.error('[Codex LSP] Server started on stdio')
 console.error(`[Codex LSP] Gateway URL: ${process.env.GATEWAY_URL || 'default'}`)
 console.error(`[Codex LSP] Ollama URL: ${process.env.OLLAMA_URL || '192.168.178.213:11434'}`)
--- a/packages/codex-lsp-adapter/src/index.ts
+++ b/packages/codex-lsp-adapter/src/index.ts
@ -1,130 +0,0 @@
 import { createTIPClient } from '@llm-gateway/client'
 import {
  createConnection,
  TextDocuments,
  InitializeResult,
  ServerCapabilities,
  CompletionItem,
  CompletionItemKind,
  MarkupKind
 } from 'vscode-languageserver/node'
 import { TextDocument } from 'vscode-languageserver-textdocument'
 export class CodexLSPAdapter {
  // Bind the connection to stdio explicitly; a bare createConnection() expects
  // a transport flag such as --stdio on the command line
  private connection = createConnection(process.stdin, process.stdout)
  private documents = new TextDocuments(TextDocument)
  private client = createTIPClient({
    agentId: 'codex-lsp-server',
    ollamaUrl: process.env.OLLAMA_URL || '192.168.178.213:11434'
  })
  constructor() {
    this.setupHandlers()
  }
  private setupHandlers() {
    this.connection.onInitialize(this.handleInitialize.bind(this))
    this.connection.onCompletion(this.handleCompletion.bind(this))
    this.connection.onHover(this.handleHover.bind(this))
    this.connection.onDefinition(this.handleDefinition.bind(this))
    this.documents.onDidChangeContent(this.handleDocumentChange.bind(this))
    this.documents.listen(this.connection)
  }
  private handleInitialize() {
    const capabilities: ServerCapabilities = {
      textDocumentSync: 1,
      completionProvider: {
        resolveProvider: true,
        triggerCharacters: ['.', ' ', '(']
      },
      hoverProvider: true,
      definitionProvider: true,
      codeActionProvider: true,
      executeCommandProvider: {
        commands: ['codex.explain', 'codex.refactor', 'codex.test', 'codex.fix']
      }
    }
    const result: InitializeResult = { capabilities }
    return result
  }
  private async handleCompletion(params: any) {
    const doc = this.documents.get(params.textDocument.uri)
    if (!doc) return []
    const text = doc.getText()
    const offset = doc.offsetAt(params.position)
    // Place the marker at the actual cursor offset instead of appending it to the end
    const prompt = `Complete the following code:\n\n${text.slice(0, offset)}[cursor here]${text.slice(offset)}`
    try {
      const response = await this.client.completion(prompt, { maxTokens: 500 })
      return [
        {
          label: response.text.split('\n')[0],
          kind: CompletionItemKind.Snippet,
          documentation: {
            kind: MarkupKind.Markdown,
            value: `**Model**: ${response.model}\n**Confidence**: ${((response.confidence ?? 0) * 100).toFixed(1)}%`
          },
          insertText: response.text,
          detail: response.fallback ? '(Ollama fallback)' : '(Gateway)'
        } as CompletionItem
      ]
    } catch (error) {
      return []
    }
  }
  private async handleHover(params: any) {
    const doc = this.documents.get(params.textDocument.uri)
    if (!doc) return null
    const selectedText = doc.getText({
      start: { line: params.position.line, character: 0 },
      end: { line: params.position.line + 1, character: 0 }
    })
    try {
      const response = await this.client.completion(
        `Briefly explain this code:\n${selectedText}`,
        { maxTokens: 200 }
      )
      return {
        contents: {
          kind: MarkupKind.Markdown,
          value: `${response.text}\n\n*${response.model} (${((response.confidence ?? 0) * 100).toFixed(0)}%)*`
        }
      }
    } catch (error) {
      return null
    }
  }
  private async handleDefinition(params: any) {
    // Definition lookup would be more complex in real implementation
    // For now, return null - could integrate with symbol indexing
    return null
  }
  private async handleDocumentChange(change: any) {
    const doc = change.document
    // Could perform diagnostics here on significant changes
  }
  start() {
    this.connection.listen()
  }
 }
 export default CodexLSPAdapter
--- a/packages/codex-lsp-adapter/tsconfig.json
+++ b/packages/codex-lsp-adapter/tsconfig.json
@ -1,12 +0,0 @@
 {
  "extends": "../../tsconfig.json",
  "compilerOptions": {
    "outDir": "./dist",
    "rootDir": "./src",
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist", "**/*.test.ts"]
 }
--- a/packages/learning-integration/README.md
+++ b/packages/learning-integration/README.md
@ -1,358 +0,0 @@
 # Learning System Integration
 Per-agent metrics collection, feedback processing, and learning system integration for LLM Gateway.
 ## Overview
 Extends the global learning system (Phase 2D) with per-agent signal isolation. Tracks metrics separately for each agent (Claude Code, Codex, ChatGPT, etc.) to enable agent-specific optimization and cost attribution.
 ## Installation
 ```bash
 npm install @llm-gateway/learning-integration
 ```
 ## Core Concepts
 ### Per-Agent Metrics
 Each agent maintains its own metric set tracking success, latency, cost, and confidence:
 - **Success Rate**: % of requests that succeeded without fallback
 - **Latency**: P50, P95, P99 response time (ms)
 - **Cost**: Token consumption × model cost
 - **Confidence**: Learned score 0-1 indicating model suitability for agent
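 As an illustration of the latency metric, percentiles can be computed from raw samples with the nearest-rank method. This is a sketch under assumptions: `percentile` is a hypothetical helper, and the package may instead compute percentiles in SQL.
 ```typescript
 // Hypothetical helper: nearest-rank percentile over latency samples in ms
 function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('percentile requires at least one sample')
  const sorted = [...samples].sort((a, b) => a - b)
  // Nearest rank: the smallest value with at least p% of samples at or below it
  const rank = Math.ceil((p * sorted.length) / 100)
  return sorted[Math.max(0, rank - 1)] as number
 }
 const latencies = [120, 80, 250, 95, 310, 60, 140, 175, 90, 205]
 const summary = {
  p50: percentile(latencies, 50),
  p95: percentile(latencies, 95),
  p99: percentile(latencies, 99)
 }
 ```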
 ### Feedback Loop
 Agents report outcomes (success, fallback, error, timeout) enabling closed-loop learning:
 - Adapter automatically tracks success/fallback
 - Client can provide explicit feedback (quality, satisfaction)
 - Learning engine uses feedback to update per-agent confidence scores
 ### Confidence Scoring
 Per-agent confidence (independent of global score):
 - Initialized from global baseline
 - Updated hourly based on feedback
 - Influences routing decisions (per-agent gate overrides global gate)
 - Decays 10% per day if inactive
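 The inactivity decay can be illustrated as follows. This sketch assumes the 10% decay compounds daily (each inactive day multiplies the score by 0.9); the README does not say whether the decay is compounding or linear, and `decayConfidence` is a hypothetical name.
 ```typescript
 // Hypothetical helper: assumes compounding decay of 10% per inactive day
 function decayConfidence(score: number, inactiveDays: number, dailyDecay = 0.1): number {
  if (inactiveDays <= 0) return score
  const decayed = score * Math.pow(1 - dailyDecay, inactiveDays)
  // Keep the result inside the documented 0-1 range
  return Math.min(1, Math.max(0, decayed))
 }
 ```
 Under this assumption, a score of 0.8 drops to about 0.72 after one inactive day and to about 0.65 after two.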
 ## Usage
 ### Basic Setup
 ```typescript
 import { LearningIntegration } from '@llm-gateway/learning-integration'
 import postgres from 'postgres'
 const db = postgres({
  host: 'localhost',
  port: 5432,
  database: 'llm_gateway'
 })
 const learning = new LearningIntegration(db)
 // Initialize tables on startup
 await learning.initializeTables()
 ```
 ### Logging Requests
 ```typescript
 import { randomUUID } from 'crypto'
 const requestId = randomUUID()
 // After completion, log the request
 await learning.logRequest({
  requestId,
  agentId: 'claude-code',
  model: 'qwen2.5:14b',
  latencyMs: 250,
  tokensIn: 150,
  tokensOut: 450,
  confidence: 0.85,
  fallbackUsed: false,
  success: true
 })
 ```
 ### Recording Feedback
 ```typescript
 // Automatic (adapter tracks outcome)
 await learning.recordFeedback({
  requestId,
  agentId: 'claude-code',
  outcome: 'success',
  completionQuality: 8, // 0-10
  latencyMs: 250
 })
 // Explicit (from client UI)
 await learning.recordFeedback({
  requestId,
  agentId: 'chatgpt',
  outcome: 'success',
  metadata: {
    userSatisfaction: 9 // 0-10 from thumbs up/down
  }
 })
 ```
 ### Computing Metrics
 ```typescript
 // Per-agent metrics (last 24h)
 const metrics = await learning.getAgentMetrics('claude-code')
 console.log(metrics)
 // [{
 //   agentId: 'claude-code',
 //   model: 'qwen2.5:14b',
 //   requestCount: 1523,
 //   successRate: 0.98,
 //   avgLatencyMs: 245,
 //   totalTokens: 850000,
 //   costUsd: 85.00,
 //   confidence: 0.87,
 //   updatedAt: 2026-04-19T22:00:00Z
 // }]
 // Per-agent cost tracking
 const costs = await learning.getAgentCosts(30) // 30 days
 costs.forEach((cost, agentId) => {
  console.log(`${agentId}: $${cost.toFixed(2)}`)
 })
 // claude-code: $892.50
 // chatgpt: $1234.75
 // codex: $345.20
 // Anomaly detection
 const anomalies = await learning.detectAnomalies('claude-code')
 anomalies.forEach(a => {
  console.log(`${a.model}: ${a.issue}`)
 })
 ```
 ### SLO Monitoring
 ```typescript
 import { PerAgentMetrics } from '@llm-gateway/learning-integration/metrics'
 const metrics = new PerAgentMetrics(db)
 // Check latency SLO
 const slo = await metrics.checkLatencySLO('claude-code', 100) // Target: 100ms
 console.log(slo)
 // {
 //   agentId: 'claude-code',
 //   targetMs: 100,
 //   p50: 45,
 //   p95: 89,
 //   p99: 98,
 //   breached: false
 // }
 // Daily cost report
 const costs = await metrics.generateDailyCostReport('2026-04-19')
 console.log(costs)
 // [{
 //   date: '2026-04-19',
 //   agentId: 'claude-code',
 //   tokensIn: 50000,
 //   tokensOut: 150000,
 //   costUsd: 20.00
 // }]
 ```
 ### Feedback Processing
 ```typescript
 import { FeedbackProcessor } from '@llm-gateway/learning-integration/feedback'
 const feedback = new FeedbackProcessor(db)
 // Process feedback from any source
 await feedback.processFeedback({
  requestId,
  agentId: 'chatgpt',
  outcome: 'success',
  completionQuality: 9,
  userSatisfaction: 10
 })
 // Get feedback stats
 const stats = await feedback.getFeedbackStats('chatgpt')
 console.log(stats)
 // {
 //   agentId: 'chatgpt',
 //   totalFeedback: 2450,
 //   outcomeBreakdown: {
 //     success: 2350,
 //     fallback: 50,
 //     timeout: 25,
 //     error: 20,
 //     user_rejected: 5
 //   },
 //   avgQuality: 8.2,
 //   avgSatisfaction: 8.7
 // }
 // Compute confidence score from feedback
 const score = await feedback.computeConfidenceScore('chatgpt', 'gpt-4')
 console.log(`Confidence: ${score.toFixed(2)}`) // 0.87
 ```
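 In `src/feedback.ts`, `computeConfidenceScore` blends three signals: 50% success rate, 25% average quality, and 25% average satisfaction, with quality and satisfaction normalized from 0-10 to 0-1 and defaulting to 5 when no ratings exist. The sketch below reproduces that arithmetic as a pure function; `FeedbackAggregate` and `confidenceFromFeedback` are hypothetical names used only for illustration.
 ```typescript
 // Hypothetical input shape: aggregates already computed from agent_feedback
 interface FeedbackAggregate {
  successRate: number // 0-1
  avgQuality: number | null // 0-10, null when nothing was rated
  avgSatisfaction: number | null // 0-10, null when nothing was rated
 }
 function confidenceFromFeedback(agg: FeedbackAggregate): number {
  // Missing ratings fall back to the neutral midpoint of 5
  const quality = (agg.avgQuality ?? 5) / 10
  const satisfaction = (agg.avgSatisfaction ?? 5) / 10
  const score = agg.successRate * 0.5 + quality * 0.25 + satisfaction * 0.25
  // Clamp to the 0-1 range
  return Math.min(1, Math.max(0, score))
 }
 ```
 With no ratings at all, the score is `successRate * 0.5 + 0.25`, so an agent with a perfect success rate but no quality feedback tops out at 0.75.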
 ## Database Schema
 ### agent_request_log
 ```sql
 CREATE TABLE agent_request_log (
  request_id UUID PRIMARY KEY,
  agent_id VARCHAR(64) NOT NULL,
  model VARCHAR(128) NOT NULL,
  latency_ms INTEGER NOT NULL,
  tokens_in INTEGER NOT NULL,
  tokens_out INTEGER NOT NULL,
  confidence DECIMAL(3, 2) NOT NULL,
  fallback_used BOOLEAN NOT NULL DEFAULT FALSE,
  success BOOLEAN NOT NULL DEFAULT TRUE,
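  created_at TIMESTAMP NOT NULL DEFAULT NOW()
 );
 -- PostgreSQL does not accept inline INDEX clauses; create indexes separately
 CREATE INDEX idx_agent_model ON agent_request_log (agent_id, model);
 CREATE INDEX idx_request_log_created ON agent_request_log (created_at);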
 ```
 ### agent_feedback
 ```sql
 CREATE TABLE agent_feedback (
  id SERIAL PRIMARY KEY,
  request_id UUID NOT NULL,
  agent_id VARCHAR(64) NOT NULL,
  outcome VARCHAR(32) NOT NULL,
  completion_quality SMALLINT,
  latency_ms INTEGER,
  token_count INTEGER,
  metadata JSONB,
  created_at TIMESTAMP NOT NULL DEFAULT NOW(),
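  FOREIGN KEY (request_id) REFERENCES agent_request_log (request_id)
 );
 -- Index names are unique per schema in PostgreSQL, so each table needs its own
 CREATE INDEX idx_agent_outcome ON agent_feedback (agent_id, outcome);
 CREATE INDEX idx_feedback_created ON agent_feedback (created_at);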
 ```
 ### agent_confidence_scores
 ```sql
 CREATE TABLE agent_confidence_scores (
  id SERIAL PRIMARY KEY,
  agent_id VARCHAR(64) NOT NULL,
  model VARCHAR(128) NOT NULL,
  score DECIMAL(3, 2) NOT NULL,
  sample_size INTEGER NOT NULL DEFAULT 0,
  trend VARCHAR(16) NOT NULL DEFAULT 'stable',
  updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
  UNIQUE (agent_id, model)
 );
 CREATE INDEX idx_agent ON agent_confidence_scores (agent_id);
 ```
 ## Integration with Learning Engine
 ### Learning Cycle (ADR-0003)
 Per-agent metrics computed during learning cycles:
 **Phase 1**: Aggregate global metrics (existing)
 **Phase 2**: Compute per-agent slices (new)
 ```typescript
 for (const agentId of knownAgents) {
  const metrics = await learning.getAgentMetrics(agentId)
  for (const metric of metrics) {
    // Update per-agent confidence
    const newScore = await feedback.computeConfidenceScore(agentId, metric.model)
    await learning.updateAgentConfidence(agentId, metric.model, newScore)
  }
 }
 ```
 **Phase 3**: Update per-agent confidence scores (new)
 ```typescript
 for (const [agentId, model] of agentModelPairs) {
  const score = await feedback.computeConfidenceScore(agentId, model)
  const shouldUpdate = await feedback.shouldUpdateConfidence(agentId, model, score)
  if (shouldUpdate) {
    await learning.updateAgentConfidence(agentId, model, score)
  }
 }
 ```
 **Phase 5**: A/B test per-agent routing (new)
 ```typescript
 // 10% of traffic uses per-agent routing
 if (Math.random() < 0.1) {
  const agentConfidence = await learning.getAgentConfidence(agentId, model)
  if (agentConfidence && agentConfidence.score > 0.65) {
    // Use per-agent routing decision
  }
 }
 ```
 ## Feedback Outcomes
 | Outcome | Meaning | Auto | Manual |
 |---------|---------|------|--------|
 | `success` | Request succeeded, no fallback | Yes | Yes |
 | `fallback` | Gateway unavailable, used Ollama | Yes | - |
 | `timeout` | Request exceeded timeout | Yes | - |
 | `error` | Request failed with error | Yes | Yes |
 | `user_rejected` | Client explicitly rejected response | - | Yes |
 ## Cost Attribution
 Monthly cost per agent (token-based):
 ```
 Cost = (tokens_in + tokens_out) × model_rate × 0.0001
 ```
 Default rates:
 - qwen2.5:3b = $0.0001 per 1K tokens
 - qwen2.5:14b = $0.0001 per 1K tokens
 - qwen2.5:32b = $0.0001 per 1K tokens
 Configurable via learning engine cost config.
 ## Testing
 ```bash
 npm test
 ```
 Tests cover:
 - Per-agent metric computation
 - Feedback ingestion and processing
 - Confidence score calculation
 - Anomaly detection
 - Cost attribution
 - SLO monitoring
 - Trending analysis
 ## Performance
 - Request logging: <1ms per insertion
 - Feedback processing: <1ms per insertion
 - Metric computation (24h): 100-500ms per agent
 - Cost report generation: 500ms-1s for all agents
 - Anomaly detection: 1-2s per agent
 ## Related ADRs
 - [ADR-0002](../adr/0002-tier-assignment-strategy.md) — Tier assignment (per-agent override)
 - [ADR-0003](../adr/0003-confidence-gate-thresholds.md) — Confidence gate (per-agent gate)
 - [ADR-0006](../adr/0006-learning-system-integration.md) — Learning system specification
 ## Security Notes
 - Agent IDs are stored plaintext (consider hashing for privacy-sensitive deployments)
 - User satisfaction scores in metadata (consider encryption at rest)
 - Cost reports are per-agent (may expose usage patterns)
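 If agent IDs need to be pseudonymized before storage, one option is a keyed hash. This is a minimal sketch using Node's built-in `crypto` module; `pseudonymizeAgentId` and the way the secret is supplied are assumptions, not part of this package.
 ```typescript
 import { createHmac } from 'node:crypto'
 // Hypothetical helper: derives a stable pseudonym from an agent ID and a secret
 function pseudonymizeAgentId(agentId: string, secret: string): string {
  // HMAC-SHA256 keeps the mapping stable while hiding the raw identifier
  return createHmac('sha256', secret).update(agentId).digest('hex')
 }
 ```
 The hex digest is 64 characters long, which fits the `VARCHAR(64)` `agent_id` columns in the schema above.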
--- a/packages/learning-integration/package.json
+++ b/packages/learning-integration/package.json
@ -1,37 +0,0 @@
 {
  "name": "@llm-gateway/learning-integration",
  "version": "1.0.0",
  "description": "Per-agent learning metrics and feedback integration for LLM Gateway",
  "type": "module",
  "main": "dist/index.js",
  "exports": {
    ".": "./dist/index.js",
    "./metrics": "./dist/metrics.js",
    "./feedback": "./dist/feedback.js"
  },
  "scripts": {
    "build": "tsc",
    "dev": "tsc --watch",
    "test": "vitest"
  },
  "dependencies": {
    "@llm-gateway/client": "workspace:*",
    "@llm-gateway/learning": "workspace:*",
    "postgres": "^3.0.0"
  },
  "devDependencies": {
    "@types/node": "^20.0.0",
    "typescript": "^5.0.0",
    "vitest": "^1.0.0"
  },
  "keywords": [
    "learning",
    "metrics",
    "feedback",
    "per-agent",
    "llm",
    "gateway"
  ],
  "license": "MIT",
  "author": "Rene Fichtmueller"
 }
--- a/packages/learning-integration/src/feedback.ts
+++ b/packages/learning-integration/src/feedback.ts
@ -1,215 +0,0 @@
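 import type postgres from 'postgres'
 // The `postgres` package exposes its tagged-template client type as `postgres.Sql`
 type Client = postgres.Sql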
 export type FeedbackOutcome = 'success' | 'fallback' | 'timeout' | 'error' | 'user_rejected'
 export interface FeedbackRequest {
  requestId: string
  agentId: string
  outcome: FeedbackOutcome
  completionQuality?: number // 0-10
  latencyMs?: number
  tokenCount?: number
  userSatisfaction?: number // 0-10 from UI
  metadata?: Record<string, unknown>
 }
 export interface FeedbackStats {
  agentId: string
  totalFeedback: number
  outcomeBreakdown: Record<FeedbackOutcome, number>
  avgQuality: number
  avgSatisfaction: number
 }
 export class FeedbackProcessor {
  constructor(private db: Client) {}
  async processFeedback(feedback: FeedbackRequest): Promise<void> {
    const timestamp = new Date()
    await this.db`
      INSERT INTO agent_feedback (
        request_id, agent_id, outcome, completion_quality, latency_ms,
        token_count, metadata, created_at
      ) VALUES (
        ${feedback.requestId},
        ${feedback.agentId},
        ${feedback.outcome},
        ${feedback.completionQuality ?? null},
        ${feedback.latencyMs ?? null},
        ${feedback.tokenCount ?? null},
        ${JSON.stringify({
          userSatisfaction: feedback.userSatisfaction,
          ...feedback.metadata
        })},
        ${timestamp}
      )
    `
  }
  async getFeedbackStats(
    agentId: string,
    hours: number = 24
  ): Promise<FeedbackStats> {
    const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000)
    const results = await this.db`
      SELECT
        outcome,
        COUNT(*) as count,
        AVG(completion_quality) as avg_quality,
        COUNT(completion_quality) as quality_count,
        AVG((metadata->>'userSatisfaction')::int) as avg_satisfaction,
        COUNT(metadata->>'userSatisfaction') as satisfaction_count
      FROM agent_feedback
      WHERE agent_id = ${agentId} AND created_at > ${cutoff}
      GROUP BY outcome
    `
    const outcomeBreakdown: Record<FeedbackOutcome, number> = {
      success: 0,
      fallback: 0,
      timeout: 0,
      error: 0,
      user_rejected: 0
    }
    let totalFeedback = 0
    let totalQuality = 0
    let qualityCount = 0
    let totalSatisfaction = 0
    let satisfactionCount = 0
    for (const row of results as any[]) {
      const outcome = row.outcome as FeedbackOutcome
      const count = Number(row.count)
      outcomeBreakdown[outcome] = count
      totalFeedback += count
      // Weight each per-outcome average by its number of rated rows so the
      // overall mean is not skewed toward rare outcomes
      const qualityN = Number(row.quality_count)
      if (qualityN > 0) {
        totalQuality += Number(row.avg_quality) * qualityN
        qualityCount += qualityN
      }
      const satisfactionN = Number(row.satisfaction_count)
      if (satisfactionN > 0) {
        totalSatisfaction += Number(row.avg_satisfaction) * satisfactionN
        satisfactionCount += satisfactionN
      }
    }
    return {
      agentId,
      totalFeedback,
      outcomeBreakdown,
      avgQuality: qualityCount > 0 ? totalQuality / qualityCount : 0,
      avgSatisfaction: satisfactionCount > 0 ? totalSatisfaction / satisfactionCount : 0
    }
  }
  async getOutcomeDistribution(
    agentId: string,
    hours: number = 24
  ): Promise<{ outcome: FeedbackOutcome; percentage: number }[]> {
    const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000)
    const results = await this.db`
      SELECT outcome, COUNT(*) as count
      FROM agent_feedback
      WHERE agent_id = ${agentId} AND created_at > ${cutoff}
      GROUP BY outcome
    `
    const total = results.reduce((sum, row: any) => sum + Number(row.count), 0)
    if (total === 0) return []
    return results.map((row: any) => ({
      outcome: row.outcome as FeedbackOutcome,
      percentage: (Number(row.count) / total) * 100
    }))
  }
  async identifySuccessfulModels(agentId: string): Promise<string[]> {
    const cutoff = new Date(Date.now() - 24 * 60 * 60 * 1000)
    const results = await this.db`
      SELECT DISTINCT log.model
      FROM agent_request_log log
      INNER JOIN agent_feedback fb ON log.request_id = fb.request_id
      WHERE log.agent_id = ${agentId}
        AND log.created_at > ${cutoff}
        AND fb.outcome = 'success'
      ORDER BY log.model
    `
    return results.map((row: any) => row.model)
  }
  async computeConfidenceScore(
    agentId: string,
    model: string
  ): Promise<number> {
    const day = new Date()
    day.setDate(day.getDate() - 1)
    const feedback = await this.db`
      SELECT
        COUNT(*) as sample_count,
        COUNT(CASE WHEN fb.outcome = 'success' THEN 1 END)::float / NULLIF(COUNT(*), 0) as success_rate,
        AVG(fb.completion_quality) as avg_quality,
        AVG((fb.metadata->>'userSatisfaction')::int) as avg_satisfaction
      FROM agent_feedback fb
      INNER JOIN agent_request_log log ON fb.request_id = log.request_id
      WHERE fb.agent_id = ${agentId}
        AND log.model = ${model}
        AND fb.created_at > ${day}
    `
    // An aggregate without GROUP BY always returns one row, so check the sample count
    const row = feedback[0] as any
    if (!row || Number(row.sample_count) === 0) return 0.5 // Default neutral confidence
    const successRate = Number(row.success_rate ?? 0)
    const avgQuality = Number(row.avg_quality ?? 5) / 10 // Normalize to 0-1
    const avgSatisfaction = Number(row.avg_satisfaction ?? 5) / 10 // Normalize to 0-1
    // Weighted average: 50% success, 25% quality, 25% satisfaction
    const score = successRate * 0.5 + avgQuality * 0.25 + avgSatisfaction * 0.25
    return Math.min(1, Math.max(0, score))
  }
  async shouldUpdateConfidence(
    agentId: string,
    model: string,
    newScore: number,
    threshold: number = 0.1
  ): Promise<boolean> {
    // Get current confidence
    const current = await this.db`
      SELECT score FROM agent_confidence_scores
      WHERE agent_id = ${agentId} AND model = ${model}
    `
    if (current.length === 0) return true // Always update if no history
    const currentScore = Number(current[0].score)
    return Math.abs(newScore - currentScore) > threshold
  }
  async processUserFeedback(
    requestId: string,
    agentId: string,
    satisfaction: number // 0-10
  ): Promise<void> {
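    const metadata = { userSatisfaction: satisfaction }
    // agent_feedback has no unique constraint on request_id, so ON CONFLICT
    // (request_id) would be rejected; update an existing row, else insert
    const updated = await this.db`
      UPDATE agent_feedback
      SET metadata = ${JSON.stringify(metadata)}
      WHERE request_id = ${requestId} AND agent_id = ${agentId}
      RETURNING id
    `
    if (updated.length > 0) return
    await this.db`
      INSERT INTO agent_feedback (
        request_id, agent_id, outcome, metadata
      ) VALUES (
        ${requestId},
        ${agentId},
        'success',
        ${JSON.stringify(metadata)}
      )
    `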
  }
 }
 export default FeedbackProcessor
--- a/packages/learning-integration/src/index.ts
+++ b/packages/learning-integration/src/index.ts
@ -1,267 +0,0 @@
 import type postgres from 'postgres'
 // The `postgres` package exposes its tagged-template client type as `postgres.Sql`
 type Client = postgres.Sql
 export interface AgentMetrics {
  agentId: string
  model: string
  requestCount: number
  successRate: number
  avgLatencyMs: number
  totalTokens: number
  costUsd: number
  confidence: number
  updatedAt: Date
 }
 export interface RequestLog {
  requestId: string
  agentId: string
  model: string
  latencyMs: number
  tokensIn: number
  tokensOut: number
  confidence: number
  fallbackUsed: boolean
  success: boolean
  timestamp: Date
 }
 export interface AgentFeedback {
  requestId: string
  agentId: string
  outcome: 'success' | 'fallback' | 'timeout' | 'error' | 'user_rejected'
  completionQuality?: number
  latencyMs?: number
  tokenCount?: number
  metadata?: Record<string, unknown>
  timestamp: Date
 }
 export interface PerAgentConfidence {
  agentId: string
  model: string
  score: number
  sampleSize: number
  lastUpdated: Date
  trend: 'improving' | 'stable' | 'degrading'
 }
 export class LearningIntegration {
  private db: Client
  constructor(dbConnection: Client) {
    this.db = dbConnection
  }
  async initializeTables(): Promise<void> {
    await this.db`
      CREATE TABLE IF NOT EXISTS agent_request_log (
        request_id UUID PRIMARY KEY,
        agent_id VARCHAR(64) NOT NULL,
        model VARCHAR(128) NOT NULL,
        latency_ms INTEGER NOT NULL,
        tokens_in INTEGER NOT NULL,
        tokens_out INTEGER NOT NULL,
        confidence DECIMAL(3, 2) NOT NULL,
        fallback_used BOOLEAN NOT NULL DEFAULT FALSE,
        success BOOLEAN NOT NULL DEFAULT TRUE,
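        created_at TIMESTAMP NOT NULL DEFAULT NOW()
      )
    `
    // PostgreSQL has no inline INDEX clause in CREATE TABLE; create the indexes separately
    await this.db`
      CREATE INDEX IF NOT EXISTS idx_request_log_agent_model ON agent_request_log (agent_id, model)
    `
    await this.db`
      CREATE INDEX IF NOT EXISTS idx_request_log_created ON agent_request_log (created_at)
    `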
    await this.db`
      CREATE TABLE IF NOT EXISTS agent_feedback (
        id SERIAL PRIMARY KEY,
        -- UNIQUE is required by the ON CONFLICT (request_id) upsert in processUserFeedback
        request_id UUID NOT NULL UNIQUE,
        agent_id VARCHAR(64) NOT NULL,
        outcome VARCHAR(32) NOT NULL,
        completion_quality SMALLINT,
        latency_ms INTEGER,
        token_count INTEGER,
        metadata JSONB,
        created_at TIMESTAMP NOT NULL DEFAULT NOW(),
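        FOREIGN KEY (request_id) REFERENCES agent_request_log (request_id)
      )
    `
    // PostgreSQL rejects inline INDEX clauses, and index names must be unique per schema
    await this.db`
      CREATE INDEX IF NOT EXISTS idx_feedback_agent_outcome ON agent_feedback (agent_id, outcome)
    `
    await this.db`
      CREATE INDEX IF NOT EXISTS idx_feedback_created ON agent_feedback (created_at)
    `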
    await this.db`
      CREATE TABLE IF NOT EXISTS agent_confidence_scores (
        id SERIAL PRIMARY KEY,
        agent_id VARCHAR(64) NOT NULL,
        model VARCHAR(128) NOT NULL,
        score DECIMAL(3, 2) NOT NULL,
        sample_size INTEGER NOT NULL DEFAULT 0,
        trend VARCHAR(16) NOT NULL DEFAULT 'stable',
        updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
        -- The unique index on (agent_id, model) also serves lookups by agent_id alone
        UNIQUE (agent_id, model)
      )
    `
  }
  async logRequest(log: Omit<RequestLog, 'timestamp'>): Promise<void> {
    await this.db`
      INSERT INTO agent_request_log (
        request_id, agent_id, model, latency_ms, tokens_in, tokens_out,
        confidence, fallback_used, success
      ) VALUES (
        ${log.requestId}, ${log.agentId}, ${log.model}, ${log.latencyMs},
        ${log.tokensIn}, ${log.tokensOut}, ${log.confidence},
        ${log.fallbackUsed}, ${log.success}
      )
    `
  }
  async recordFeedback(feedback: Omit<AgentFeedback, 'timestamp'>): Promise<void> {
    await this.db`
      INSERT INTO agent_feedback (
        request_id, agent_id, outcome, completion_quality, latency_ms,
        token_count, metadata
      ) VALUES (
        ${feedback.requestId}, ${feedback.agentId}, ${feedback.outcome},
        ${feedback.completionQuality ?? null}, ${feedback.latencyMs ?? null},
        ${feedback.tokenCount ?? null}, ${JSON.stringify(feedback.metadata ?? {})}
      )
    `
  }
  async getAgentMetrics(agentId: string, hours: number = 24): Promise<AgentMetrics[]> {
    const cutoffTime = new Date(Date.now() - hours * 60 * 60 * 1000)
    const results = await this.db`
      SELECT
        agent_id,
        model,
        COUNT(*) as request_count,
        COUNT(CASE WHEN success = true THEN 1 END)::float / COUNT(*) as success_rate,
        AVG(latency_ms)::float as avg_latency_ms,
        SUM(tokens_in + tokens_out) as total_tokens,
        SUM(tokens_in + tokens_out) * 0.0001 as cost_usd,
        AVG(confidence)::float as confidence,
        MAX(created_at) as updated_at
      FROM agent_request_log
      WHERE agent_id = ${agentId} AND created_at > ${cutoffTime}
      GROUP BY agent_id, model
      ORDER BY request_count DESC
    `
    return results.map((row: any) => ({
      agentId: row.agent_id,
      model: row.model,
      requestCount: Number(row.request_count),
      successRate: Number(row.success_rate),
      avgLatencyMs: Number(row.avg_latency_ms),
      totalTokens: Number(row.total_tokens),
      costUsd: Number(row.cost_usd),
      confidence: Number(row.confidence),
      updatedAt: new Date(row.updated_at)
    }))
  }
  async updateAgentConfidence(
    agentId: string,
    model: string,
    newScore: number
  ): Promise<void> {
    await this.db`
      INSERT INTO agent_confidence_scores (agent_id, model, score, sample_size)
      VALUES (${agentId}, ${model}, ${newScore}, 1)
      ON CONFLICT (agent_id, model)
      DO UPDATE SET
        score = ${newScore},
        sample_size = agent_confidence_scores.sample_size + 1,
        updated_at = NOW()
    `
  }
  async getAgentConfidence(agentId: string, model: string): Promise<PerAgentConfidence | null> {
    const results = await this.db`
      SELECT * FROM agent_confidence_scores
      WHERE agent_id = ${agentId} AND model = ${model}
    `
    if (results.length === 0) return null
    const row = results[0] as any
    return {
      agentId: row.agent_id,
      model: row.model,
      score: Number(row.score),
      sampleSize: Number(row.sample_size),
      lastUpdated: new Date(row.updated_at),
      trend: row.trend as 'improving' | 'stable' | 'degrading'
    }
  }
  async computePerAgentMetrics(agentId: string): Promise<AgentMetrics[]> {
    // Compute metrics for past 24 hours
    return this.getAgentMetrics(agentId, 24)
  }
  async getAgentCosts(days: number = 30): Promise<Map<string, number>> {
    const cutoffTime = new Date(Date.now() - days * 24 * 60 * 60 * 1000)
    const results = await this.db`
      SELECT
        agent_id,
        SUM(tokens_in + tokens_out) * 0.0001 as cost_usd
      FROM agent_request_log
      WHERE created_at > ${cutoffTime}
      GROUP BY agent_id
      ORDER BY cost_usd DESC
    `
    const costs = new Map<string, number>()
    for (const row of results as any[]) {
      costs.set(row.agent_id, Number(row.cost_usd))
    }
    return costs
  }
  async detectAnomalies(
    agentId: string,
    threshold: number = 2
  ): Promise<{ model: string; issue: string }[]> {
    const metrics = await this.getAgentMetrics(agentId, 24)
    const baseline = await this.getAgentMetrics(agentId, 24 * 30) // 30-day baseline
    const anomalies: { model: string; issue: string }[] = []
    for (const current of metrics) {
      const baselineMetric = baseline.find(m => m.model === current.model)
      if (!baselineMetric) continue
      // Check latency spike
      if (current.avgLatencyMs > baselineMetric.avgLatencyMs * (1 + threshold * 0.1)) {
        anomalies.push({
          model: current.model,
          issue: `Latency spike: ${current.avgLatencyMs}ms (baseline: ${baselineMetric.avgLatencyMs}ms)`
        })
      }
      // Check success rate drop
      if (current.successRate < baselineMetric.successRate * (1 - threshold * 0.1)) {
        anomalies.push({
          model: current.model,
          issue: `Success rate drop: ${(current.successRate * 100).toFixed(1)}% (baseline: ${(baselineMetric.successRate * 100).toFixed(1)}%)`
        })
      }
      // Check confidence drop
      if (current.confidence < baselineMetric.confidence * 0.8) {
        anomalies.push({
          model: current.model,
          issue: `Confidence degradation: ${current.confidence.toFixed(2)} (baseline: ${baselineMetric.confidence.toFixed(2)})`
        })
      }
    }
    return anomalies
  }
 }
 export default LearningIntegration
--- a/packages/learning-integration/src/metrics.ts
+++ b/packages/learning-integration/src/metrics.ts
@@ -1,248 +0,0 @@
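 // postgres.js has no `Client` or `sql` named export; the tagged-template
 // handle returned by postgres() is typed as `Sql`, aliased here to `Client`
 import type { Sql as Client } from 'postgres'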
 export interface DailyAgentCost {
  date: string
  agentId: string
  tokensIn: number
  tokensOut: number
  costUsd: number
 }
 export interface LatencySLO {
  agentId: string
  targetMs: number
  p50: number
  p95: number
  p99: number
  breached: boolean
 }
 export class PerAgentMetrics {
  constructor(private db: Client) {}
  async generateDailyCostReport(date: string): Promise<DailyAgentCost[]> {
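    // Assumes `date` is 'YYYY-MM-DD', which JavaScript parses as UTC midnight; the UTC
    // setter avoids slipping to the previous day in timezones behind UTC
    const startOfDay = new Date(date)
    startOfDay.setUTCHours(0, 0, 0, 0)
    // Exclusive upper bound: the query uses `<`, so 23:59:59.999 would drop the last millisecond
    const endOfDay = new Date(startOfDay.getTime() + 24 * 60 * 60 * 1000)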
    const results = await this.db`
      SELECT
        agent_id,
        SUM(tokens_in) as tokens_in,
        SUM(tokens_out) as tokens_out,
        (SUM(tokens_in) + SUM(tokens_out)) * 0.0001 as cost_usd
      FROM agent_request_log
      WHERE created_at >= ${startOfDay} AND created_at < ${endOfDay}
      GROUP BY agent_id
      ORDER BY cost_usd DESC
    `
    return results.map((row: any) => ({
      date,
      agentId: row.agent_id,
      tokensIn: Number(row.tokens_in),
      tokensOut: Number(row.tokens_out),
      costUsd: Number(row.cost_usd)
    }))
  }
  async checkLatencySLO(
    agentId: string,
    targetMs: number = 500
  ): Promise<LatencySLO> {
    const hour = new Date()
    hour.setHours(hour.getHours() - 1)
    const results = await this.db`
      SELECT
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY latency_ms) as p50,
        PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY latency_ms) as p95,
        PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY latency_ms) as p99
      FROM agent_request_log
      WHERE agent_id = ${agentId} AND created_at > ${hour}
    `
    if (results.length === 0) {
      return {
        agentId,
        targetMs,
        p50: 0,
        p95: 0,
        p99: 0,
        breached: false
      }
    }
    const row = results[0] as any
    const p99 = Number(row.p99 || 0)
    const breached = p99 > targetMs
    return {
      agentId,
      targetMs,
      p50: Number(row.p50 || 0),
      p95: Number(row.p95 || 0),
      p99,
      breached
    }
  }
  async getAgentSuccessRate(
    agentId: string,
    hours: number = 24
  ): Promise<{ total: number; successful: number; rate: number }> {
    const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000)
    const results = await this.db`
      SELECT
        COUNT(*) as total,
        COUNT(CASE WHEN success = true THEN 1 END) as successful
      FROM agent_request_log
      WHERE agent_id = ${agentId} AND created_at > ${cutoff}
    `
    if (results.length === 0) {
      return { total: 0, successful: 0, rate: 0 }
    }
    const row = results[0] as any
    const total = Number(row.total)
    const successful = Number(row.successful)
    return {
      total,
      successful,
      rate: total > 0 ? successful / total : 0
    }
  }
  async getFallbackRate(
    agentId: string,
    hours: number = 24
  ): Promise<{ fallbacks: number; total: number; rate: number }> {
    const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000)
    const results = await this.db`
      SELECT
        COUNT(*) as total,
        COUNT(CASE WHEN fallback_used = true THEN 1 END) as fallbacks
      FROM agent_request_log
      WHERE agent_id = ${agentId} AND created_at > ${cutoff}
    `
    if (results.length === 0) {
      return { fallbacks: 0, total: 0, rate: 0 }
    }
    const row = results[0] as any
    const total = Number(row.total)
    const fallbacks = Number(row.fallbacks)
    return {
      fallbacks,
      total,
      rate: total > 0 ? fallbacks / total : 0
    }
  }
  async getModelPerformance(
    agentId: string,
    model: string,
    hours: number = 24
  ): Promise<{
    model: string
    requests: number
    avgLatency: number
    successRate: number
    confidence: number
  }> {
    const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000)
    const results = await this.db`
      SELECT
        COUNT(*) as requests,
        AVG(latency_ms) as avg_latency,
        COUNT(CASE WHEN success = true THEN 1 END)::float / NULLIF(COUNT(*), 0) as success_rate,
        AVG(confidence) as confidence
      FROM agent_request_log
      WHERE agent_id = ${agentId} AND model = ${model} AND created_at > ${cutoff}
    `
    if (results.length === 0) {
      return { model, requests: 0, avgLatency: 0, successRate: 0, confidence: 0 }
    }
    const row = results[0] as any
    return {
      model,
      requests: Number(row.requests),
      avgLatency: Number(row.avg_latency || 0),
      successRate: Number(row.success_rate || 0),
      confidence: Number(row.confidence || 0)
    }
  }
  async compareAgentPerformance(): Promise<
    { agentId: string; requests: number; avgLatency: number; costUsd: number }[]
  > {
    const day = new Date()
    day.setDate(day.getDate() - 1)
    const startOfDay = new Date(day)
    startOfDay.setHours(0, 0, 0, 0)
    const results = await this.db`
      SELECT
        agent_id,
        COUNT(*) as requests,
        AVG(latency_ms) as avg_latency,
        (SUM(tokens_in) + SUM(tokens_out)) * 0.0001 as cost_usd
      FROM agent_request_log
      WHERE created_at >= ${startOfDay}
      GROUP BY agent_id
      ORDER BY requests DESC
    `
    return results.map((row: any) => ({
      agentId: row.agent_id,
      requests: Number(row.requests),
      avgLatency: Number(row.avg_latency || 0),
      costUsd: Number(row.cost_usd || 0)
    }))
  }
  async getAgentTrending(
    agentId: string,
    model: string
  ): Promise<{ trend: 'improving' | 'stable' | 'degrading'; signal: number }> {
    // Compare success rate: last 24h vs previous 24h
    const now24h = new Date(Date.now() - 24 * 60 * 60 * 1000)
    const now48h = new Date(Date.now() - 48 * 60 * 60 * 1000)
    // NULLIF turns an empty window into a NULL rate instead of a division-by-zero error
    const recent = await this.db`
      SELECT COUNT(CASE WHEN success = true THEN 1 END)::float / NULLIF(COUNT(*), 0) as rate
      FROM agent_request_log
      WHERE agent_id = ${agentId} AND model = ${model} AND created_at > ${now24h}
    `
    const previous = await this.db`
      SELECT COUNT(CASE WHEN success = true THEN 1 END)::float / NULLIF(COUNT(*), 0) as rate
      FROM agent_request_log
      WHERE agent_id = ${agentId} AND model = ${model}
        AND created_at > ${now48h} AND created_at <= ${now24h}
    `
    // Without requests in both windows there is nothing to compare, so report no trend
    if (recent[0]?.rate == null || previous[0]?.rate == null) return { trend: 'stable', signal: 0 }
    const recentRate = Number(recent[0].rate)
    const previousRate = Number(previous[0].rate)
    const signal = recentRate - previousRate
    let trend: 'improving' | 'stable' | 'degrading' = 'stable'
    if (signal > 0.05) trend = 'improving'
    if (signal < -0.05) trend = 'degrading'
    return { trend, signal }
  }
 }
 export default PerAgentMetrics
--- a/packages/learning-integration/tsconfig.json
+++ b/packages/learning-integration/tsconfig.json
@@ -1,12 +0,0 @@
 {
  "extends": "../../tsconfig.json",
  "compilerOptions": {
    "outDir": "./dist",
    "rootDir": "./src",
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist", "**/*.test.ts"]
 }