Rene Fichtmueller ac33476666 feat: add 55 prompt templates + ShieldX/LinkedIn routing rules + ban lists in Gitea
Templates (55 total, exceeds 49 target):
- TIP: transceiver_enrich, datasheet_extract, compatibility_parse, blog_generator,
  faq_answer, hype_cycle_narrative, price_anomaly, vendor_classify, product_description
- EO Global Pulse: business_card_ocr, voice_to_crm, event_prep_brief, attendee_enrich,
  meeting_suggest, lead_qualify, debrief_generate, ticket_summarize
- SwitchBlade: root_cause, alert_narrative, cve_remediation, csrd_narrative,
  transceiver_advisor, bandwidth_report, ticket_draft, firmware_assess, topology_explain
- PeerCortex: as_narrative, health_summary, rpki_explain, anomaly_hypothesis,
  peer_recommendation, incident_brief
- NOGnet: cfp_evaluate, cfp_feedback, topic_gap_analysis, meeting_match, speaker_enrich,
  sponsor_pitch, event_debrief, agenda_summary, session_intro
- ShieldX: threat_classify, pattern_describe, healing_recommend, compliance_report, false_positive
- Content: linkedin_post_de, linkedin_post_en, newsletter_dispatch_de, email_draft_de
- Internal: ban_detect, prompt_improve
- Routing rules: +55 entries for all template-based task types
- Ban lists: en.csv, de.csv, auto.csv created in Gitea (llm-banlists repo)
2026-04-02 23:14:30 +02:00

LLM Gateway

Centralized AI inference layer for all Context X projects. Routes requests to local Ollama models on Mac Studio (192.168.178.169), validates outputs with ShieldX, and records all interactions for the self-improving learning engine.

Port: 3100
Production: http://llm-gateway.context-x.org (Cloudflare Tunnel → Erik)


Architecture

Projects (TIP, EO Pulse, SwitchBlade, PeerCortex, NOGnet, CtxEvent)
    ↓  @llm-gateway/client
LLM Gateway :3100
    ├── Prompt Engine   (versioned templates per task_type)
    ├── ShieldX Guard   (prompt injection validation)
    ├── Ollama Router   (model tier selection: 3b / 14b / 32b / 70b)
    └── Learning Engine (feedback loop, self-improvement)
         ↓
    PostgreSQL (llm_gateway DB)
    Ollama     (Mac Studio :11434)

Prerequisites

Dependency   Version   Notes
Node.js      22+       node --version
PostgreSQL   17        Local or remote
Ollama       latest    Running on Mac Studio .169
PM2          latest    npm install -g pm2 (Erik)

1. Local Development Setup

# Clone
git clone http://gitea.context-x.org/rene/llm-gateway.git
cd llm-gateway

# Install all workspace dependencies
npm install

# Copy and configure environment
cp .env.example .env
# Edit .env: set DATABASE_URL, OLLAMA_URL at minimum

# Initialize database
bash scripts/init-db.sh

# Pull required Ollama models (runs against OLLAMA_URL from .env)
bash scripts/pull-models.sh

# Start gateway
npm run dev

# In a separate terminal: start learning engine
npm run learning

Gateway is available at http://localhost:3100.


2. Environment Variables

See .env.example for all variables with descriptions.

Variable          Required  Default              Description
DATABASE_URL      YES       -                    PostgreSQL DSN for llm_gateway
TIP_DATABASE_URL  NO        -                    TIP DB (read-only)
OLLAMA_URL        YES       http://...169:11434  Ollama inference server
SHIELDX_URL       NO        -                    ShieldX endpoint (leave blank to skip)
PORT              NO        3100                 HTTP port
LOG_LEVEL         NO        info                 error / warn / info / debug
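
The table above can be enforced at startup. The following is a minimal sketch, not the gateway's actual loader: `GatewayConfig`, `loadConfig`, and the camelCase field names are hypothetical, and it assumes that OLLAMA_URL must be set explicitly because its default is abbreviated in the table.

```typescript
// Field names are hypothetical; they mirror the variables in the table above.
type LogLevel = "error" | "warn" | "info" | "debug";

interface GatewayConfig {
  databaseUrl: string;
  ollamaUrl: string;
  tipDatabaseUrl?: string;
  shieldxUrl?: string; // undefined means ShieldX validation is skipped
  port: number;
  logLevel: LogLevel;
}

const LOG_LEVELS: readonly LogLevel[] = ["error", "warn", "info", "debug"];

function loadConfig(env: Record<string, string | undefined>): GatewayConfig {
  // Fail fast on the variables the table marks as required.
  const required = (name: string): string => {
    const value = env[name];
    if (!value) throw new Error(`Missing required environment variable: ${name}`);
    return value;
  };

  // Blank or unset PORT falls back to the documented default of 3100.
  const port = Number(env["PORT"] || "3100");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env["PORT"]}"`);
  }

  // Blank or unset LOG_LEVEL falls back to the documented default of info.
  const rawLevel = env["LOG_LEVEL"] || "info";
  const logLevel = LOG_LEVELS.find((level) => level === rawLevel);
  if (logLevel === undefined) {
    throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(" / ")}, got "${rawLevel}"`);
  }

  return {
    databaseUrl: required("DATABASE_URL"),
    ollamaUrl: required("OLLAMA_URL"),
    tipDatabaseUrl: env["TIP_DATABASE_URL"] || undefined,
    shieldxUrl: env["SHIELDX_URL"] || undefined, // blank is treated as "skip"
    port,
    logLevel,
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)`, so a misconfigured deployment stops at boot rather than on the first request.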

3. Running Migrations

# Full init (create DB + user + run all migrations)
bash scripts/init-db.sh

# Custom Postgres host (e.g. Erik)
PGHOST=217.154.82.179 PGPORT=5432 bash scripts/init-db.sh

Migration files live in:

  • packages/gateway/src/db/migrations/001_initial.sql
  • packages/learning/src/db/migrations/002_learning.sql

4. Pulling Ollama Models

bash scripts/pull-models.sh

# Against a different Ollama instance:
OLLAMA_URL=http://localhost:11434 bash scripts/pull-models.sh

Required models:

Model            Tier       Use case
qwen2.5:3b       Fast       Low-complexity, sub-second tasks
qwen2.5:14b      Medium     Standard completions
qwen2.5:32b      Large      Complex analysis
deepseek-r1:14b  Reasoning  Step-by-step logic
llama3.3:70b     Premium    Best quality, used sparingly
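
The tier table can be expressed as a lookup. This is a sketch under assumptions, not the Ollama Router's real code: the `Tier` type, `MODEL_BY_TIER`, and `resolveModel` are hypothetical names, and the list of installed models is passed in by the caller rather than queried from Ollama.

```typescript
type Tier = "fast" | "medium" | "large" | "reasoning" | "premium";

// Tier names are lowercased from the table; the map itself is an assumption.
const MODEL_BY_TIER: Record<Tier, string> = {
  fast: "qwen2.5:3b",
  medium: "qwen2.5:14b",
  large: "qwen2.5:32b",
  reasoning: "deepseek-r1:14b",
  premium: "llama3.3:70b",
};

// `installed` is supplied by the caller, for example from the model list
// reported by the Ollama instance behind OLLAMA_URL.
function resolveModel(tier: Tier, installed: readonly string[]): string {
  const model = MODEL_BY_TIER[tier];
  if (installed.indexOf(model) === -1) {
    throw new Error(
      `Model ${model} (tier "${tier}") is not installed; run scripts/pull-models.sh`,
    );
  }
  return model;
}
```

Failing with a pointer to scripts/pull-models.sh makes a missing model visible immediately instead of surfacing later as an inference error.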

5. API Usage

Completion

curl -X POST http://localhost:3100/v1/completion \
  -H "Content-Type: application/json" \
  -d '{
    "caller": "my-project",
    "task_type": "summarize",
    "input": "Long document text here...",
    "language": "en"
  }'

Response:

{
  "request_id": "uuid",
  "status": "approved",
  "output": "Summary...",
  "confidence": 0.92,
  "model_used": "qwen2.5:14b",
  "prompt_version": "summarize/v2",
  "token_count": { "input": 512, "output": 128 },
  "latency_ms": 1240
}
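
A caller will usually want to gate on both status and confidence. The sketch below types the response shown above and applies such a gate; `CompletionResponse` and `approvedOutput` are hypothetical names, the 0.8 confidence floor is an arbitrary illustration, and status values other than "approved" are not documented in this README.

```typescript
// Shape copied from the sample response above.
interface CompletionResponse {
  request_id: string;
  status: string; // only "approved" appears in this README
  output: string;
  confidence: number;
  model_used: string;
  prompt_version: string;
  token_count: { input: number; output: number };
  latency_ms: number;
}

// Returns the output only when the gateway approved it and the confidence
// clears the caller's floor; otherwise returns null.
function approvedOutput(res: CompletionResponse, minConfidence = 0.8): string | null {
  if (res.status !== "approved") return null;
  if (res.confidence < minConfidence) return null;
  return res.output;
}
```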

Classify input

curl -X POST http://localhost:3100/v1/classify \
  -H "Content-Type: application/json" \
  -d '{ "caller": "my-project", "input": "What transceivers work with Cisco ASR9k?" }'

Health

curl http://localhost:3100/health
curl http://localhost:3100/health/live   # liveness probe (k8s / Docker)
curl http://localhost:3100/health/ready  # readiness probe
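
Scripts that start the gateway and then call it can poll the readiness probe first. The following is a minimal sketch: `waitUntilReady` is a hypothetical helper, and the probe is injected so the example does not depend on how /health/ready signals readiness (a 2xx status is assumed in the comment, which this README does not confirm).

```typescript
// Polls `probe` until it resolves to true or the attempts run out.
// Example probe (assumes a 2xx response means ready):
//   () => fetch("http://localhost:3100/health/ready").then((r) => r.ok)
async function waitUntilReady(
  probe: () => Promise<boolean>,
  attempts = 10,
  delayMs = 500,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await probe()) return true;
    } catch {
      // Connection errors are treated as "not ready yet".
    }
    // Wait between attempts, but not after the last one.
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```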

6. Project-specific Client Usage

Install the client in any workspace project:

npm install @llm-gateway/client

TIP (Transceiver Intelligence Platform)

import { createTIPClient } from '@llm-gateway/client';

const llm = createTIPClient(); // reads LLM_GATEWAY_URL from env

const result = await llm.completion({
  task_type: 'extract_specs',
  input: rawHtml,
  context: { vendor: 'Cisco', sku: 'SFP-10G-SR' },
});

if (result.status === 'approved') {
  console.log(result.output);
}

EO Global Pulse

import { createEOPulseClient } from '@llm-gateway/client';

const llm = createEOPulseClient();

// Safe completion: returns null when gateway is down (graceful degradation)
const result = await llm.safeCompletion({
  task_type: 'meeting_summary',
  input: transcriptText,
  language: 'de',
});
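
Because safeCompletion can return null, callers need a fallback path. The sketch below shows one way to handle that; `safeCall` is a hypothetical stand-in for what such a wrapper might do internally (the client's real implementation is not shown in this README), and `outputOr` is a hypothetical caller-side helper.

```typescript
// Hypothetical stand-in for a "safe" wrapper: any failure becomes null.
async function safeCall<T>(call: () => Promise<T>): Promise<T | null> {
  try {
    return await call();
  } catch {
    return null; // gateway unreachable, timed out, or returned an error
  }
}

// Caller-side handling: use the fallback when there is no approved output.
function outputOr(
  result: { status: string; output: string } | null,
  fallback: string,
): string {
  return result !== null && result.status === "approved" ? result.output : fallback;
}
```

With this pattern a feature such as meeting summaries can show a placeholder while the gateway is down instead of failing the whole request.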

SwitchBlade

import { createSwitchBladeClient } from '@llm-gateway/client';

const llm = createSwitchBladeClient();

const { batch_id } = await llm.batch(
  tasks.map(t => ({ task_type: 'analyze_alert', input: t.raw })),
  'http://switchblade.context-x.org/webhooks/llm-batch',
);

Custom client (any project)

import { LLMGatewayClient } from '@llm-gateway/client';

const llm = new LLMGatewayClient({
  caller: 'my-service',
  baseUrl: process.env.LLM_GATEWAY_URL,
  timeout: 20_000,
});

7. Deployment to Erik

One-command deploy (from local Mac)

bash deploy/deploy.sh

# Skip local build (if already built):
bash deploy/deploy.sh --skip-build

# Health check only:
bash deploy/deploy.sh --health-only

First-time setup on Erik

# SSH to Erik
ssh root@217.154.82.179

# Run setup script (idempotent — safe to re-run)
cd /opt/llm-gateway
bash deploy/setup-erik.sh

PM2 management

ssh erik "pm2 status"
ssh erik "pm2 logs llm-gateway"
ssh erik "pm2 logs llm-learning"
ssh erik "pm2 restart llm-gateway"
ssh erik "pm2 monit"

8. Monitoring

Prometheus metrics

GET http://localhost:3100/metrics

Grafana

Metrics are scraped by the existing Prometheus instance. Import the dashboard from deploy/grafana-dashboard.json (if present).

Key metrics to watch

Metric                       Alert threshold
gateway_request_latency_p99  > 5 000 ms
gateway_error_rate           > 5%
ollama_queue_depth           > 20
learning_feedback_lag        > 1 h
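
The thresholds above can also be checked in code, for example in a smoke test. This is a sketch under stated assumptions: `THRESHOLDS` and `breachedAlerts` are hypothetical, and the units (milliseconds, a 0 to 1 fraction for the error rate, seconds for the feedback lag) are assumptions, since the README gives only the alert values.

```typescript
type MetricName =
  | "gateway_request_latency_p99"
  | "gateway_error_rate"
  | "ollama_queue_depth"
  | "learning_feedback_lag";

// Alert values from the table above, converted to the assumed units.
const THRESHOLDS: Record<MetricName, number> = {
  gateway_request_latency_p99: 5000, // ms
  gateway_error_rate: 0.05, // 5% as a fraction
  ollama_queue_depth: 20, // queued requests
  learning_feedback_lag: 3600, // 1 h in seconds
};

// Returns the metrics whose current value is strictly above its threshold.
function breachedAlerts(sample: Record<MetricName, number>): MetricName[] {
  const names = Object.keys(THRESHOLDS) as MetricName[];
  return names.filter((name) => sample[name] > THRESHOLDS[name]);
}
```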

Log locations (Erik)

/var/log/llm-gateway/out.log           # gateway stdout
/var/log/llm-gateway/error.log         # gateway stderr
/var/log/llm-gateway/learning-out.log  # learning engine stdout
/var/log/llm-gateway/learning-error.log

9. Cloudflare Tunnel

See deploy/cloudflare-tunnel.md for instructions to expose the gateway via https://llm-gateway.context-x.org.


10. Docker (alternative to PM2)

# Build and start all services
cp .env.example .env   # fill in DATABASE_URL and OLLAMA_URL at minimum
docker compose up -d

# Check status
docker compose ps
docker compose logs llm-gateway

# Stop
docker compose down

Repository structure

llm-gateway/
├── packages/
│   ├── gateway/         # Core HTTP server (Express + Ollama + ShieldX)
│   │   ├── src/
│   │   │   ├── server.ts
│   │   │   ├── routes/
│   │   │   ├── db/
│   │   │   │   └── migrations/
│   │   │   └── prompts/
│   │   └── prompts/     # Versioned prompt templates
│   ├── learning/        # Self-improving feedback engine
│   │   └── src/
│   └── client/          # @llm-gateway/client TypeScript library
│       └── src/index.ts
├── deploy/
│   ├── setup-erik.sh       # First-time server setup
│   ├── deploy.sh           # One-command local → Erik deploy
│   ├── ecosystem.config.cjs # PM2 config
│   ├── nginx.conf          # Optional nginx reverse proxy
│   └── cloudflare-tunnel.md
├── scripts/
│   ├── init-db.sh          # Database initialization
│   └── pull-models.sh      # Pull Ollama models
├── Dockerfile
├── docker-compose.yaml
├── .env.example
└── package.json            # npm workspaces root
Description
Unified LLM orchestration layer for TIP, EO Global Pulse, PeerCortex, SwitchBlade, NOGnet, ShieldX