llm-gateway/deploy/cloudflare-tunnel.md
Rene Fichtmueller 3a00ff4d33 feat: initial llm-gateway implementation
2026-04-02 22:48:55 +02:00

Cloudflare Tunnel — LLM Gateway

Add the LLM Gateway to the existing Cloudflare Tunnel on the Erik server.

Current tunnel setup on Erik

Tunnels are managed by cloudflared running as a service. Config lives at:

~/.cloudflared/config.yml

or (if installed as root):

/etc/cloudflared/config.yml
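
To see which of the two locations is actually in use, a small helper can report the first candidate that exists. This is a minimal sketch: `first_existing_config` is a hypothetical name, and the system-wide path in the example assumes cloudflared's default `/etc/cloudflared` directory.

```shell
# Hypothetical helper: print the first existing file among the given candidate paths.
first_existing_config() {
  for candidate in "$@"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "no cloudflared config found" >&2
  return 1
}

# Example (run on Erik); the second path is an assumed default location:
#   first_existing_config "$HOME/.cloudflared/config.yml" /etc/cloudflared/config.yml
```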

Add llm-gateway ingress rule

Edit the config file and add the following before the catch-all http_status:404 rule:

ingress:
  # ... existing services ...

  - hostname: llm-gateway.context-x.org
    service: http://localhost:3100
    originRequest:
      connectTimeout: 10s
      noHappyEyeballs: false
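      # Keep idle keep-alive connections to the gateway for up to 130s
      # (cloudflared default: 90s). This applies to idle connections only;
      # it does not cap how long an active streaming response may run.
      keepAliveTimeout: 130s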

  # Catch-all (must be last)
  - service: http_status:404
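
Because cloudflared evaluates ingress rules top to bottom, rule order is the easiest thing to get wrong here. The sketch below is a rough text-level check that the hostname rule sits above the catch-all. `hostname_before_catch_all` is a hypothetical helper, it does not parse YAML, and it will also match commented-out lines, so treat `cloudflared tunnel ingress validate` on Erik as the authoritative check.

```shell
# Hypothetical helper: succeed when "hostname: <host>" appears on an earlier
# line than the last "service: http_status:404" in the given config file.
hostname_before_catch_all() {
  config="$1"
  host="$2"
  host_line=$(grep -n -F "hostname: $host" "$config" | head -n 1 | cut -d: -f1)
  catch_line=$(grep -n -F "service: http_status:404" "$config" | tail -n 1 | cut -d: -f1)
  [ -n "$host_line" ] && [ -n "$catch_line" ] && [ "$host_line" -lt "$catch_line" ]
}

# Example (run on Erik, adjust the config path as needed):
#   hostname_before_catch_all ~/.cloudflared/config.yml llm-gateway.context-x.org
#   cloudflared tunnel ingress validate
#   cloudflared tunnel ingress rule https://llm-gateway.context-x.org/health
```

The last command prints which ingress rule a given URL would match, which is a quick way to confirm the new rule is picked up rather than the 404 catch-all.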

DNS record

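In the Cloudflare Dashboard, open the DNS settings for context-x.org and add this record:

| Type  | Name        | Target                       | Proxy        |
|-------|-------------|------------------------------|--------------|
| CNAME | llm-gateway | <tunnel-id>.cfargotunnel.com | ON (proxied) |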

The tunnel ID can be found with:

ssh erik "cloudflared tunnel list"
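
Tunnel IDs are UUIDs, so the CNAME target can be built and sanity-checked before pasting it into the dashboard. The following is a sketch only; `cname_target` is a hypothetical helper name and the UUID in the example is a placeholder, not the real tunnel ID.

```shell
# Hypothetical helper: turn a tunnel UUID into its CNAME target,
# rejecting anything that does not look like a lowercase UUID.
cname_target() {
  tunnel_id="$1"
  if printf '%s\n' "$tunnel_id" | grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'; then
    printf '%s.cfargotunnel.com\n' "$tunnel_id"
  else
    echo "not a tunnel UUID: $tunnel_id" >&2
    return 1
  fi
}

# Example with a placeholder ID:
#   cname_target 123e4567-e89b-12d3-a456-426614174000
```

As an alternative to the dashboard, `cloudflared tunnel route dns <tunnel-name-or-id> llm-gateway.context-x.org` can create the same CNAME from the command line, provided the cloudflared login on Erik is authorised for the context-x.org zone.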

Reload tunnel

ssh erik "systemctl restart cloudflared"
# Verify:
curl https://llm-gateway.context-x.org/health/live
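
Right after a restart the tunnel may need a few seconds to reconnect, so a single curl can fail even when the setup is correct. A small retry wrapper makes the check less flaky. This is a sketch; `retry` is a hypothetical helper, and the attempt count and delay in the example are arbitrary.

```shell
# Hypothetical helper: run a command up to <attempts> times,
# sleeping <delay> seconds between failed attempts.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example: poll the liveness endpoint for up to roughly 30 seconds.
#   retry 10 3 curl -fsS --max-time 5 https://llm-gateway.context-x.org/health/live
```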

Verify routing

# From any machine:
curl https://llm-gateway.context-x.org/health

# Expected:
# {"status":"ok","ollama":{...},"queue":{...}}
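
For scripted checks, the response body can be tested for the expected status instead of being read by eye. The sketch below uses a plain pattern match; `health_is_ok` is a hypothetical helper, and because it does not parse JSON it would also accept a nested `"status":"ok"`, so a JSON-aware tool is the stricter option if one is available.

```shell
# Hypothetical helper: succeed when the given response body contains "status":"ok".
health_is_ok() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Example: fetch the health endpoint and fail loudly on anything unexpected.
#   body=$(curl -fsS --max-time 10 https://llm-gateway.context-x.org/health) \
#     && health_is_ok "$body" \
#     || echo "gateway health check failed" >&2
```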

Notes

  • The tunnel connects directly to localhost:3100 on Erik — nginx is not required.
  • Cloudflare handles TLS termination and DDoS protection automatically.
  • Rate limiting can be added via Cloudflare WAF rules on top of the gateway's built-in limits.