Cloudflare Tunnel — LLM Gateway
Add the LLM Gateway to the existing Cloudflare Tunnel on the Erik server.
Current tunnel setup on Erik
Tunnels are managed by cloudflared running as a service. Config lives at:
~/.cloudflared/config.yml
or (if installed as root):
/etc/cloudflared/config.yml
Add llm-gateway ingress rule
Edit the config file and add the following before the catch-all http_status:404 rule:
ingress:
  # ... existing services ...
  - hostname: llm-gateway.context-x.org
    service: http://localhost:3100
    originRequest:
      connectTimeout: 10s
      noHappyEyeballs: false
      # Keep idle connections to the gateway open for 130s (cloudflared's default is 90s)
      keepAliveTimeout: 130s
  # Catch-all (must be last)
  - service: http_status:404
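Rule order matters here, so it can be worth checking the file before restarting the service. The following is a minimal sketch, not part of the gateway repo: `last_ingress_service` is a hypothetical helper that prints the service of the last ingress rule, assuming each rule has exactly one `service:` line as in the snippet above. The two `cloudflared tunnel ingress` subcommands in the comments are provided by cloudflared and read the config from its default location.

```shell
# Hypothetical helper: print the service of the last ingress rule in a config file.
# Assumes one "service:" line per rule and ignores commented-out lines.
last_ingress_service() {
  grep -E '^[[:space:]]*(-[[:space:]]+)?service:' "$1" \
    | tail -n 1 \
    | sed -E 's/^[[:space:]]*(-[[:space:]]+)?service:[[:space:]]*//'
}

# Usage on Erik (path is an assumption, adjust to where the config lives):
#   last_ingress_service ~/.cloudflared/config.yml    # should print http_status:404
#
# cloudflared's own checks (require cloudflared on the machine):
#   cloudflared tunnel ingress validate
#   cloudflared tunnel ingress rule https://llm-gateway.context-x.org
```

If the helper prints anything other than http_status:404, the catch-all is probably no longer the last rule.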
DNS record
In Cloudflare Dashboard → DNS → context-x.org:
| Type | Name | Target | Proxy |
|---|---|---|---|
| CNAME | llm-gateway | <tunnel-id>.cfargotunnel.com | ON |
The tunnel ID can be found with:
ssh erik "cloudflared tunnel list"
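For reference, the CNAME target is just the tunnel ID with the cfargotunnel.com suffix. A small sketch, where `cname_target` is a hypothetical helper and the UUID in the comment is a placeholder rather than a real tunnel. As an alternative to the dashboard, `cloudflared tunnel route dns` can create the record from the command line; the tunnel name or ID is left as a placeholder here.

```shell
# Hypothetical helper: build the CNAME target from a tunnel ID.
cname_target() {
  printf '%s.cfargotunnel.com\n' "$1"
}

# Usage with a placeholder ID (take the real one from "cloudflared tunnel list"):
#   cname_target "00000000-0000-0000-0000-000000000000"
#
# Alternative: let cloudflared create the CNAME record itself.
#   ssh erik "cloudflared tunnel route dns <tunnel-name-or-id> llm-gateway.context-x.org"
```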
Reload tunnel
ssh erik "systemctl restart cloudflared"
# Verify:
curl https://llm-gateway.context-x.org/health/live
Verify routing
# From any machine:
curl https://llm-gateway.context-x.org/health
# Expected:
# {"status":"ok","ollama":{...},"queue":{...}}
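Right after a restart the tunnel may need a few seconds to reconnect, so a single curl can fail even when the setup is fine. The following is a rough sketch of a retrying check, assuming the /health response contains "status":"ok" as shown above. The function names, attempt count, and delay are illustrative and not part of the gateway.

```shell
# Returns 0 if the JSON body on stdin reports "status":"ok".
status_is_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Poll the health endpoint until it reports ok or the attempts run out.
# Arguments (all optional): URL, number of attempts, delay in seconds.
wait_for_gateway() {
  wfg_url="${1:-https://llm-gateway.context-x.org/health}"
  wfg_attempts="${2:-10}"
  wfg_delay="${3:-3}"
  wfg_i=1
  while [ "$wfg_i" -le "$wfg_attempts" ]; do
    # -f: fail on HTTP errors, -sS: quiet but still print errors
    if curl -fsS --max-time 5 "$wfg_url" | status_is_ok; then
      echo "gateway healthy after $wfg_i attempt(s)"
      return 0
    fi
    sleep "$wfg_delay"
    wfg_i=$((wfg_i + 1))
  done
  echo "gateway not healthy after $wfg_attempts attempts" >&2
  return 1
}

# Usage, from any machine:
#   wait_for_gateway
```

The string match avoids a dependency on a JSON parser. A stricter check could parse the body and inspect the ollama and queue fields as well.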
Notes
- The tunnel connects directly to localhost:3100 on Erik, so nginx is not required.
- Cloudflare handles TLS termination and DDoS protection automatically.
- Rate limiting can be added via Cloudflare WAF rules on top of the gateway's built-in limits.