llm-gateway/deploy/cloudflare-tunnel.md
Rene Fichtmueller 3a00ff4d33 feat: initial llm-gateway implementation
2026-04-02 22:48:55 +02:00


# Cloudflare Tunnel — LLM Gateway
Add the LLM Gateway to the existing Cloudflare Tunnel on the Erik server.
## Current tunnel setup on Erik
Tunnels are managed by `cloudflared` running as a service. Config lives at:
```
~/.cloudflared/config.yml
```
or (if installed as root):
```
/etc/cloudflared/config.yml
```
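To find out which of these files is actually present on Erik, a small check like the one below can help. This is a minimal sketch: the function name `find_cloudflared_config` is hypothetical, and the candidate paths are passed in by the caller rather than hard-coded.

```bash
# Hypothetical helper: print the first existing file among the given candidates.
find_cloudflared_config() {
  local candidate
  for candidate in "$@"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "no cloudflared config found" >&2
  return 1
}
```

On Erik, call it with the paths from this section, for example `find_cloudflared_config ~/.cloudflared/config.yml` followed by the root-level path as a second argument.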
## Add llm-gateway ingress rule
Edit the config file and add the following **before** the catch-all `http_status:404` rule:
```yaml
ingress:
  # ... existing services ...
  - hostname: llm-gateway.context-x.org
    service: http://localhost:3100
    originRequest:
      connectTimeout: 10s
      noHappyEyeballs: false
      # Keep idle keep-alive connections to the origin open for up to 130s
      keepAliveTimeout: 130s
  # Catch-all (must be last)
  - service: http_status:404
```
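Before restarting the service, the edited rules can be checked with cloudflared's own `tunnel ingress validate` and `tunnel ingress rule` subcommands. The wrapper below is only a sketch: the function name `check_gateway_ingress` is hypothetical, and it assumes cloudflared finds the config file in one of its default locations.

```bash
# Hypothetical wrapper around two cloudflared subcommands; run it on Erik.
check_gateway_ingress() {
  local url="${1:-https://llm-gateway.context-x.org}"
  # Fail early if the ingress section of the config is malformed.
  cloudflared tunnel ingress validate || return 1
  # Print the ingress rule that would handle requests for this URL.
  cloudflared tunnel ingress rule "$url"
}
```

If the second command reports the catch-all rule instead of the `llm-gateway.context-x.org` entry, the new rule was most likely placed after `http_status:404`.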
## DNS record
In Cloudflare Dashboard → DNS → context-x.org:
| Type | Name | Target | Proxy |
|-------|-------------|-------------------------------|-------|
| CNAME | llm-gateway | `<tunnel-id>.cfargotunnel.com` | ON |
The tunnel ID can be found with:
```bash
ssh erik "cloudflared tunnel list"
```
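As an alternative to creating the record by hand, `cloudflared tunnel route dns` can create the CNAME for a tunnel. The sketch below assumes that cloudflared on Erik is logged in to the `context-x.org` zone; the function name `route_gateway_dns` is hypothetical.

```bash
# Hypothetical helper: point the gateway hostname at an existing tunnel.
route_gateway_dns() {
  local tunnel="${1:-}"   # tunnel name or ID, as shown by `cloudflared tunnel list`
  local hostname="${2:-llm-gateway.context-x.org}"
  if [ -z "$tunnel" ]; then
    echo "usage: route_gateway_dns <tunnel-name-or-id> [hostname]" >&2
    return 2
  fi
  cloudflared tunnel route dns "$tunnel" "$hostname"
}
```

Run it on Erik as `route_gateway_dns <tunnel-id>`, substituting the ID from the list command.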
## Reload tunnel
```bash
ssh erik "systemctl restart cloudflared"
# Verify:
curl https://llm-gateway.context-x.org/health/live
```
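The tunnel may need a few seconds to reconnect after a restart, so a single `curl` right away can fail even when the configuration is fine. A retry loop along these lines is one way to handle that; the function name `wait_for_gateway` and the default attempt count and delay are assumptions, not part of the gateway.

```bash
# Hypothetical helper: poll the liveness endpoint until it answers or attempts run out.
wait_for_gateway() {
  local url="${1:-https://llm-gateway.context-x.org/health/live}"
  local attempts="${2:-10}"
  local delay="${3:-3}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors (>= 400) as failure; -sS: quiet, but still print errors
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "gateway reachable after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gateway not reachable after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_gateway` with no arguments polls the liveness URL up to ten times, three seconds apart.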
## Verify routing
```bash
# From any machine:
curl https://llm-gateway.context-x.org/health
# Expected:
# {"status":"ok","ollama":{...},"queue":{...}}
```
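For scripted checks, the `status` field can be tested instead of reading the response by eye. The sketch below uses a plain text match on the JSON shown above; the function name `gateway_status_ok` is hypothetical, and a JSON-aware tool would be the stricter choice.

```bash
# Hypothetical helper: succeed only if /health answers and reports "status":"ok".
gateway_status_ok() {
  local url="${1:-https://llm-gateway.context-x.org/health}"
  local body
  body="$(curl -fsS --max-time 10 "$url")" || return 1
  # Textual match that tolerates optional whitespace around the colon.
  printf '%s' "$body" | grep -q '"status"[[:space:]]*:[[:space:]]*"ok"'
}
```

A typical use is `gateway_status_ok && echo "routing OK"`.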
## Notes
- The tunnel connects directly to `localhost:3100` on Erik — nginx is **not** required.
- Cloudflare handles TLS termination and DDoS protection automatically.
- Rate limiting can be added via Cloudflare WAF rules on top of the gateway's built-in limits.