# Cloudflare Tunnel: LLM Gateway

Add the LLM Gateway to the existing Cloudflare Tunnel on the Erik server.

## Current tunnel setup on Erik

Tunnels are managed by `cloudflared`, which runs as a service. The config lives at:

```
~/.cloudflared/config.yml
```

or (if installed as root):

```
/etc/cloudflared/config.yml
```

## Add llm-gateway ingress rule

Edit the config file and add the following rule **before** the catch-all `http_status:404` rule. `cloudflared` matches ingress rules from top to bottom, so a rule placed after the catch-all is never reached:

```yaml
ingress:
  # ... existing services ...

  - hostname: llm-gateway.context-x.org
    service: http://localhost:3100
    originRequest:
      connectTimeout: 10s
      noHappyEyeballs: false
      # Idle keep-alive timeout for origin connections (default 90s),
      # raised to accommodate long-running LLM responses
      keepAliveTimeout: 130s

  # Catch-all (must be last)
  - service: http_status:404
```

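The ordering requirement above is easy to get wrong when a file already holds several services. `cloudflared tunnel ingress validate` checks the file as a whole; the sketch below is only a rough extra check for the one rule this page cares about. The function name `catch_all_is_last` is hypothetical, and the check assumes each rule keeps its `service:` key on its own line, as in the example above.

```bash
# Rough ordering check for a cloudflared config file (hypothetical helper,
# not part of cloudflared). Succeeds only if the last non-comment line that
# contains "service:" is the catch-all "- service: http_status:404" entry.
catch_all_is_last() (
  config_file="$1"
  # Drop comment lines, then keep the final line that mentions "service:".
  last_service="$(grep -v '^[[:space:]]*#' "$config_file" | grep 'service:' | tail -n 1)"
  printf '%s\n' "$last_service" |
    grep -Eq '^[[:space:]]*-[[:space:]]*service:[[:space:]]*http_status:404[[:space:]]*$'
)

# Example (the path is an assumption; use whichever config file applies):
#   catch_all_is_last "$HOME/.cloudflared/config.yml" && echo "catch-all is last"
```

The function returns a non-zero status when the file is missing, has no `service:` entries, or ends with a rule other than the catch-all, so it can gate a restart in a deploy script.
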
## DNS record

In the Cloudflare Dashboard → DNS → context-x.org:

| Type  | Name        | Target                         | Proxy |
|-------|-------------|--------------------------------|-------|
| CNAME | llm-gateway | `<tunnel-id>.cfargotunnel.com` | ON    |

The tunnel ID can be found with:

```bash
ssh erik "cloudflared tunnel list"
```

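To avoid typos when copying the ID into the dashboard, the CNAME target can be assembled in the shell. This is a minimal sketch: `cname_target` is a hypothetical helper, and it assumes the tunnel ID is the UUID shown by `cloudflared tunnel list`. As an alternative to editing the dashboard by hand, `cloudflared tunnel route dns <tunnel> <hostname>` may be able to create the record directly.

```bash
# Build the CNAME target for a tunnel ID (hypothetical helper).
# Tunnel IDs are UUIDs; anything else is rejected with a non-zero status.
cname_target() (
  tunnel_id="$1"
  if printf '%s\n' "$tunnel_id" |
    grep -Eiq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'; then
    printf '%s.cfargotunnel.com\n' "$tunnel_id"
  else
    echo "not a tunnel UUID: $tunnel_id" >&2
    exit 1
  fi
)

# Example: pass the ID copied from the tunnel list.
#   cname_target "<tunnel-id>"
```
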
## Reload tunnel

```bash
ssh erik "systemctl restart cloudflared"
# Verify:
curl https://llm-gateway.context-x.org/health/live
```

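Right after a restart the tunnel may need a few seconds to reconnect, so a single `curl` can fail even though the setup is correct. A small retry wrapper, sketched here with a hypothetical `retry` helper, polls the liveness endpoint instead of relying on one attempt. The attempt count and delay in the example are arbitrary assumptions.

```bash
# Run a command until it succeeds, up to a maximum number of attempts
# (hypothetical helper). Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() (
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      exit 0
    fi
    # Pause between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  exit 1
)

# Example: poll the liveness endpoint up to 10 times, 3 seconds apart.
#   retry 10 3 curl -fsS https://llm-gateway.context-x.org/health/live
```

With `-f`, `curl` exits non-zero on HTTP error responses, so the loop keeps retrying until the gateway actually answers with a success status.
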
## Verify routing

```bash
# From any machine:
curl https://llm-gateway.context-x.org/health

# Expected:
# {"status":"ok","ollama":{...},"queue":{...}}
```

## Notes

- The tunnel connects directly to `localhost:3100` on Erik, so nginx is **not** required.
- Cloudflare handles TLS termination and DDoS protection automatically.
- Rate limiting can be added via Cloudflare WAF rules on top of the gateway's built-in limits.