10-layer defense pipeline with kill chain mapping, self-healing, self-learning, and compliance reporting. Local-first, zero cloud deps.
- 72 detection rules across 7 kill chain phases
- 294 unit tests, 500+ attack corpus samples
- Management dashboard (Next.js 15, 10 pages)
- Automated resistance testing (2x daily, 31 probes)
- MITRE ATLAS, OWASP LLM Top 10, EU AI Act compliance
- Integrations: Next.js middleware, Ollama, n8n
- PostgreSQL 17 + pgvector for persistent learning
_____ _ _ _ _ __ __
/ ____| | (_) | | | |\ \/ /
| (___ | |__ _ ___| | __| | \ /
\___ \| '_ \| |/ _ \ |/ _` | / \
____) | | | | | __/ | (_| |/ /\ \
|_____/|_| |_|_|\___|_|\__,_/_/ \_\
ShieldX - Self-Evolving LLM Prompt Injection Defense
The first open-source LLM security library that learns from attacks, heals itself, and maps threats to a 7-phase kill chain.
ShieldX protects Claude, GPT, Ollama, and any LLM API from prompt injection, jailbreaks, data exfiltration, and tool poisoning. It runs 100% locally with zero mandatory cloud dependencies.
Dashboard
Real-time overview with KPIs, kill chain distribution, and incident feed. Every scan result shows threat level, matched patterns, and the exact defense layer that caught it.
Live Prompt Tester
Test any prompt against the defense pipeline in real-time. See exactly which rules fired, confidence scores, and kill chain classification.
Promptware Kill Chain
Maps every detected attack to the Schneier 2026 Promptware Kill Chain with 7 phases: Initial Access, Privilege Escalation, Reconnaissance, Persistence, Command & Control, Lateral Movement, Actions on Objective.
Why ShieldX?
| Feature | ShieldX | LLM Guard | Rebuff | NeMo Guardrails |
|---|---|---|---|---|
| Kill Chain Mapping | 7 phases | No | No | No |
| Self-Learning | Drift + Active Learning | No | Vector only | No |
| Self-Healing | Per-phase strategies | No | No | No |
| Self-Testing | Red team mutations | No | No | No |
| MCP/Tool Protection | Full guard | No | No | No |
| Compliance | MITRE + OWASP + EU AI Act | No | No | No |
| Local-First | 100% | Partial | Partial | Yes |
| Latency | <2ms (rules) | ~50ms | ~100ms | ~200ms |
Quick Start
import { ShieldX } from '@shieldx/core'
const shield = new ShieldX()
const result = await shield.scanInput('Ignore all previous instructions')
console.log(result.detected) // true
console.log(result.threatLevel) // 'critical'
console.log(result.killChainPhase) // 'initial_access'
console.log(result.action) // 'block'
console.log(result.latencyMs) // 0.2
10-Layer Defense Pipeline
| Layer | Name | Function | Latency |
|---|---|---|---|
| L0 | Preprocessing | Unicode normalization, tokenizer attacks, compressed payloads | <0.5ms |
| L1 | Rule Engine | 72 regex patterns across 7 kill chain phases | <2ms |
| L2 | Sentinel Phrases | Tripwire detection for system prompt probing | <1ms |
| L3 | Constitutional AI | LLM-based classification (optional, via Ollama) | ~100ms |
| L4 | Embeddings | Semantic similarity via Ollama + pgvector | ~200ms |
| L5 | Entropy Analysis | Shannon entropy + attention pattern detection | <1ms |
| L6 | Behavioral | Conversation tracking, intent monitoring, context integrity | <5ms |
| L7 | MCP Guard | Tool privilege checking, chain analysis, resource budgets | <1ms |
| L8 | Sanitization | Input/output cleaning, PPA, credential redaction | <1ms |
| L9 | Self-Consciousness | Meta-reasoning about own vulnerability state | ~50ms |
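To make the L5 row more concrete, here is a minimal sketch of a Shannon entropy check over the characters of a prompt. It is not ShieldX's actual implementation: the function names, the 4.5 bits/char threshold, and the 32-character minimum length are all assumptions chosen for illustration.

```typescript
// Shannon entropy in bits per character: H = -sum(p_i * log2(p_i)).
// Illustrative sketch only; names and thresholds are hypothetical.
function shannonEntropy(text: string): number {
  const chars = Array.from(text) // split by Unicode code point
  if (chars.length === 0) return 0

  const counts = new Map<string, number>()
  for (const ch of chars) {
    counts.set(ch, (counts.get(ch) ?? 0) + 1)
  }

  let entropy = 0
  counts.forEach((count) => {
    const p = count / chars.length
    entropy -= p * Math.log2(p)
  })
  return entropy
}

// Hypothetical heuristic: long, high-entropy inputs may carry encoded payloads.
const ENTROPY_THRESHOLD = 4.5 // assumed value, not taken from ShieldX
const MIN_LENGTH = 32 // assumed value, not taken from ShieldX

function looksEncoded(text: string): boolean {
  return text.length >= MIN_LENGTH && shannonEntropy(text) > ENTROPY_THRESHOLD
}
```

Natural-language text usually has lower per-character entropy than base64 or compressed data, which is why a cheap check like this can run in well under a millisecond. Any real threshold would need tuning against benign traffic.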
The 7-Phase Promptware Kill Chain
- Initial Access - Instruction override, delimiter injection
- Privilege Escalation - Jailbreaks, DAN, role switching
- Reconnaissance - System prompt extraction, scope probing
- Persistence - Memory poisoning, context manipulation
- Command & Control - Fake system messages, dynamic instruction loading
- Lateral Movement - Agent-to-agent spread, external resource access
- Actions on Objective - Data exfiltration, code execution, denial of service
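The mapping from a matched rule to a kill chain phase can be sketched as a small rule table. This is a hedged illustration, not the shipped rule set: the three patterns and the rule IDs are invented, and only the `initial_access` identifier appears in the Quick Start output; the other phase identifiers are assumed by analogy.

```typescript
// Phase identifiers: 'initial_access' comes from the Quick Start output;
// the remaining names are assumptions following the same convention.
type KillChainPhase =
  | 'initial_access'
  | 'privilege_escalation'
  | 'reconnaissance'
  | 'persistence'
  | 'command_and_control'
  | 'lateral_movement'
  | 'actions_on_objective'

interface Rule {
  id: string
  phase: KillChainPhase
  pattern: RegExp
}

// Hypothetical example rules; ShieldX's real patterns are not shown in this README.
const RULES: readonly Rule[] = [
  {
    id: 'override-01',
    phase: 'initial_access',
    pattern: /ignore\s+(all\s+)?(previous|prior)\s+instructions/i,
  },
  {
    id: 'jailbreak-01',
    phase: 'privilege_escalation',
    pattern: /\byou\s+are\s+now\s+DAN\b/i,
  },
  {
    id: 'recon-01',
    phase: 'reconnaissance',
    pattern: /(reveal|print|show)\s+(your\s+)?system\s+prompt/i,
  },
]

// Returns every rule whose pattern matches the input.
// Patterns deliberately omit the /g flag so RegExp.test stays stateless.
function matchRules(input: string): Rule[] {
  return RULES.filter((rule) => rule.pattern.test(input))
}
```

Calling `matchRules('Ignore all previous instructions')` returns the single `override-01` rule, so the input would be classified under `initial_access`.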
Self-Evolution Engine
ShieldX doesn't just detect attacks -- it gets smarter from every one:
- Concept Drift Detection - CUSUM algorithm detects when attack patterns shift
- Active Learning - Uncertain results queued for human review (~6% sample rate)
- Red Team Engine - GAN-style mutation generates attack variants to self-test
- Attack Graph - Maps technique evolution and relationships
- Federated Sync - Opt-in community pattern sharing (privacy-preserving, hash-only)
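For readers unfamiliar with CUSUM, the following is a minimal one-sided CUSUM sketch. It assumes the monitored signal is some per-window statistic such as the fraction of uncertain scan results; what ShieldX actually monitors, and its parameter values, are not stated here, so every name and number below is hypothetical.

```typescript
// One-sided (upward) CUSUM: S_t = max(0, S_{t-1} + (x_t - targetMean - slack)).
// An alarm is raised the first time S_t exceeds the threshold.
interface CusumOptions {
  targetMean: number // expected value of the signal while patterns are stable
  slack: number // allowance k: deviations smaller than this are ignored
  threshold: number // decision limit h
}

// Returns the index of the first sample that triggers the alarm, or -1 if none does.
function detectUpwardDrift(samples: readonly number[], opts: CusumOptions): number {
  let cumulative = 0
  for (let i = 0; i < samples.length; i++) {
    const deviation = samples[i] - opts.targetMean - opts.slack
    cumulative = Math.max(0, cumulative + deviation)
    if (cumulative > opts.threshold) return i
  }
  return -1
}

// Hypothetical usage: the signal jumps from 0.1 to 0.4 at index 2.
const options: CusumOptions = { targetMean: 0.1, slack: 0.05, threshold: 0.6 }
const alarmIndex = detectUpwardDrift([0.1, 0.1, 0.4, 0.4, 0.4, 0.4], options)
console.log(alarmIndex) // 4
```

The slack term keeps small fluctuations from accumulating, while the threshold trades detection delay against false alarms. Here the shift starts at index 2 and the alarm fires at index 4, after three elevated samples.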
Automated Resistance Testing
Built-in scheduled testing runs 31 probes across all 7 kill chain phases:
- 2x daily automated runs (configurable schedule)
- 6 mutation strategies: synonym replacement, case scrambling, whitespace insertion, base64 encoding, leet speak, unicode substitution
- Results tracked in dashboard with trend visualization
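Four of the six mutation strategies are simple string transforms and can be sketched as below. This is an illustrative approximation, not the code the resistance tester runs: the function names, the leet and homoglyph tables, and the deterministic alternating-case scheme are assumptions. Synonym replacement and base64 encoding are omitted to keep the example dependency-free.

```typescript
type Mutation = (prompt: string) => string

// Hypothetical substitution tables; the real mappings may differ.
const LEET: Record<string, string> = { a: '4', e: '3', i: '1', o: '0', s: '5', t: '7' }
const HOMOGLYPHS: Record<string, string> = {
  a: '\u0430', // Cyrillic small a
  e: '\u0435', // Cyrillic small ie
  o: '\u043E', // Cyrillic small o
}

// Leet speak: replace selected letters with look-alike digits.
const leetSpeak: Mutation = (p) =>
  p.replace(/[aeiost]/gi, (ch) => LEET[ch.toLowerCase()] ?? ch)

// Case scrambling: alternate lower/upper case (deterministic for reproducible tests).
const caseScramble: Mutation = (p) =>
  Array.from(p)
    .map((ch, i) => (i % 2 === 0 ? ch.toLowerCase() : ch.toUpperCase()))
    .join('')

// Whitespace insertion: put a space between every pair of characters.
const whitespaceInsert: Mutation = (p) => Array.from(p).join(' ')

// Unicode substitution: swap lowercase Latin letters for Cyrillic look-alikes.
const unicodeSubstitute: Mutation = (p) =>
  p.replace(/[aeo]/g, (ch) => HOMOGLYPHS[ch] ?? ch)

const STRATEGIES: Record<string, Mutation> = {
  leetSpeak,
  caseScramble,
  whitespaceInsert,
  unicodeSubstitute,
}

// Produce one variant per strategy for a given probe.
function mutateAll(prompt: string): { strategy: string; variant: string }[] {
  return Object.entries(STRATEGIES).map(([strategy, mutate]) => ({
    strategy,
    variant: mutate(prompt),
  }))
}
```

Each variant can then be fed back through the scanner; a variant that is no longer detected points to a gap in the rules or in L0 preprocessing.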
Compliance
- MITRE ATLAS - Maps to ML attack techniques
- OWASP LLM Top 10 2025 - Covers all 10 risk categories
- EU AI Act - Articles 9, 12, 14, 15 compliance reporting
Dashboard Pages
| Page | Description |
|---|---|
| Overview | KPIs, kill chain heatmap, incident feed |
| Kill Chain | 7-phase visualization with drill-down |
| Incidents | Filterable incident log with badges |
| Learning | Pattern stats, drift detection, FP rate |
| Compliance | MITRE/OWASP/EU AI Act coverage |
| Healing | Self-healing action log |
| Resistance | Automated defense testing with scheduling |
| Config | Scanner toggles, thresholds |
| Try It | Live prompt tester |
Integration
Next.js 15 Middleware
import { guardPrompt } from '@shieldx/core/guard'
// In any API route:
const blocked = await guardPrompt(userInput)
if (blocked) return Response.json({ error: blocked }, { status: 400 })
Ollama
import { createOllamaClient } from '@shieldx/core/ollama'
const ollama = createOllamaClient({
endpoint: 'http://localhost:11434',
model: 'llama3.2',
shieldx: shield
})
// All calls automatically scanned
n8n
Copy integrations/n8n-shieldx-node.js to ~/.n8n/custom/nodes/ and add the ShieldX node before any AI node in your workflow.
Installation
npm install @shieldx/core
With PostgreSQL (recommended for production):
# Start PostgreSQL with pgvector
docker compose up -d
# Run migrations
npm run db:migrate
# Seed initial patterns
npm run db:seed
Without PostgreSQL (in-memory mode):
const shield = new ShieldX({
learning: { storageBackend: 'memory' }
})
Benchmarks
Run with npm run benchmark:
Total Samples: 324
Attack Samples: 283
Benign Samples: 41
True Positive Rate (TPR): 32.9% (rule-engine only, no ML)
False Positive Rate (FPR): 2.4%
Latency avg: 0.06ms
Latency p99: 0.33ms
TPR increases significantly when the embedding (L4) and behavioral (L6) scanners are enabled with Ollama, although this rule-engine-only benchmark run does not measure that configuration.
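As a worked check of the figures above, the rates follow from the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The raw counts are not reported in the benchmark output; the values 93 and 1 below are inferred as the only integers that round to 32.9% of 283 and 2.4% of 41, and should be treated as an assumption.

```typescript
interface ConfusionCounts {
  truePositives: number
  falseNegatives: number
  falsePositives: number
  trueNegatives: number
}

function rates(c: ConfusionCounts): { tpr: number; fpr: number } {
  return {
    tpr: c.truePositives / (c.truePositives + c.falseNegatives),
    fpr: c.falsePositives / (c.falsePositives + c.trueNegatives),
  }
}

// Inferred counts (assumption): 93 of 283 attacks detected, 1 of 41 benign samples flagged.
const inferred: ConfusionCounts = {
  truePositives: 93,
  falseNegatives: 283 - 93,
  falsePositives: 1,
  trueNegatives: 41 - 1,
}

const { tpr, fpr } = rates(inferred)
console.log(`TPR: ${(tpr * 100).toFixed(1)}%`) // TPR: 32.9%
console.log(`FPR: ${(fpr * 100).toFixed(1)}%`) // FPR: 2.4%
```

With only 41 benign samples, each false positive moves the FPR by about 2.4 percentage points, so the reported FPR is a coarse estimate.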
Performance Targets
| Metric | Target | Achieved |
|---|---|---|
| L1 Rule Engine | <2ms | 0.06ms |
| Full pipeline (no ML) | <50ms | <2ms |
| Embedding scan | <200ms | Depends on Ollama |
| False Positive Rate | <5% | 2.4% |
Project Structure
shieldx/
src/
core/ # ShieldX orchestrator, config, logger
types/ # TypeScript type definitions
detection/ # L1-L5 scanners + rules
preprocessing/ # L0 Unicode, tokenizer, compression
sanitization/ # L8 input/output cleaning, PPA
behavioral/ # L6 conversation, intent, context
mcp-guard/ # L7 tool validation, privilege check
validation/ # Canary tokens, output validation
healing/ # Self-healing strategies per phase
learning/ # Pattern store, drift, active learning
compliance/ # MITRE ATLAS, OWASP, EU AI Act
integrations/ # Next.js, Ollama, n8n wrappers
tests/
unit/ # 294 unit tests
attack-corpus/ # 500+ attack samples
dashboard/ # @shieldx/dashboard React components
app/ # Standalone Next.js dashboard
scripts/ # Seed, benchmark, self-test, deploy
Tech Stack
- TypeScript strict mode, zero `any`
- Node.js 20+
- PostgreSQL 17 + pgvector for persistent learning
- Ollama for local embeddings (nomic-embed-text) and guard model
- Vitest for testing
- tsup for building
- Next.js 15 for dashboard
License
Apache 2.0 - See LICENSE
Context X
ShieldX is a Context X Open Source project.
More Engineering, Less Bullshit.


