```
  _____ _     _      _     _ __  __
 / ____| |   (_)    | |   | |\ \/ /
| (___ | |__  _  ___| | __| | \  /
 \___ \| '_ \| |/ _ \ |/ _` | /  \
 ____) | | | | |  __/ | (_| |/ /\ \
|_____/|_| |_|_|\___|_|\__,_/_/  \_\
```

# ShieldX - Self-Evolving LLM Prompt Injection Defense

**The first open-source LLM security library that learns from attacks, heals itself, and maps threats to a 7-phase kill chain.**

ShieldX protects Claude, GPT, Ollama, and any LLM API from prompt injection, jailbreaks, data exfiltration, and tool poisoning. It runs 100% locally with zero mandatory cloud dependencies.

[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![TypeScript](https://img.shields.io/badge/TypeScript-strict-blue.svg)](https://www.typescriptlang.org/)
[![Node.js](https://img.shields.io/badge/Node.js-20+-green.svg)](https://nodejs.org/)

---

## Dashboard

![ShieldX Defense Center](docs/screenshots/dashboard-overview.png)

Real-time overview with KPIs, kill chain distribution, and incident feed. Every scan result shows the threat level, the matched patterns, and the exact defense layer that caught it.

## Live Prompt Tester

![Try It - Threat Detection](docs/screenshots/try-it-scan.png)

Test any prompt against the defense pipeline in real time. See exactly which rules fired, along with confidence scores and the kill chain classification.

## Promptware Kill Chain

![Kill Chain Mapping](docs/screenshots/kill-chain.png)

Maps every detected attack to the Schneier 2026 Promptware Kill Chain with 7 phases: Initial Access, Privilege Escalation, Reconnaissance, Persistence, Command & Control, Lateral Movement, Actions on Objective.

---

## Why ShieldX?

| Feature | ShieldX | LLM Guard | Rebuff | NeMo Guardrails |
|---------|---------|-----------|--------|-----------------|
| Kill Chain Mapping | 7 phases | No | No | No |
| Self-Learning | Drift + Active Learning | No | Vector only | No |
| Self-Healing | Per-phase strategies | No | No | No |
| Self-Testing | Red team mutations | No | No | No |
| MCP/Tool Protection | Full guard | No | No | No |
| Compliance | MITRE + OWASP + EU AI Act | No | No | No |
| Local-First | 100% | Partial | Partial | Yes |
| Latency | <2ms (rules) | ~50ms | ~100ms | ~200ms |

## Quick Start

```typescript
import { ShieldX } from '@shieldx/core'

const shield = new ShieldX()
const result = await shield.scanInput('Ignore all previous instructions')

console.log(result.detected)       // true
console.log(result.threatLevel)    // 'critical'
console.log(result.killChainPhase) // 'initial_access'
console.log(result.action)         // 'block'
console.log(result.latencyMs)      // 0.2
```

## 10-Layer Defense Pipeline

| Layer | Name | Function | Latency |
|-------|------|----------|---------|
| L0 | Preprocessing | Unicode normalization, tokenizer attacks, compressed payloads | <0.5ms |
| L1 | Rule Engine | 72 regex patterns across 7 kill chain phases | <2ms |
| L2 | Sentinel Phrases | Tripwire detection for system prompt probing | <1ms |
| L3 | Constitutional AI | LLM-based classification (optional, via Ollama) | ~100ms |
| L4 | Embeddings | Semantic similarity via Ollama + pgvector | ~200ms |
| L5 | Entropy Analysis | Shannon entropy + attention pattern detection | <1ms |
| L6 | Behavioral | Conversation tracking, intent monitoring, context integrity | <5ms |
| L7 | MCP Guard | Tool privilege checking, chain analysis, resource budgets | <1ms |
| L8 | Sanitization | Input/output cleaning, PPA, credential redaction | <1ms |
| L9 | Self-Consciousness | Meta-reasoning about own vulnerability state | ~50ms |

## The 7-Phase Promptware Kill Chain

1. **Initial Access** - Instruction override, delimiter injection
2. **Privilege Escalation** - Jailbreaks, DAN, role switching
3. **Reconnaissance** - System prompt extraction, scope probing
4. **Persistence** - Memory poisoning, context manipulation
5. **Command & Control** - Fake system messages, dynamic instruction loading
6. **Lateral Movement** - Agent-to-agent spread, external resource access
7. **Actions on Objective** - Data exfiltration, code execution, denial of service

## Self-Evolution Engine

ShieldX does more than detect attacks: it learns from each one.

- **Concept Drift Detection** - CUSUM algorithm detects when attack patterns shift
- **Active Learning** - Uncertain results queued for human review (~6% sample rate)
- **Red Team Engine** - GAN-style mutation generates attack variants to self-test
- **Attack Graph** - Maps technique evolution and relationships
- **Federated Sync** - Opt-in community pattern sharing (privacy-preserving, hash-only)

## Automated Resistance Testing

Built-in scheduled testing runs 31 probes across all 7 kill chain phases:

- 2x daily automated runs (configurable schedule)
- 6 mutation strategies: synonym replacement, case scrambling, whitespace insertion, base64 encoding, leet speak, unicode substitution
- Results tracked in dashboard with trend visualization

## Compliance

- **MITRE ATLAS** - Maps to ML attack techniques
- **OWASP LLM Top 10 2025** - Covers all 10 risk categories
- **EU AI Act** - Articles 9, 12, 14, 15 compliance reporting

## Dashboard Pages

| Page | Description |
|------|-------------|
| Overview | KPIs, kill chain heatmap, incident feed |
| Kill Chain | 7-phase visualization with drill-down |
| Incidents | Filterable incident log with badges |
| Learning | Pattern stats, drift detection, FP rate |
| Compliance | MITRE/OWASP/EU AI Act coverage |
| Healing | Self-healing action log |
| Resistance | Automated defense testing with scheduling |
| Config | Scanner toggles, thresholds |
| Try It | Live prompt tester |

## Integration

### Next.js 15 Middleware

```typescript
import { guardPrompt } from '@shieldx/core/guard'

// In any API route:
const blocked = await guardPrompt(userInput)
if (blocked) return Response.json({ error: blocked }, { status: 400 })
```

### Ollama

```typescript
import { createOllamaClient } from '@shieldx/core/ollama'

// `shield` is a ShieldX instance, such as the one created in Quick Start
const ollama = createOllamaClient({
  endpoint: 'http://localhost:11434',
  model: 'llama3.2',
  shieldx: shield
})
// All calls automatically scanned
```

### n8n

Copy `integrations/n8n-shieldx-node.js` to `~/.n8n/custom/nodes/` and add the ShieldX node before any AI node in your workflow.

## Installation

```bash
npm install @shieldx/core
```

### With PostgreSQL (recommended for production):

```bash
# Start PostgreSQL with pgvector
docker compose up -d

# Run migrations
npm run db:migrate

# Seed initial patterns
npm run db:seed
```

### Without PostgreSQL (in-memory mode):

```typescript
import { ShieldX } from '@shieldx/core'

const shield = new ShieldX({
  learning: { storageBackend: 'memory' }
})
```

## Benchmarks

Run with `npm run benchmark`:

```
Total Samples:             324
Attack Samples:            283
Benign Samples:            41
True Positive Rate (TPR):  32.9% (rule-engine only, no ML)
False Positive Rate (FPR): 2.4%
Latency avg:               0.06ms
Latency p99:               0.33ms
```

*TPR increases significantly when embedding (L4) and behavioral (L6) scanners are enabled with Ollama.*

## Performance Targets

| Metric | Target | Achieved |
|--------|--------|----------|
| L1 Rule Engine | <2ms | 0.06ms |
| Full pipeline (no ML) | <50ms | <2ms |
| Embedding scan | <200ms | Depends on Ollama |
| False Positive Rate | <5% | 2.4% |

## Project Structure

```
shieldx/
  src/
    core/           # ShieldX orchestrator, config, logger
    types/          # TypeScript type definitions
    detection/      # L1-L5 scanners + rules
    preprocessing/  # L0 Unicode, tokenizer, compression
    sanitization/   # L8 input/output cleaning, PPA
    behavioral/     # L6 conversation, intent, context
    mcp-guard/      # L7 tool validation, privilege check
    validation/     # Canary tokens, output validation
    healing/        # Self-healing strategies per phase
    learning/       # Pattern store, drift, active learning
    compliance/     # MITRE ATLAS, OWASP, EU AI Act
    integrations/   # Next.js, Ollama, n8n wrappers
  tests/
    unit/           # 294 unit tests
    attack-corpus/  # 500+ attack samples
  dashboard/        # @shieldx/dashboard React components
  app/              # Standalone Next.js dashboard
  scripts/          # Seed, benchmark, self-test, deploy
```

## Tech Stack

- **TypeScript** strict mode, zero `any`
- **Node.js 20+**
- **PostgreSQL 17** + pgvector for persistent learning
- **Ollama** for local embeddings (nomic-embed-text) and guard model
- **Vitest** for testing
- **tsup** for building
- **Next.js 15** for dashboard

## License

Apache 2.0 - See [LICENSE](LICENSE)

## Context X

ShieldX is a [Context X](https://context-x.org) Open Source project.

*More Engineering, Less Bullshit.*
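
## Appendix: Entropy Check Sketch

The pipeline table describes the L5 layer only as "Shannon entropy + attention pattern detection". To illustrate the entropy half, the sketch below computes character-level Shannon entropy, H = -Σ p·log2(p), and flags long, high-entropy strings, which is one common heuristic for spotting encoded payloads. This is not ShieldX's implementation: `shannonEntropy`, `looksEncoded`, and both default thresholds are hypothetical names and placeholder values chosen for this example.

```typescript
// Illustrative sketch only; not part of the @shieldx/core API.

// Shannon entropy in bits per character, counted over Unicode code points.
function shannonEntropy(text: string): number {
  const chars = Array.from(text) // splits by code point, not UTF-16 unit
  if (chars.length === 0) return 0

  // Count how often each character occurs.
  const counts = new Map<string, number>()
  for (const ch of chars) {
    counts.set(ch, (counts.get(ch) ?? 0) + 1)
  }

  // H = -sum(p * log2(p)) over the observed character frequencies.
  let entropy = 0
  counts.forEach((count) => {
    const p = count / chars.length
    entropy -= p * Math.log2(p)
  })
  return entropy
}

// Hypothetical heuristic: long strings with unusually uniform character
// usage may be encoded payloads. Both defaults are placeholders that a
// real deployment would need to tune against its own traffic.
function looksEncoded(
  text: string,
  thresholdBits = 4.5,
  minLength = 24
): boolean {
  return Array.from(text).length >= minLength && shannonEntropy(text) > thresholdBits
}

console.log(shannonEntropy('aaaa')) // 0
console.log(shannonEntropy('abcd')) // 2
console.log(looksEncoded('abcd'))   // false (shorter than minLength)
```

The length floor matters because short strings give unreliable entropy estimates; a score like this is best treated as one signal among several rather than a verdict on its own.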