```
   _____ _     _      _     _ __  __
  / ____| |   (_)    | |   | |\ \/ /
 | (___ | |__  _  ___| | __| | \  /
  \___ \| '_ \| |/ _ \ |/ _` | /  \
  ____) | | | | |  __/ | (_| |/ /\ \
 |_____/|_| |_|_|\___|_|\__,_/_/  \_\
```

# ShieldX - Self-Evolving LLM Prompt Injection Defense

**The first open-source LLM security library that learns from attacks, heals itself, and maps threats to a 7-phase kill chain.**

ShieldX protects applications built on Claude, GPT, Ollama, or any other LLM API from prompt injection, jailbreaks, data exfiltration, and tool poisoning. It runs 100% locally with zero mandatory cloud dependencies.

[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![TypeScript](https://img.shields.io/badge/TypeScript-strict-blue.svg)](https://www.typescriptlang.org/)
[![Node.js](https://img.shields.io/badge/Node.js-20+-green.svg)](https://nodejs.org/)

---

## Dashboard

![ShieldX Defense Center](docs/screenshots/dashboard-overview.png)

Real-time overview with KPIs, kill chain distribution, and incident feed. Every scan result shows threat level, matched patterns, and the exact defense layer that caught it.

## Live Prompt Tester

![Try It - Threat Detection](docs/screenshots/try-it-scan.png)

Test any prompt against the defense pipeline in real time. See exactly which rules fired, along with confidence scores and the kill chain classification.

## Promptware Kill Chain

![Kill Chain Mapping](docs/screenshots/kill-chain.png)

Maps every detected attack to the Schneier 2026 Promptware Kill Chain with 7 phases: Initial Access, Privilege Escalation, Reconnaissance, Persistence, Command & Control, Lateral Movement, Actions on Objective.

---

## Why ShieldX?

| Feature | ShieldX | LLM Guard | Rebuff | NeMo Guardrails |
|---------|---------|-----------|--------|-----------------|
| Kill Chain Mapping | 7 phases | No | No | No |
| Self-Learning | Drift + Active Learning | No | Vector only | No |
| Self-Healing | Per-phase strategies | No | No | No |
| Self-Testing | Red team mutations | No | No | No |
| MCP/Tool Protection | Full guard | No | No | No |
| Compliance | MITRE + OWASP + EU AI Act | No | No | No |
| Local-First | 100% | Partial | Partial | Yes |
| Latency | <2ms (rules) | ~50ms | ~100ms | ~200ms |

## Quick Start

```typescript
import { ShieldX } from '@shieldx/core'

const shield = new ShieldX()
const result = await shield.scanInput('Ignore all previous instructions')

console.log(result.detected)        // true
console.log(result.threatLevel)     // 'critical'
console.log(result.killChainPhase)  // 'initial_access'
console.log(result.action)          // 'block'
console.log(result.latencyMs)       // 0.2
```

## 10-Layer Defense Pipeline

| Layer | Name | Function | Latency |
|-------|------|----------|---------|
| L0 | Preprocessing | Unicode normalization, tokenizer attacks, compressed payloads | <0.5ms |
| L1 | Rule Engine | 72 regex patterns across 7 kill chain phases | <2ms |
| L2 | Sentinel Phrases | Tripwire detection for system prompt probing | <1ms |
| L3 | Constitutional AI | LLM-based classification (optional, via Ollama) | ~100ms |
| L4 | Embeddings | Semantic similarity via Ollama + pgvector | ~200ms |
| L5 | Entropy Analysis | Shannon entropy + attention pattern detection | <1ms |
| L6 | Behavioral | Conversation tracking, intent monitoring, context integrity | <5ms |
| L7 | MCP Guard | Tool privilege checking, chain analysis, resource budgets | <1ms |
| L8 | Sanitization | Input/output cleaning, PPA, credential redaction | <1ms |
| L9 | Self-Consciousness | Meta-reasoning about own vulnerability state | ~50ms |
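
As an illustration of the entropy check in L5, here is a minimal sketch of Shannon entropy over the characters of an input. The function names and the 4.5 bits-per-character threshold are assumptions made for this example, not ShieldX's actual implementation or defaults.

```typescript
// Shannon entropy in bits per character: H = -sum(p_i * log2(p_i)).
function shannonEntropy(text: string): number {
  // Split by code point so surrogate pairs count as one symbol
  const symbols = Array.from(text)
  if (symbols.length === 0) return 0

  const counts = new Map<string, number>()
  for (const ch of symbols) {
    counts.set(ch, (counts.get(ch) ?? 0) + 1)
  }

  let entropy = 0
  counts.forEach((count) => {
    const p = count / symbols.length
    entropy -= p * Math.log2(p)
  })
  return entropy
}

// Hypothetical threshold; tune it against your own traffic
const ENTROPY_THRESHOLD = 4.5

// Flags inputs whose character distribution looks closer to encoded data than prose
function looksEncoded(text: string, threshold: number = ENTROPY_THRESHOLD): boolean {
  return shannonEntropy(text) > threshold
}

console.log(shannonEntropy('aaaa')) // 0
console.log(shannonEntropy('abcd')) // 2
```

Short inputs cap out at low entropy regardless of content, so a check like this is only meaningful on longer payloads and works best alongside the other layers.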

## The 7-Phase Promptware Kill Chain

1. **Initial Access** - Instruction override, delimiter injection
2. **Privilege Escalation** - Jailbreaks, DAN, role switching
3. **Reconnaissance** - System prompt extraction, scope probing
4. **Persistence** - Memory poisoning, context manipulation
5. **Command & Control** - Fake system messages, dynamic instruction loading
6. **Lateral Movement** - Agent-to-agent spread, external resource access
7. **Actions on Objective** - Data exfiltration, code execution, denial of service
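
To show how a regex rule can carry its kill chain phase, here is a minimal sketch. Only `initial_access` appears elsewhere in this README; the other phase identifiers assume the same snake_case convention, and the three patterns are illustrative rather than ShieldX's shipped rules.

```typescript
type KillChainPhase =
  | 'initial_access'
  | 'privilege_escalation'
  | 'reconnaissance'
  | 'persistence'
  | 'command_and_control'
  | 'lateral_movement'
  | 'actions_on_objective'

interface Rule {
  id: string
  phase: KillChainPhase
  pattern: RegExp
}

// Illustrative patterns only. No `g` flag, so `test` keeps no state between calls.
const RULES: Rule[] = [
  {
    id: 'override-instructions',
    phase: 'initial_access',
    pattern: /ignore (all )?(previous|prior) instructions/i,
  },
  {
    id: 'dan-jailbreak',
    phase: 'privilege_escalation',
    pattern: /\byou are (now )?DAN\b/i,
  },
  {
    id: 'prompt-extraction',
    phase: 'reconnaissance',
    pattern: /(reveal|print|show) (your )?system prompt/i,
  },
]

// Returns every rule whose pattern matches the input
function matchRules(input: string): Rule[] {
  return RULES.filter((rule) => rule.pattern.test(input))
}

const hits = matchRules('Ignore all previous instructions')
console.log(hits.map((rule) => rule.phase)) // [ 'initial_access' ]
```

Tagging each rule with a phase at definition time means a match is classified for free, with no second lookup step.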

## Self-Evolution Engine

ShieldX doesn't just detect attacks; it learns from each one:

- **Concept Drift Detection** - CUSUM algorithm detects when attack patterns shift
- **Active Learning** - Uncertain results queued for human review (~6% sample rate)
- **Red Team Engine** - GAN-style mutation generates attack variants to self-test
- **Attack Graph** - Maps technique evolution and relationships
- **Federated Sync** - Opt-in community pattern sharing (privacy-preserving, hash-only)
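
For readers unfamiliar with CUSUM, here is a minimal sketch of the one-sided variant, which accumulates deviations above an expected mean and raises an alarm once the sum crosses a limit. The option names, the parameter values, and the monitored metric (a hypothetical daily miss rate) are assumptions for illustration; ShieldX's own detector may differ.

```typescript
interface CusumOptions {
  target: number    // expected mean of the monitored metric
  slack: number     // allowance k: deviations below this are treated as noise
  threshold: number // decision limit h: alarm when the cumulative sum exceeds it
}

// One-sided CUSUM: S_t = max(0, S_{t-1} + (x_t - target - slack)).
// Returns the index of the first observation that triggers an alarm, or -1.
function detectUpwardDrift(values: number[], opts: CusumOptions): number {
  let cumulative = 0
  for (const [i, x] of values.entries()) {
    cumulative = Math.max(0, cumulative + (x - opts.target - opts.slack))
    if (cumulative > opts.threshold) return i
  }
  return -1
}

// Hypothetical miss rates: stable at 0.1, then shifting to 0.4
const missRates = [0.1, 0.1, 0.1, 0.1, 0.4, 0.4, 0.4, 0.4]
const alarmAt = detectUpwardDrift(missRates, { target: 0.1, slack: 0.05, threshold: 0.6 })
console.log(alarmAt) // 6
```

The slack term keeps small fluctuations from accumulating, so the alarm fires only after a sustained shift rather than on a single outlier.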

## Automated Resistance Testing

Built-in scheduled testing runs 31 probes across all 7 kill chain phases:
- 2x daily automated runs (configurable schedule)
- 6 mutation strategies: synonym replacement, case scrambling, whitespace insertion, base64 encoding, leet speak, unicode substitution
- Results tracked in dashboard with trend visualization
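
Three of the six mutation strategies can be sketched in a few lines. The function names are hypothetical, and case scrambling is made deterministic here (alternating by position) so the output is reproducible; a real red team engine would more likely randomize it.

```typescript
type Mutation = (input: string) => string

// Case scrambling: alternate upper and lower case by position
const caseScramble: Mutation = (input) =>
  Array.from(input)
    .map((ch, i) => (i % 2 === 0 ? ch.toUpperCase() : ch.toLowerCase()))
    .join('')

// Whitespace insertion: put a space between every character
const whitespaceInsert: Mutation = (input) => Array.from(input).join(' ')

// Leet speak: swap common letters for look-alike digits
const LEET: Record<string, string> = { a: '4', e: '3', i: '1', o: '0', s: '5', t: '7' }
const leetSpeak: Mutation = (input) =>
  input.replace(/[aeiost]/gi, (ch) => LEET[ch.toLowerCase()] ?? ch)

const MUTATIONS: Record<string, Mutation> = { caseScramble, whitespaceInsert, leetSpeak }

// Apply every strategy to one seed attack and label each variant
function generateVariants(seed: string): { strategy: string; variant: string }[] {
  return Object.entries(MUTATIONS).map(([strategy, mutate]) => ({
    strategy,
    variant: mutate(seed),
  }))
}

console.log(leetSpeak('ignore instructions')) // 1gn0r3 1n57ruc710n5
console.log(caseScramble('ignore'))           // IgNoRe
console.log(whitespaceInsert('abc'))          // a b c
```

Feeding each variant back through the scanner shows which mutations slip past the current rules, which is the point of self-testing.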

## Compliance

- **MITRE ATLAS** - Maps to ML attack techniques
- **OWASP Top 10 for LLM Applications 2025** - Covers all 10 risk categories
- **EU AI Act** - Articles 9, 12, 14, 15 compliance reporting

## Dashboard Pages

| Page | Description |
|------|-------------|
| Overview | KPIs, kill chain heatmap, incident feed |
| Kill Chain | 7-phase visualization with drill-down |
| Incidents | Filterable incident log with badges |
| Learning | Pattern stats, drift detection, FP rate |
| Compliance | MITRE/OWASP/EU AI Act coverage |
| Healing | Self-healing action log |
| Resistance | Automated defense testing with scheduling |
| Config | Scanner toggles, thresholds |
| Try It | Live prompt tester |

## Integration

### Next.js 15 Middleware

```typescript
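import { guardPrompt } from '@shieldx/core/guard'

// In any API route, e.g. app/api/chat/route.ts (path and `prompt` field are illustrative):
export async function POST(request: Request) {
  const { prompt: userInput } = await request.json()
  // A truthy result means the prompt was rejected; it is returned as the error
  const blocked = await guardPrompt(userInput)
  if (blocked) return Response.json({ error: blocked }, { status: 400 })
  // ...otherwise forward userInput to your LLM
}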
```

### Ollama

```typescript
import { ShieldX } from '@shieldx/core'
import { createOllamaClient } from '@shieldx/core/ollama'

const shield = new ShieldX()

const ollama = createOllamaClient({
  endpoint: 'http://localhost:11434',
  model: 'llama3.2',
  shieldx: shield
})
// All calls made through `ollama` are scanned automatically
```

### n8n

Copy `integrations/n8n-shieldx-node.js` to `~/.n8n/custom/nodes/` and add the ShieldX node before any AI node in your workflow.

## Installation

```bash
npm install @shieldx/core
```

### With PostgreSQL (recommended for production)

```bash
# Start PostgreSQL with pgvector
docker compose up -d

# Run migrations
npm run db:migrate

# Seed initial patterns
npm run db:seed
```

### Without PostgreSQL (in-memory mode)

```typescript
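import { ShieldX } from '@shieldx/core'

// Keep learning data in process memory; nothing persists across restarts
const shield = new ShieldX({
  learning: { storageBackend: 'memory' }
})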
```

## Benchmarks

Run with `npm run benchmark`:

```
Total Samples:    324
Attack Samples:   283
Benign Samples:   41

True Positive Rate (TPR):  32.9%  (rule-engine only, no ML)
False Positive Rate (FPR):  2.4%
Latency avg:               0.06ms
Latency p99:               0.33ms
```

*TPR increases significantly when embedding (L4) and behavioral (L6) scanners are enabled with Ollama.*
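
The two rates can be reproduced from raw counts. The sketch below assumes 93 of the 283 attack samples were detected and 1 of the 41 benign samples was flagged; those counts are inferred from the reported percentages, not taken from the benchmark output.

```typescript
interface ConfusionCounts {
  truePositives: number  // attacks correctly flagged
  falseNegatives: number // attacks missed
  falsePositives: number // benign inputs wrongly flagged
  trueNegatives: number  // benign inputs correctly passed
}

// TPR = TP / (TP + FN), FPR = FP / (FP + TN)
function rates(c: ConfusionCounts): { tpr: number; fpr: number } {
  const tpr = c.truePositives / (c.truePositives + c.falseNegatives)
  const fpr = c.falsePositives / (c.falsePositives + c.trueNegatives)
  return { tpr, fpr }
}

// Inferred counts (assumption): 93 + 190 = 283 attacks, 1 + 40 = 41 benign
const { tpr, fpr } = rates({
  truePositives: 93,
  falseNegatives: 190,
  falsePositives: 1,
  trueNegatives: 40,
})

console.log((tpr * 100).toFixed(1)) // 32.9
console.log((fpr * 100).toFixed(1)) // 2.4
```

With only 41 benign samples, a single false positive moves the FPR by about 2.4 percentage points, so the figure is coarse.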

## Performance Targets

| Metric | Target | Achieved |
|--------|--------|----------|
| L1 Rule Engine | <2ms | 0.06ms |
| Full pipeline (no ML) | <50ms | <2ms |
| Embedding scan | <200ms | Depends on Ollama |
| False Positive Rate | <5% | 2.4% |

## Project Structure

```
shieldx/
  src/
    core/           # ShieldX orchestrator, config, logger
    types/          # TypeScript type definitions
    detection/      # L1-L5 scanners + rules
    preprocessing/  # L0 Unicode, tokenizer, compression
    sanitization/   # L8 input/output cleaning, PPA
    behavioral/     # L6 conversation, intent, context
    mcp-guard/      # L7 tool validation, privilege check
    validation/     # Canary tokens, output validation
    healing/        # Self-healing strategies per phase
    learning/       # Pattern store, drift, active learning
    compliance/     # MITRE ATLAS, OWASP, EU AI Act
    integrations/   # Next.js, Ollama, n8n wrappers
  tests/
    unit/           # 294 unit tests
    attack-corpus/  # 500+ attack samples
  dashboard/        # @shieldx/dashboard React components
  app/              # Standalone Next.js dashboard
  scripts/          # Seed, benchmark, self-test, deploy
```

## Tech Stack

- **TypeScript** strict mode, zero `any`
- **Node.js 20+**
- **PostgreSQL 17** + pgvector for persistent learning
- **Ollama** for local embeddings (nomic-embed-text) and guard model
- **Vitest** for testing
- **tsup** for building
- **Next.js 15** for dashboard

## License

Apache 2.0 - See [LICENSE](LICENSE)

## Context X

ShieldX is a [Context X](https://context-x.org) Open Source project.

*More Engineering, Less Bullshit.*