shieldx/docs/threat-model.md
Rene Fichtmueller 1c4c034483 feat: ShieldX v0.3.0 — UnicodeScanner (L5), DNS Covert Channel rules, ATLAS v5.4 mappings
- Layer 4 EntropyScanner: Shannon entropy, Base32/Base64 detection, CVE-2025-55284
  ping/nslookup exfil, EchoLeak markdown pattern, DNS tunneling (iodine/dnscat)
- Layer 5 UnicodeScanner: ASCII Smuggling (U+E0000 Tags Block), Variant Selectors,
  Zero-Width steganography, CamoLeak image-ordering (CVE-2025-53773), homoglyphs,
  BiDi override, high-entropy URL params
- 30 DNS covert channel rules (dns-001 to dns-030)
- ATLASMapper: 29 techniques (ATLAS v5.4.0 Feb 2026), added AML.T0062 (Agent Tool
  Invocation), AML.TA0015 (C2 tactic), memory poisoning, multi-agent trust,
  CamoLeak, Unicode steganography mappings
- Rule count: 72 → 102
- Build: tsup 316ms, zero TypeScript errors
2026-03-31 16:32:16 +02:00

11 KiB

Threat Model

Overview

This document maps the threat landscape for LLM-integrated applications to the MITRE ATLAS (Adversarial Threat Landscape for Artificial Intelligence Systems) framework and shows where ShieldX provides coverage.

MITRE ATLAS Technique Coverage

Reconnaissance (ATLAS Tactic: TA0001)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Discover ML Model Ontology AML.T0001 Covered L9: Leakage Detector, Canary Manager
Discover ML Model Family AML.T0002 Covered L1: Rule Engine (model probing patterns)
Discover ML Capabilities AML.T0014 Covered L6: Session Profiler (probing behavior detection)
Search for Victim's Publicly Available ML Artifacts AML.T0003 Out of scope N/A (external to the application)

Resource Development (ATLAS Tactic: TA0002)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Acquire Public ML Artifacts AML.T0004 Out of scope N/A (attacker preparation, external)
Develop Adversarial ML Attacks AML.T0005 Proactive defense Red Team Engine generates variants
Publish Poisoned Datasets AML.T0019 Covered L9: RAG Shield (document integrity scoring)

Initial Access (ATLAS Tactic: TA0003)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Prompt Injection (Direct) AML.T0051 Covered L0: Preprocessing, L1: Rule Engine, L3: Embedding, L4: Entropy
Prompt Injection (Indirect) AML.T0051.001 Covered Indirect Scanner, L7: Tool Poison Detector, L9: RAG Shield
Phishing / Social Engineering AML.T0052 Partial L6: Intent Monitor (detects manipulation patterns)
Supply Chain Compromise of ML Model AML.T0010 Covered Supply Chain Verifier, Model Provenance Checker

ML Model Access (ATLAS Tactic: TA0004)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Inference API Access AML.T0040 Out of scope N/A (access control, external to ShieldX)
Full ML Model Access AML.T0041 Out of scope N/A (access control, external to ShieldX)
ML Artifact Collection AML.T0035 Covered L9: Leakage Detector (detects model weight/config extraction)

Execution (ATLAS Tactic: TA0005)

ATLAS Technique ID ShieldX Coverage Detecting Layer
LLM Prompt Injection AML.T0051 Covered Full pipeline (L0-L9)
Arbitrary Code via ML Model AML.T0053 Covered L7: MCP Guard (tool call validation)
User Execution of Malicious Content AML.T0054 Covered L8: Output Sanitizer, L9: Output Validator

Persistence (ATLAS Tactic: TA0006)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Poisoning of Training Data AML.T0020 Out of scope N/A (training pipeline, external)
Backdoor ML Model AML.T0018 Covered Supply Chain Verifier, Model Provenance Checker
Modify ML Model Configuration AML.T0024 Covered L6: Memory Integrity Guard, Context Integrity
Modify ML Pipeline Component AML.T0023 Partial L7: Manifest Verifier (MCP server manifests)

Privilege Escalation (ATLAS Tactic: TA0007)

ATLAS Technique ID ShieldX Coverage Detecting Layer
LLM Jailbreak AML.T0054 Covered L1: Rule Engine, L2: Sentinel, L6: Intent Monitor
System Prompt Override AML.T0055 Covered L1: Rule patterns, L6: Context Integrity, L9: Role Integrity Checker

Defense Evasion (ATLAS Tactic: TA0008)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Adversarial Example in Inference AML.T0043 Covered L3: Embedding Anomaly, L4: Entropy, L5: Attention
Evade ML Model AML.T0015 Covered Red Team Engine (proactive gap discovery), L3: Embedding
Input Obfuscation AML.T0016 Covered L0: Unicode Normalizer, Tokenizer Normalizer, Compressed Payload Detector
Encoding-Based Evasion AML.T0058 Covered L0: Compressed Payload Detector, L4: Entropy Scanner

Credential Access (ATLAS Tactic: TA0009)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Extract Credentials via LLM AML.T0056 Covered L8: Credential Redactor, L9: Leakage Detector
Extract API Keys via Tool Calls AML.T0057 Covered L7: Tool Chain Guard, L8: Credential Redactor

Discovery (ATLAS Tactic: TA0010)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Discover ML Model Output AML.T0044 Covered L6: Session Profiler (probing detection)
Extract System Prompt AML.T0059 Covered L9: Canary Manager, Leakage Detector
Enumerate Available Tools AML.T0060 Covered L7: Privilege Checker (denies unauthorized tool listing)

Lateral Movement (ATLAS Tactic: TA0011)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Cross-Agent Injection AML.T0061 Covered L7: Tool Chain Guard, Tool Poison Detector
Exploit MCP Tool Chain AML.T0062 Covered L7: Full MCP Guard suite
Data Store Poisoning AML.T0063 Covered L9: RAG Shield (document integrity)

Collection (ATLAS Tactic: TA0012)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Exfiltrate Training Data AML.T0025 Out of scope N/A (training pipeline)
Exfiltrate ML Model AML.T0026 Covered L9: Leakage Detector
Harvest Credentials from Output AML.T0064 Covered L8: Credential Redactor

Exfiltration (ATLAS Tactic: TA0013)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Data Exfiltration via LLM Output AML.T0065 Covered L8: Output Sanitizer, Credential Redactor
Data Exfiltration via Tool Calls AML.T0066 Covered L7: Tool Chain Guard, Resource Governor
Side-Channel Exfiltration AML.T0067 Partial L4: Entropy (detects encoded data in output)

Impact (ATLAS Tactic: TA0014)

ATLAS Technique ID ShieldX Coverage Detecting Layer
Denial of ML Service AML.T0029 Covered L7: Resource Governor (budget enforcement)
ML Model Integrity Violation AML.T0028 Covered L6: Context Integrity, Memory Integrity Guard
Harm to Downstream Task AML.T0048 Covered L9: Scope Validator, Output Validator

OWASP LLM Top 10 (2025) Coverage

# Risk OWASP ID ShieldX Coverage Primary Layers
1 Prompt Injection LLM01 Full coverage L0-L5, L8 (input), L9 (output)
2 Insecure Output Handling LLM02 Full coverage L8: Output Sanitizer, Credential Redactor
3 Training Data Poisoning LLM03 Partial (RAG documents only) L9: RAG Shield
4 Model Denial of Service LLM04 Covered L7: Resource Governor
5 Supply Chain Vulnerabilities LLM05 Covered Supply Chain Verifier, MCP Manifest Verifier
6 Sensitive Information Disclosure LLM06 Full coverage L8: Credential Redactor, L9: Leakage Detector, Canary Manager
7 Insecure Plugin Design LLM07 Full coverage L7: Full MCP Guard suite
8 Excessive Agency LLM08 Full coverage L7: Privilege Checker, Resource Governor, Tool Chain Guard
9 Overreliance LLM09 Partial L9: Output Validator (factual scope checking)
10 Model Theft LLM10 Out of scope N/A (infrastructure security)

Coverage Summary

By ATLAS Tactic

Tactic Total Techniques Covered Partial Out of Scope
Reconnaissance 4 3 0 1
Resource Development 3 1 0 2
Initial Access 4 3 1 0
ML Model Access 3 1 0 2
Execution 3 3 0 0
Persistence 4 2 1 1
Privilege Escalation 2 2 0 0
Defense Evasion 4 4 0 0
Credential Access 2 2 0 0
Discovery 3 3 0 0
Lateral Movement 3 3 0 0
Collection 3 2 0 1
Exfiltration 3 2 1 0
Impact 3 3 0 0
Total 44 34 3 7

Coverage rate: 77% full + 7% partial = 84% total

By OWASP LLM Top 10

Coverage Level Count Risks
Full coverage 6 LLM01, LLM02, LLM06, LLM07, LLM08, LLM04
Partial coverage 3 LLM03, LLM05, LLM09
Out of scope 1 LLM10

Coverage rate: 60% full + 30% partial = 90% total


Out of Scope

ShieldX is a runtime defense library. The following are explicitly out of scope:

Area Reason Recommended Solution
Model training pipeline security ShieldX operates at inference time ML pipeline security tools (e.g., TensorFlow Model Analysis)
Infrastructure access control ShieldX is an application-layer library IAM, RBAC, network security
Model theft prevention Requires infrastructure-level controls API rate limiting, model encryption, access logging
Physical security Out of software scope Physical security measures
Social engineering (non-prompt) Human factor, outside LLM context Security awareness training

Threat Actor Profiles

Casual Attacker

  • Sophistication: Low
  • Typical techniques: Copy-paste jailbreaks, known DAN prompts, simple role override
  • Kill chain progression: Usually stops at initial access or privilege escalation
  • ShieldX detection rate: >95% (L1 rule engine catches most known patterns)

Skilled Researcher

  • Sophistication: Medium
  • Typical techniques: Novel prompt construction, encoding tricks, multi-turn escalation, attention manipulation
  • Kill chain progression: May reach reconnaissance or persistence
  • ShieldX detection rate: >85% (L3 embedding + L6 behavioral catches paraphrased variants)

Advanced Persistent Threat

  • Sophistication: High
  • Typical techniques: Custom adversarial examples, supply chain poisoning, indirect injection via trusted documents, tool chain exploitation
  • Kill chain progression: Full chain from initial access to actions on objective
  • ShieldX detection rate: >70% (multi-layer defense with red team-evolved patterns)
  • Improvement path: Red Team Engine continuously generates adversarial variants; federated sync shares patterns across deployments

Automated Attack Tools

  • Sophistication: Variable (tool-dependent)
  • Typical techniques: Brute-force prompt mutation, automated jailbreak testing, fuzzing
  • Kill chain progression: Typically initial access with high volume
  • ShieldX detection rate: >90% (volume-based anomaly detection + rate limiting via Resource Governor)