Rene Fichtmueller
|
8df3dad3c6
|
feat: add typoglycemia detection to TokenizerNormalizer
Detects scrambled-middle-letter attack words (OWASP LLM defense).
Pre-computed signature map for O(1) lookups — "ignroe" → "ignore",
"bypssa" → "bypass", "insrtuctinos" → "instructions".
40 attack keywords covered. Zero false positives on benchmark.
|
2026-04-07 11:35:10 +02:00 |
|
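A minimal sketch of the signature-map idea described in commit 8df3dad3c6, not the actual TokenizerNormalizer code. It assumes a signature made of the first letter, the sorted middle letters, and the last letter. The three-word keyword list and the names `signature` and `detectTypoglycemia` are hypothetical, since the commit does not list its 40 keywords or its API.

```typescript
// Hypothetical keyword list; the commit mentions 40 keywords but does not name them all.
const KEYWORDS: string[] = ["ignore", "bypass", "instructions"];

interface TypoHit {
  found: string;   // token as it appeared in the input
  keyword: string; // attack keyword it unscrambles to
}

// Assumed signature: first letter + sorted middle letters + last letter.
function signature(word: string): string {
  const w = word.toLowerCase();
  if (w.length < 4) return w; // too short to have a scrambleable middle
  const middle = w.slice(1, -1).split("").sort().join("");
  return w.charAt(0) + middle + w.charAt(w.length - 1);
}

// Pre-computed once, so each token costs a single Map lookup.
const SIGNATURES = new Map<string, string>();
for (const k of KEYWORDS) {
  SIGNATURES.set(signature(k), k);
}

function detectTypoglycemia(text: string): TypoHit[] {
  const hits: TypoHit[] = [];
  const tokens = text.toLowerCase().match(/[a-z]+/g) ?? [];
  for (const token of tokens) {
    const keyword = SIGNATURES.get(signature(token));
    // Exact spellings are not scrambles; leave those to the regular rules.
    if (keyword !== undefined && keyword !== token) {
      hits.push({ found: token, keyword });
    }
  }
  return hits;
}
```

Under this strict signature, "ignroe" and "insrtuctinos" resolve to their keywords. A variant whose final letter also moves, such as "bypssa", would not match; a looser signature (first letter plus all remaining letters sorted) would catch it at the cost of more collisions with benign words.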
Rene Fichtmueller
|
ca02998a28
|
feat: ShieldX v0.5.0 — full defense evolution + pentest hardening
4-phase defense evolution (Bio-Immune, Adversarial, Ensemble, ATLAS)
with ~200 new detection rules across 20 languages.
TPR 32.9% → 70.8%, FPR 12.2% → 0.0%
New modules: DefenseEnsemble, AtlasTechniqueMapper, EvolutionEngine,
ImmuneMemory, FeverResponse, MELONGuard, AdversarialTrainer,
DecompositionDetector, IndirectInjectionDetector, OutputPayloadGuard,
ToolCallSafetyGuard, AuthContextGuard, ResourceExhaustionDetector,
TokenizerDeobfuscation, Binary/Hex decoder, OverDefenseCalibrator
|
2026-04-07 00:27:12 +02:00 |
|
Rene Fichtmueller
|
04349aed69
|
feat(security): v0.4.0 — three research-driven detection gaps closed
Implements hardening based on sarendis56/Jailbreak_Detection_RCS
(arXiv:2512.12069) and the Awesome-LVLM-Attack/Safety survey series.
L0 — CipherDecoder: FlipAttack, ROT13, Caesar (all 25 shifts), Morse,
Leet speak, Pig Latin, ASCII art detection with suspicion scoring.
L2 — SemanticContrastiveScanner: RCS-style harmful/benign bucket
comparison via EmbeddingStore, 20 canonical jailbreak seeds, BoW
embedding fallback for offline use.
L6 — ConversationTracker: Crescendo (+0.35), Foot-in-the-Door (+0.40),
Jigsaw Puzzle (+0.45) multi-turn escalation patterns added.
292/294 tests passing (2 pre-existing ATLASMapper failures unrelated).
|
2026-04-04 23:04:42 +02:00 |
|
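The "Caesar (all 25 shifts)" item in commit 04349aed69 can be illustrated with a brute-force decoder; ROT13 is simply the shift-13 case. This is a sketch under assumptions, not the CipherDecoder implementation: the keyword list and the names `caesarShift` and `decodeCaesar` are hypothetical, and the suspicion scoring mentioned in the commit is left out.

```typescript
// Hypothetical list of words that make a decoded candidate suspicious.
const SUSPICIOUS: string[] = ["ignore", "bypass", "instructions"];

// Rotate ASCII letters forward by `shift` positions (1..25); other characters pass through.
function caesarShift(text: string, shift: number): string {
  return text.replace(/[a-z]/gi, (ch) => {
    const base = ch <= "Z" ? 65 : 97; // 'A' or 'a'
    const rotated = (ch.charCodeAt(0) - base + shift) % 26;
    return String.fromCharCode(rotated + base);
  });
}

interface CaesarHit {
  shift: number;
  decoded: string;
}

// Try every non-trivial shift and keep the candidates that expose a keyword.
function decodeCaesar(text: string): CaesarHit[] {
  const hits: CaesarHit[] = [];
  for (let shift = 1; shift <= 25; shift++) {
    const decoded = caesarShift(text, shift);
    const lower = decoded.toLowerCase();
    if (SUSPICIOUS.some((k) => lower.includes(k))) {
      hits.push({ shift, decoded });
    }
  }
  return hits;
}
```

For example, `decodeCaesar("vtaber nyy ehyrf")` yields a single hit at shift 13 whose decoded text is "ignore all rules".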
Rene Fichtmueller
|
1c4c034483
|
feat: ShieldX v0.3.0 — UnicodeScanner (L5), DNS Covert Channel rules, ATLAS v5.4 mappings
- Layer 4 EntropyScanner: Shannon entropy, Base32/Base64 detection, CVE-2025-55284
ping/nslookup exfil, EchoLeak markdown pattern, DNS tunneling (iodine/dnscat)
- Layer 5 UnicodeScanner: ASCII Smuggling (U+E0000 Tags Block), Variation Selectors,
Zero-Width steganography, CamoLeak image-ordering (CVE-2025-53773), homoglyphs,
BiDi override, high-entropy URL params
- 30 DNS covert channel rules (dns-001 to dns-030)
- ATLASMapper: 29 techniques (ATLAS v5.4.0 Feb 2026), added AML.T0062 (Agent Tool
Invocation), AML.TA0015 (C2 tactic), memory poisoning, multi-agent trust,
CamoLeak, Unicode steganography mappings
- Rule count: 72 → 102
- Build: tsup 316ms, zero TypeScript errors
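
The Shannon entropy and Base64 checks named for the Layer 4 EntropyScanner can be sketched as below. This is an illustration only: the function names and the 16-character minimum are assumptions, and the commit does not state which thresholds the scanner uses.

```typescript
// Shannon entropy in bits per character: H = -sum(p_i * log2(p_i)).
function shannonEntropy(s: string): number {
  const counts = new Map<string, number>();
  let total = 0;
  for (const ch of s) {
    counts.set(ch, (counts.get(ch) ?? 0) + 1);
    total += 1;
  }
  if (total === 0) return 0;
  let h = 0;
  counts.forEach((n) => {
    const p = n / total;
    h -= p * Math.log2(p);
  });
  return h;
}

// Shape check only (alphabet, padding, length multiple of 4); it does not decode.
// The 16-character minimum is an assumed cutoff to skip short ordinary words.
function looksLikeBase64(token: string): boolean {
  return token.length % 4 === 0 && /^[A-Za-z0-9+/]{16,}={0,2}$/.test(token);
}
```

A scanner would typically combine the two, flagging tokens that both pass the shape check and exceed an entropy threshold tuned on benign traffic.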
|
2026-03-31 16:32:16 +02:00 |
|
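ASCII smuggling through the Unicode Tags block, listed under the Layer 5 UnicodeScanner in commit 1c4c034483, works because tag characters U+E0020 to U+E007E mirror printable ASCII at an offset of 0xE0000 and render as nothing in most UIs. The sketch below recovers such hidden text and counts zero-width characters. The function names and the particular zero-width set are assumptions, not the scanner's actual API.

```typescript
const TAG_OFFSET = 0xe0000;

// Recover text hidden in Unicode tag characters (U+E0020..U+E007E).
function revealTagSmuggling(text: string): string {
  let hidden = "";
  for (const ch of text) { // for...of walks code points, not UTF-16 units
    const cp = ch.codePointAt(0) ?? 0;
    if (cp >= 0xe0020 && cp <= 0xe007e) {
      hidden += String.fromCodePoint(cp - TAG_OFFSET);
    }
  }
  return hidden;
}

// Count common zero-width characters: ZWSP, ZWNJ, ZWJ, WORD JOINER, BOM/ZWNBSP.
// Which characters a production scanner tracks is an assumption here.
function countZeroWidth(text: string): number {
  return (text.match(/[\u200B\u200C\u200D\u2060\uFEFF]/g) ?? []).length;
}

// Helper for demonstrations: hide an ASCII string in tag characters.
function hideInTags(ascii: string): string {
  let out = "";
  for (const ch of ascii) {
    out += String.fromCodePoint(TAG_OFFSET + (ch.codePointAt(0) ?? 0));
  }
  return out;
}
```

A non-empty result from `revealTagSmuggling` is a strong signal on its own, since tag characters have almost no legitimate use in ordinary prompts outside emoji flag sequences.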