Rene Fichtmueller 04349aed69 feat(security): v0.4.0 — three research-driven detection gaps closed
Implements hardening based on sarendis56/Jailbreak_Detection_RCS
(arXiv:2512.12069) and the Awesome-LVLM-Attack/Safety survey series.

L0 — CipherDecoder: FlipAttack, ROT13, Caesar (all 25 shifts), Morse,
Leet speak, Pig Latin, ASCII art detection with suspicion scoring.

L2 — SemanticContrastiveScanner: RCS-style harmful/benign bucket
comparison via EmbeddingStore, 20 canonical jailbreak seeds, BoW
embedding fallback for offline use.

L6 — ConversationTracker: Crescendo (+0.35), Foot-in-the-Door (+0.40),
Jigsaw Puzzle (+0.45) multi-turn escalation patterns added.

292/294 tests passing (2 pre-existing ATLASMapper failures unrelated).
2026-04-04 23:04:42 +02:00
..