Rene Fichtmueller
|
2f6dea9959
|
docs: update changelog with multilingual expansion stats
|
2026-04-07 01:09:54 +02:00 |
|
Rene Fichtmueller
|
9520820364
|
feat: expand multilingual detection to 211 rules across 50+ languages
- TPR improved from 70.8% to 91.9% (324-sample benchmark)
- Multilingual attack TPR: 96.6% (29 samples)
- Deep South Asian coverage: Bengali (9), Hindi (8), Urdu (6), Tamil (4),
Telugu (3), Marathi (4), Gujarati (3), Kannada (2), Malayalam (2),
Punjabi (2), Sinhala (2), Nepali (4), Pan-Indic transliterated (7)
- New languages: Persian, Hebrew, Kurdish, Indonesian, Filipino, Burmese,
Khmer, Lao, Finnish, Czech, Slovak, Romanian, Hungarian, Greek, Bulgarian,
Croatian, Serbian, Georgian, Armenian, Azerbaijani, Swahili, Amharic,
Afrikaans, Mongolian, and 20+ more
- Universal patterns: rapid script switching, global DAN mode, cross-script
password extraction, no-filter patterns
- README updated with new benchmark results and language coverage tables
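
The "rapid script switching" pattern named above could be approximated by counting how often adjacent letters belong to different Unicode scripts. The sketch below is an assumption about how such a check might look, not the ShieldX rule itself; `SCRIPTS`, `scriptOf`, and `countScriptSwitches` are hypothetical names, and the script list is deliberately short.

```typescript
// Hypothetical sketch: not the ShieldX rule implementation.
// Each entry pairs a script label with a Unicode property-escape matcher.
const SCRIPTS: ReadonlyArray<[string, RegExp]> = [
  ["Latin", /\p{Script=Latin}/u],
  ["Cyrillic", /\p{Script=Cyrillic}/u],
  ["Arabic", /\p{Script=Arabic}/u],
  ["Devanagari", /\p{Script=Devanagari}/u],
  ["Bengali", /\p{Script=Bengali}/u],
  ["Han", /\p{Script=Han}/u],
];

// Returns the script label for one character, or null for digits,
// punctuation, whitespace, and scripts not listed above.
function scriptOf(ch: string): string | null {
  for (const [name, re] of SCRIPTS) {
    if (re.test(ch)) return name;
  }
  return null;
}

// Counts how often consecutive letters belong to different scripts.
// Characters with no known script are skipped and do not reset the state.
function countScriptSwitches(text: string): number {
  let switches = 0;
  let previous: string | null = null;
  for (const ch of text) {
    const current = scriptOf(ch);
    if (current === null) continue;
    if (previous !== null && current !== previous) switches++;
    previous = current;
  }
  return switches;
}
```

A caller would compare the count against a threshold relative to input length. A single Cyrillic letter embedded in a Latin word already produces two switches, so a homoglyph-style substitution shows up even in short inputs.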
|
2026-04-07 01:08:09 +02:00 |
|
Rene Fichtmueller
|
7da39fd7d5
|
docs: fix clone URL for public repo
|
2026-04-07 00:36:46 +02:00 |
|
Rene Fichtmueller
|
2e7e11fbce
|
docs: comprehensive v0.5.0 README with full feature documentation
- Architecture diagram updated with all new modules (ensemble, ATLAS, evolution, immune memory)
- Benchmark results section (70.8% TPR, 0.0% FPR)
- Defense modules overview table with line counts
- 369+ detection rules across 12 categories documented
- Bio-immune self-evolution (6 mechanisms) fully explained
- Preprocessing pipeline: CipherDecoder, TokenizerNormalizer, Unicode
- MITRE ATLAS mapping (90 techniques, 8 tactics) with API examples
- MCP Guard with MELON, tool chain, resource governor details
- Decomposition attack detection documentation
- Supply chain integrity section
- Multilingual detection (20+ languages) with examples
- RAG Shield documentation
- Output validation and OutputPayloadGuard docs
- Compliance section (MITRE ATLAS, OWASP LLM Top 10, EU AI Act)
- Full project structure tree
- Updated feature comparison table (30 features vs competitors)
- Updated performance targets with new modules
- Bio-immune API examples (evolution, adversarial training, calibration)
- README grew from 604 to 1265 lines (over 2x the content)
|
2026-04-07 00:36:20 +02:00 |
|
Rene Fichtmueller
|
ca02998a28
|
feat: ShieldX v0.5.0 — full defense evolution + pentest hardening
4-phase defense evolution (Bio-Immune, Adversarial, Ensemble, ATLAS)
with ~200 new detection rules across 20 languages.
TPR 32.9% → 70.8%, FPR 12.2% → 0.0%
New modules: DefenseEnsemble, AtlasTechniqueMapper, EvolutionEngine,
ImmuneMemory, FeverResponse, MELONGuard, AdversarialTrainer,
DecompositionDetector, IndirectInjectionDetector, OutputPayloadGuard,
ToolCallSafetyGuard, AuthContextGuard, ResourceExhaustionDetector,
TokenizerDeobfuscation, Binary/Hex decoder, OverDefenseCalibrator
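
For reference, TPR and FPR as quoted above are ratios over a labelled benchmark: TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The sketch below uses the standard definitions; the `ConfusionCounts` type and the function names are hypothetical, and the benchmark's actual per-class counts are not given in this log.

```typescript
// Standard confusion-matrix counts for a binary attack/benign benchmark.
interface ConfusionCounts {
  truePositives: number;  // attacks correctly flagged
  falseNegatives: number; // attacks missed
  falsePositives: number; // benign inputs wrongly flagged
  trueNegatives: number;  // benign inputs correctly passed
}

// TPR = TP / (TP + FN); returns 0 when the benchmark has no attack samples.
function truePositiveRate(c: ConfusionCounts): number {
  const positives = c.truePositives + c.falseNegatives;
  return positives === 0 ? 0 : c.truePositives / positives;
}

// FPR = FP / (FP + TN); returns 0 when the benchmark has no benign samples.
function falsePositiveRate(c: ConfusionCounts): number {
  const negatives = c.falsePositives + c.trueNegatives;
  return negatives === 0 ? 0 : c.falsePositives / negatives;
}
```

An FPR of 0.0% therefore means no benign sample in the benchmark was flagged; it says nothing about inputs outside the benchmark.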
|
2026-04-07 00:27:12 +02:00 |
|
Rene Fichtmueller
|
09eefac095
|
fix: remove local file paths from research reference
|
2026-04-04 23:07:35 +02:00 |
|
Rene Fichtmueller
|
04349aed69
|
feat(security): v0.4.0 — three research-driven detection gaps closed
Implements hardening based on sarendis56/Jailbreak_Detection_RCS
(arXiv:2512.12069) and the Awesome-LVLM-Attack/Safety survey series.
L0 — CipherDecoder: FlipAttack, ROT13, Caesar (all 25 shifts), Morse,
Leet speak, Pig Latin, ASCII art detection with suspicion scoring.
L2 — SemanticContrastiveScanner: RCS-style harmful/benign bucket
comparison via EmbeddingStore, 20 canonical jailbreak seeds, BoW
embedding fallback for offline use.
L6 — ConversationTracker: Crescendo (+0.35), Foot-in-the-Door (+0.40),
Jigsaw Puzzle (+0.45) multi-turn escalation patterns added.
292/294 tests passing (the 2 failures are pre-existing ATLASMapper tests, unrelated to this change).
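
The "Caesar (all 25 shifts)" step can be pictured as a brute-force decode followed by a keyword check on each candidate. This is a minimal sketch under stated assumptions, not the CipherDecoder source: `caesarShift`, `bruteForceCaesar`, and the `SUSPICIOUS_KEYWORDS` list are hypothetical.

```typescript
// Hypothetical keyword list; the real scoring logic is not shown in this log.
const SUSPICIOUS_KEYWORDS = ["ignore", "instructions", "system prompt"];

// Shifts every ASCII letter forward by `shift` positions, preserving case.
// Non-letters pass through unchanged. ROT13 is the special case shift = 13.
function caesarShift(text: string, shift: number): string {
  return text.replace(/[a-z]/gi, (ch) => {
    const base = ch <= "Z" ? 65 : 97; // 'A' or 'a'
    return String.fromCharCode(((ch.charCodeAt(0) - base + shift) % 26) + base);
  });
}

interface CaesarHit {
  shift: number;
  decoded: string;
}

// Tries all 25 non-trivial shifts and keeps the candidates that contain
// at least one suspicious keyword.
function bruteForceCaesar(text: string): CaesarHit[] {
  const hits: CaesarHit[] = [];
  for (let shift = 1; shift <= 25; shift++) {
    const decoded = caesarShift(text, shift);
    const lower = decoded.toLowerCase();
    if (SUSPICIOUS_KEYWORDS.some((k) => lower.includes(k))) {
      hits.push({ shift, decoded });
    }
  }
  return hits;
}
```

Text encoded with shift 3 is recovered at shift 23, since the two shifts sum to 26.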
|
2026-04-04 23:04:42 +02:00 |
|
Rene Fichtmueller
|
a456546aa8
|
feat(rules): mcp-007..010 — Claude Code source map leak countermeasures
Rules based on 2026-03-31 npm source map disclosure:
- mcp-007: Coordinator Mode / KAIROS / ULTRAPLAN invocation attempts
- mcp-008: Multi-agent spawn manipulation via known spawning mechanism
- mcp-009: Persistent memory file targeting (CLAUDE.md / .claude/ injection)
- mcp-010: Tool enumeration probe (reconnaissance of available tools)
Source: github.com/Kuberwastaken/claude-code, @anthropic-ai/claude-code
MITRE ATLAS: AML.T0062, AML.T0051, AML.TA0015 (C2)
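
As an illustration of what a rule like mcp-009 might check, the sketch below flags write-style verbs aimed at `CLAUDE.md` or a `.claude/` path. The rule shape, the `mcp-009-sketch` id, and the regular expression are all assumptions made for this example; the actual rule schema and patterns are not shown in this log.

```typescript
// Hypothetical rule shape; the real ShieldX rule schema is not shown here.
interface SketchRule {
  id: string;
  description: string;
  pattern: RegExp;
}

// Matches a write-style verb followed, within 80 characters on the same
// line, by CLAUDE.md or a .claude/ path. No global flag, so test() is stateless.
const memoryFileRule: SketchRule = {
  id: "mcp-009-sketch",
  description: "Write-style verbs aimed at CLAUDE.md or a .claude/ path",
  pattern:
    /\b(?:write|append|add|insert|modify|edit|overwrite)\b[^\n]{0,80}(?:\bCLAUDE\.md\b|\.claude\/)/i,
};

function matchesRule(rule: SketchRule, input: string): boolean {
  return rule.pattern.test(input);
}
```

Requiring a write-style verb keeps a plain mention of the file from matching, which is one way to limit false positives on benign questions.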
|
2026-03-31 16:54:29 +02:00 |
|
Rene Fichtmueller
|
915b6ab285
|
feat(scripts): daily arXiv + HackerNews security monitor
Autonomous monitoring script for Erik VPS:
- Fetches arXiv cs.CR + cs.AI RSS feeds daily
- Fetches HackerNews top stories + keyword RSS feeds
- Classifies relevance via Claude Haiku API (HIGH/MEDIUM/LOW/SKIP)
- HIGH findings: generates TypeScript detection rules via Claude
- Appends rules to src/detection/AutoGeneratedRules.ts
- Runs tsc --noEmit before committing (zero errors required)
- Commits + pushes to Gitea on success
- JSON report saved to /opt/scripts/logs/shieldx-report-YYYY-MM-DD.json
- Cron: 0 6 * * * (06:00 UTC, which is 08:00 in Berlin during CEST)
- deploy-monitor-erik.sh: one-command deploy to Erik
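
The Berlin wall-clock time of the 06:00 UTC cron run depends on daylight saving: 08:00 under CEST, 07:00 under CET. The sketch below checks this with the built-in `Intl.DateTimeFormat`; it assumes the cron daemon evaluates the schedule in UTC, and `berlinTime` is a hypothetical helper.

```typescript
// Formats a UTC instant as HH:MM wall-clock time in Europe/Berlin.
function berlinTime(utc: Date): string {
  return new Intl.DateTimeFormat("en-GB", {
    timeZone: "Europe/Berlin",
    hour: "2-digit",
    minute: "2-digit",
    hourCycle: "h23",
  }).format(utc);
}

// The commit date falls in CEST (UTC+2); mid-January falls in CET (UTC+1).
const summerRun = berlinTime(new Date("2026-03-31T06:00:00Z"));
const winterRun = berlinTime(new Date("2026-01-15T06:00:00Z"));
console.log(summerRun, winterRun);
```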
|
2026-03-31 16:52:09 +02:00 |
|
Rene Fichtmueller
|
1c4c034483
|
feat: ShieldX v0.3.0 — UnicodeScanner (L5), DNS Covert Channel rules, ATLAS v5.4 mappings
- Layer 4 EntropyScanner: Shannon entropy, Base32/Base64 detection, CVE-2025-55284
ping/nslookup exfil, EchoLeak markdown pattern, DNS tunneling (iodine/dnscat)
- Layer 5 UnicodeScanner: ASCII Smuggling (U+E0000 Tags Block), Variation Selectors,
Zero-Width steganography, CamoLeak image-ordering (CVE-2025-53773), homoglyphs,
BiDi override, high-entropy URL params
- 30 DNS covert channel rules (dns-001 to dns-030)
- ATLASMapper: 29 techniques (ATLAS v5.4.0 Feb 2026), added AML.T0062 (Agent Tool
Invocation), AML.TA0015 (C2 tactic), memory poisoning, multi-agent trust,
CamoLeak, Unicode steganography mappings
- Rule count: 72 → 102
- Build: tsup 316ms, zero TypeScript errors
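
Two of the checks listed above are easy to sketch: Shannon entropy (Layer 4) and ASCII smuggling via the Unicode Tags block (Layer 5). In the Tags block, code points U+E0020 to U+E007E mirror printable ASCII, so subtracting 0xE0000 recovers the hidden text. The function names are hypothetical and this is not the scanner source.

```typescript
// Shannon entropy in bits per symbol, computed over Unicode code points.
// Random-looking payloads such as Base64 score higher than natural language.
function shannonEntropy(text: string): number {
  const symbols = Array.from(text);
  if (symbols.length === 0) return 0;
  const counts = new Map<string, number>();
  for (const s of symbols) {
    counts.set(s, (counts.get(s) ?? 0) + 1);
  }
  let entropy = 0;
  for (const n of counts.values()) {
    const p = n / symbols.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Extracts text hidden in Tags-block characters (U+E0020..U+E007E), which
// are invisible in most renderers. Returns "" when none are present.
function decodeTagCharacters(text: string): string {
  let hidden = "";
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp === undefined) continue;
    if (cp >= 0xe0020 && cp <= 0xe007e) {
      hidden += String.fromCharCode(cp - 0xe0000);
    }
  }
  return hidden;
}

// Helper for the usage example: hides ASCII text in Tags-block characters.
function encodeAsTagCharacters(ascii: string): string {
  return Array.from(ascii)
    .map((c) => String.fromCodePoint(c.charCodeAt(0) + 0xe0000))
    .join("");
}
```

A scanner built along these lines would treat any non-empty result from `decodeTagCharacters` as suspicious on its own. For entropy, a threshold would need tuning against benign samples.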
|
2026-03-31 16:32:16 +02:00 |
|