sync: record magatama runpod attack-paths atlas exposure fixes

Rene Fichtmueller 2026-05-06 12:05:15 +02:00
parent ce37d4155a
commit 364cd392c7
2 changed files with 187 additions and 1 deletions

@@ -1,6 +1,6 @@
# Current TIP Sync State
-Updated: 2026-05-06 10:28 UTC
+Updated: 2026-05-06 12:02 UTC
## Active Policy
@@ -27,6 +27,40 @@ When work touches TIP, Magatama, LLM Gateway, bridges, auth, or shared Erik infr
## Latest Work
- MAGATAMA training + Attack Paths + Atlas exposure were corrected again on 2026-05-06:
- the RunPod serverless training start failure was not a RunPod outage.
- root cause was missing training scripts on Erik (`training_full_refresh.ts` and related helpers were absent under `/opt/magatama/scripts`).
- Codex synced the full local `magatama/scripts/` tree to Erik, added a safe fallback in `scripts/model_registry_build.ts`, and synced the local `training-data/model-registry/` directory.
- verified on Erik:
- `pnpm training:refresh-all` now succeeds.
- fresh dataset totals after dedupe:
- `magatamallm`: `92,742` raw → `17,356` effective (`15,620 train / 1,736 eval`)
- `fo_blogllm`: `32` total (`28 train / 4 eval`)
- `tip_llm`: `40` total (`36 train / 4 eval`)
- important nuance:
- Codex did **not** execute the final Hugging Face publish step from Erik in this chat.
- local/script/build failures are fixed; external dataset publish still depends on the selected dataset source and explicit publish intent.
- MAGATAMA Attack Paths UX is no longer a misleading blank panel:
- the page now distinguishes between:
- no live attack paths
- historical fallback paths
- empty selected scope (`0 assets in scope`)
- when a user narrows the scope to a rack/location with zero scoped assets, the graph explicitly says so instead of looking broken.
- live dashboard HTML on Erik now contains:
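- `Im aktuellen Scope liegen 0 Assets.` ("The current scope contains 0 assets.")
- `Erweitere Standort oder Datacenter / Rack, damit MAGATAMA korrelierbare Assets und Pfade darstellen kann.` ("Widen the location or datacenter / rack so that MAGATAMA can display correlatable assets and paths.")
- `Ohne offene mehrstufige Korrelationen bleibt die Graph-Sicht bewusst leer.` ("Without open multi-step correlations, the graph view intentionally stays empty.")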
- MAGATAMA code/training hardening was extended:
- `scripts/test_runpod_adapter.py` no longer loads tokenizer/model with `trust_remote_code=True`.
- `scripts/ollama_adapter_bridge.py` no longer loads tokenizer/model with `trust_remote_code=True`.
- this removed the live CODE finding around `HuggingFace trust_remote_code` on Erik.
- Atlas exposure logic was tightened to stop reopening noisy LAN management findings:
- generic `atlas-exposure` findings now stay open only when the exposure is critical enough to warrant operational tracking.
- internal RFC1918 management/service ports discovered by the broad atlas scan are no longer promoted into open Guard findings just because they exist on the LAN.
- host-specific posture for Proxmox / Erik / Mac Studio remains the job of explicit host-audit logic.
- after rebuild + deploy + health sync:
- live Postgres open findings returned to `0`.
- MAGATAMA was repaired end-to-end to a clean operational baseline:
- live guard host-audits for Erik, Mac Studio, and Proxmox were corrected and rerun.
- open findings were reduced all the way to `0` in Postgres.

@@ -0,0 +1,152 @@
# 2026-05-06 — MAGATAMA RunPod / Attack Paths / Atlas Exposure Fixes
## Scope
This handoff captures the follow-up fixes made after MAGATAMA had already been cleaned to zero findings earlier in the day. Three practical issues remained:
1. RunPod serverless training start was failing from the MAGATAMA UI.
2. Attack Paths looked empty/broken to the operator.
3. Atlas exposure findings reopened as noisy internal LAN management alerts.
## What Was Actually Broken
### 1. RunPod training did not fail because of RunPod
User-facing message:
- `RunPod nicht erreichbar` ("RunPod not reachable")
Real root cause on Erik:
- `/opt/magatama/package.json` already referenced `training:refresh-all` and `training:refresh-all:publish`
- but `/opt/magatama/scripts/training_full_refresh.ts` and related scripts were missing remotely
Additional follow-up break:
- `scripts/model_registry_build.ts` assumed `training-data/model-registry/external-sources.json` always existed remotely
### 2. Attack Paths page looked dead
The page was not broken, but it was misleading:
- selected system scope in the screenshot had `0 Assets in Scope`
- at the same time there were either:
- no multi-step correlated live paths, or
- no open correlated findings
Before the fix, the empty canvas looked like a defect instead of an honest empty state.
### 3. Atlas exposure reopened 28 Guard findings
Live breakdown before the final policy fix:
- `guard | atlas-exposure | high | 9`
- `guard | atlas-exposure | low | 19`
Examples:
- `Exposure: Open ports on 192.168.178.213`
- `Exposure: Open ports on 192.168.178.2`
- `Exposure: Open ports on 192.168.178.5`
These were not “internet exposed” incidents in the meaningful operational sense; they were generic LAN/internal management ports discovered by Atlas.
## Changes Made
### RunPod training pipeline
Synced to Erik:
- full local `/Users/renefichtmueller/Desktop/Claude Code/magatama/scripts/` tree into `/opt/magatama/scripts/`
- local `training-data/model-registry/` into `/opt/magatama/training-data/model-registry/`
Patched:
- `magatama/scripts/model_registry_build.ts`
Behavior change:
- missing external metadata files now fall back safely instead of crashing the refresh step
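The fallback behavior can be sketched as follows. This is a minimal Python illustration, not the patched code: the real change lives in the TypeScript file `model_registry_build.ts`, and the function name `load_external_sources` and the default value are assumptions made for this sketch.

```python
import copy
import json
from pathlib import Path
from typing import Any

# Hypothetical default used when the metadata file is absent. The actual
# fallback value in model_registry_build.ts is not shown in this handoff.
DEFAULT_EXTERNAL_SOURCES: dict[str, Any] = {"sources": []}


def load_external_sources(path: Path) -> dict[str, Any]:
    """Return parsed metadata, or a fresh copy of the default if the file is missing."""
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError:
        # A missing file is an expected condition on a remote that was
        # never synced, so it must not crash the refresh step.
        return copy.deepcopy(DEFAULT_EXTERNAL_SOURCES)
    # A present but malformed file still raises, so real corruption stays visible.
    return json.loads(text)
```

Only the missing-file case is swallowed. Parse errors are left to propagate on purpose, so the fallback does not hide a damaged file.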
Verified on Erik:
- `pnpm training:refresh-all` now succeeds
Fresh effective dataset totals:
- `magatamallm`: `92,742 raw -> 17,356 effective`
- `fo_blogllm`: `32 total`
- `tip_llm`: `40 total`
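The train/eval figures recorded in the sync-state entry (`15,620 / 1,736`, `28 / 4`, `36 / 4`) are all consistent with a 90/10 split rounded down on the train side. Whether the refresh script computes the split this way is an assumption; the sketch below only checks the arithmetic, and `split_counts` is a hypothetical helper.

```python
def split_counts(effective: int, train_tenths: int = 9) -> tuple[int, int]:
    """Split a row count into (train, eval) using integer floor arithmetic."""
    train = effective * train_tenths // 10
    return train, effective - train


# Effective totals from this handoff, paired with the reported splits.
reported = {
    "magatamallm": (17_356, (15_620, 1_736)),
    "fo_blogllm": (32, (28, 4)),
    "tip_llm": (40, (36, 4)),
}

for name, (effective, expected) in reported.items():
    assert split_counts(effective) == expected, name

# Share of raw magatamallm rows that survive dedupe.
retained = 17_356 / 92_742
print(f"magatamallm retained after dedupe: {retained:.1%}")  # 18.7%
```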
Important note:
- Codex did **not** perform the final external Hugging Face publish step in this chat.
- Local refresh/build path is fixed.
### Attack Paths UI
Patched:
- `magatama/packages/core/src/routes/attack-paths.ts`
- `magatama/packages/dashboard/public/index-v2.html`
Behavior change:
- if no live paths exist, MAGATAMA can still show historical correlated paths when available
- if the user-selected scope contains `0` assets, the graph now says so explicitly
- if there are simply no open multi-step correlations, the page states that explicitly
Live strings now present on Erik:
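- `Im aktuellen Scope liegen 0 Assets.` ("The current scope contains 0 assets.")
- `Erweitere Standort oder Datacenter / Rack, damit MAGATAMA korrelierbare Assets und Pfade darstellen kann.` ("Widen the location or datacenter / rack so that MAGATAMA can display correlatable assets and paths.")
- `Ohne offene mehrstufige Korrelationen bleibt die Graph-Sicht bewusst leer.` ("Without open multi-step correlations, the graph view intentionally stays empty.")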
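The three cases in the behavior change can be read as a small state decision. The sketch below is a Python illustration under assumptions: the real logic is in `attack-paths.ts` and `index-v2.html`, the names `GraphState` and `resolve_graph_state` are hypothetical, and the precedence (empty scope checked first) is a guess that the handoff does not confirm.

```python
from enum import Enum


class GraphState(Enum):
    LIVE_PATHS = "live"
    HISTORICAL_FALLBACK = "historical"
    EMPTY_SCOPE = "empty_scope"
    NO_CORRELATIONS = "no_correlations"


# Messages copied from the live dashboard strings listed in this section.
MESSAGES = {
    GraphState.EMPTY_SCOPE: "Im aktuellen Scope liegen 0 Assets.",
    GraphState.NO_CORRELATIONS: (
        "Ohne offene mehrstufige Korrelationen bleibt die Graph-Sicht bewusst leer."
    ),
}


def resolve_graph_state(
    assets_in_scope: int, live_paths: int, historical_paths: int
) -> GraphState:
    """Pick which state the Attack Paths graph should present."""
    # Assumed precedence: an empty scope explains the blank canvas best,
    # so it is reported before anything else.
    if assets_in_scope == 0:
        return GraphState.EMPTY_SCOPE
    if live_paths > 0:
        return GraphState.LIVE_PATHS
    # No live paths: fall back to historical correlated paths when available.
    if historical_paths > 0:
        return GraphState.HISTORICAL_FALLBACK
    return GraphState.NO_CORRELATIONS
```

The point of the explicit enum is that every blank canvas maps to a named state with its own message, so "nothing to show" can no longer be confused with "failed to render".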
### trust_remote_code hardening
Patched:
- `magatama/scripts/test_runpod_adapter.py`
- `magatama/scripts/ollama_adapter_bridge.py`
Behavior change:
- local adapter/tokenizer/model loading no longer uses `trust_remote_code=True`
Reason:
- this was causing a live MAGATAMA CODE finding on Erik:
- `HuggingFace trust_remote_code`
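One way to keep this finding from coming back is a small static check over the scripts. The sketch below is an assumption-level illustration, not MAGATAMA's actual CODE scanner: `find_trust_remote_code` is a hypothetical helper, and the `AutoTokenizer` call in the sample string is only an example of the pattern being flagged.

```python
import ast


def find_trust_remote_code(source: str) -> list[int]:
    """Return line numbers of calls that pass trust_remote_code=True literally."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if (
                kw.arg == "trust_remote_code"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)


# The source is only parsed, never executed, so no model library is needed.
sample = (
    'tok = AutoTokenizer.from_pretrained("m", trust_remote_code=True)\n'
    'safe = AutoTokenizer.from_pretrained("m")\n'
)
print(find_trust_remote_code(sample))  # [1]
```

This only catches the literal keyword form. A flag passed through a `**kwargs` dict or a variable would need a stricter check.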
### Atlas exposure policy
Patched:
- `magatama/packages/core/src/routes/health-atlas.ts`
Behavior change:
- generic Atlas portscan findings on RFC1918/internal assets are no longer automatically promoted into open Guard findings unless the exposure is critical enough to deserve operational tracking
- host-audit remains the authoritative place for explicit posture on Erik / Proxmox / Mac Studio
This removed the noisy LAN exposure findings without simply faking closure; the policy itself was corrected.
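The corrected policy can be sketched as a promotion check. This is a Python illustration under stated assumptions: the real logic is in the TypeScript route `health-atlas.ts`, the names `is_rfc1918` and `should_promote` are hypothetical, and treating only `critical` severity as "critical enough" is a guess, since the handoff does not state the exact threshold.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = (
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
)


def is_rfc1918(address: str) -> bool:
    """True if the address is an IPv4 address inside an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


def should_promote(address: str, severity: str) -> bool:
    """Decide whether a generic atlas-exposure finding becomes an open Guard finding."""
    if not is_rfc1918(address):
        # Assumption: exposure on non-internal addresses is still tracked.
        return True
    # Assumption: internal findings are promoted only at "critical" severity.
    return severity == "critical"
```

Under this sketch, the `high` and `low` findings on `192.168.178.x` listed earlier would no longer open Guard findings, while host-audit stays responsible for explicit posture on Erik / Proxmox / Mac Studio.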
## Live Verification
After rebuild, deploy, restart, and health-triggered sync:
- `open findings = 0` in Postgres on Erik
- `scripts/test_runpod_adapter.py` on Erik no longer contains `trust_remote_code=True`
- dashboard empty-state strings for Attack Paths are present in the live HTML path
## Operational Meaning
- MAGATAMA is no longer reopening Guard noise for normal internal management ports discovered by the broad Atlas scan
- Attack Paths no longer looks “broken” when scope or data legitimately yields no graph
- RunPod dataset refresh/build is back to a working state on Erik
## TIP Policy Reminder
- TIPLLM is used only for robot/crawler planning
- Erik: controller/light workloads only
- heavy crawlers run on Proxmox / Pis