Compare commits
5 Commits
ce37d4155a...830ab57c3c
| Author | SHA1 | Date |
|---|---|---|
| | 830ab57c3c | |
| | 77a4aab592 | |
| | 9bc84a89ee | |
| | b5d9b4df03 | |
| | 364cd392c7 | |
115
sync/CURRENT.md
@@ -1,6 +1,6 @@
# Current TIP Sync State

-Updated: 2026-05-06 10:28 UTC
+Updated: 2026-05-06 15:24 UTC

## Active Policy

@@ -27,6 +27,119 @@ When work touches TIP, Magatama, LLM Gateway, bridges, auth, or shared Erik infr

## Latest Work

- MAGATAMA frontend/runtime consistency was repaired again on 2026-05-06:
  - dashboard and core were rebuilt locally and redeployed to Erik.
  - live processes restarted successfully:
    - `magatama-dashboard`
    - `magatama`
  - public `api/llm/status` now shows the true lane-export totals for `magatamallm`:
    - `collectedExamples = 15620`
    - `effectiveExamples = 15620`
    - `evalExamples = 1736`
    - `totalExamples = 17356`
    - `newSinceLastTraining = 15620`
  - root cause for the stale `1097` display:
    - the RunPod start SSE path still logged the legacy deduplicated `fixes.jsonl` corpus.
    - this was changed so RunPod launches no longer present the legacy `1097` count as the active training truth.
    - after dataset refresh the UI now emits the lane manifest totals instead.
  - RunPod completion handling was hardened:
    - worker `COMPLETED` is no longer trusted blindly.
    - MAGATAMA now scans RunPod worker logs for real training failures (`Traceback`, `SyntaxError`, non-zero exit, etc.) before treating the run as successful.
    - if the worker logs show a hidden failure, MAGATAMA records this as `completed_with_worker_failure` instead of pretending the run succeeded.
  - public findings state remains currently empty:
    - `GET /api/findings?limit=1` returned `{"findings":[],"total":0}`
    - this is now rendered with an explicit empty-state row instead of a visually blank table.
  - Attack Paths empty-state is now intentionally explicit rather than looking broken.
  - Frontend cache and scope handling were hardened:
    - cache version bumped to `2026-05-06b`
    - stale legacy `magatama_api_cache:*` entries are cleared
    - per-endpoint TTLs added
    - invalid or empty scope selections are normalized instead of silently leaving the UI in misleading empty views
  - Switchblade rack port hover was materially improved:
    - port chips now carry `data-tooltip`
    - custom tooltip CSS is live on Erik
    - the old browser-native “question mark only” behavior should be replaced by a readable hover bubble
  - Changelog self-healing was added in core:
    - stale cached changelog data older than 6h now forces a rebuild from git history
    - verified live via dashboard proxy on Erik:
      - `generatedAt = 2026-05-06T15:18:42.708Z`
      - latest visible entries include `2026-04-30` items again instead of appearing frozen at `30.05`

- MAGATAMA training + Attack Paths + Atlas exposure were corrected again on 2026-05-06:
  - the RunPod serverless training start failure was not a RunPod outage.
  - root cause was missing training scripts on Erik (`training_full_refresh.ts` and related helpers were absent under `/opt/magatama/scripts`).
  - Codex synced the full local `magatama/scripts/` tree to Erik, added a safe fallback in `scripts/model_registry_build.ts`, and synced the local `training-data/model-registry/` directory.
  - verified on Erik:
    - `pnpm training:refresh-all` now succeeds.
    - fresh dataset totals after dedupe:
      - `magatamallm`: `92,742` raw → `17,356` effective (`15,620 train / 1,736 eval`)
      - `fo_blogllm`: `32` total (`28 train / 4 eval`)
      - `tip_llm`: `40` total (`36 train / 4 eval`)
  - important nuance:
    - Codex did **not** execute the final Hugging Face publish step from Erik in this chat.
    - local/script/build failures are fixed; external dataset publish still depends on the selected dataset source and explicit publish intent.
  - MAGATAMA Attack Paths UX is no longer a misleading blank panel:
    - the page now distinguishes between:
      - no live attack paths
      - historical fallback paths
      - empty selected scope (`0 assets in scope`)
    - when a user narrows the scope to a rack/location with zero scoped assets, the graph explicitly says so instead of looking broken.
    - live dashboard HTML on Erik now contains:
      - `Im aktuellen Scope liegen 0 Assets.`
      - `Erweitere Standort oder Datacenter / Rack, damit MAGATAMA korrelierbare Assets und Pfade darstellen kann.`
      - `Ohne offene mehrstufige Korrelationen bleibt die Graph-Sicht bewusst leer.`
  - MAGATAMA code/training hardening was extended:
    - `scripts/test_runpod_adapter.py` no longer loads tokenizer/model with `trust_remote_code=True`.
    - `scripts/ollama_adapter_bridge.py` no longer loads tokenizer/model with `trust_remote_code=True`.
    - this removed the live CODE finding around `HuggingFace trust_remote_code` on Erik.
  - Atlas exposure logic was tightened to stop reopening noisy LAN management findings:
    - generic `atlas-exposure` findings now only stay operationally open for exposure that is meaningful enough to track as a finding.
    - internal RFC1918 management/service ports discovered by the broad atlas scan are no longer promoted into open Guard findings just because they exist on the LAN.
    - host-specific posture for Proxmox / Erik / Mac Studio remains the job of explicit host-audit logic.
  - after rebuild + deploy + health sync:
    - live Postgres open findings returned to `0`.
  - Follow-up hardening on the same block:
    - the earlier RunPod error path in MAGATAMA dashboard was made more truthful.
    - dataset preparation now distinguishes:
      - local `training:refresh-all` failure
      - optional Hugging Face publish failure
      - URL-based dataset mode with no external publish required
    - the training SSE flow now explicitly tells the operator whether RunPod is using:
      - Hugging Face dataset source
      - or MAGATAMA URL-bundle dataset source
    - this avoids misleading `RunPod not reachable` wording when the actual failure is in dataset preparation.
  - follow-up serverless verification on 2026-05-06 narrowed the remaining fault further:
    - MAGATAMA submit logic now verifies that a RunPod job really exists under `/status/{jobId}` instead of trusting `/run`.
    - payloads were aligned more closely with the official Axolotl serverless schema:
      - `model_type=AutoModelForCausalLM`
      - `tokenizer_type=AutoTokenizer`
      - dataset `split: train`
      - optimizer `adamw_torch_fused`
    - verified full run attempt:
      - job id `9bc4b16b-755b-465b-aadf-b46f2fe467a3-e2`
      - disappeared as `not_found_after_submit` (`404 job not found`)
    - verified canary after payload fix:
      - job id `a4ac6951-7ed7-43cb-80d8-5ab61533c2da-e2`
      - immediately materialized as `IN_QUEUE`
      - then still disappeared on later reconcile as `not_found_after_submit`
    - current conclusion:
      - the old MAGATAMA bug is fixed.
      - the remaining problem is now likely on the RunPod endpoint/release side: jobs are accepted and briefly queued, but do not survive long enough to produce a durable serverless status lifecycle.
    - operational rule:
      - do not treat `submitted` or a brief `IN_QUEUE` as proof of a usable serverless training run.
      - only trust the run once it reaches `IN_PROGRESS` or a durable terminal state with artifact evidence.
  - follow-up training count fix on 2026-05-06 corrected the Training UI source-of-truth:
    - MAGATAMA had still shown `1097` because the dashboard was counting the legacy deduplicated fix corpus instead of the current lane-specific RunPod export.
    - dashboard now prefers `training-data/runpod/magatamallm/manifest.json` for the visible MagatamaLLM training count.
    - synced current lane export to Erik and restarted `magatama-dashboard`.
    - verified public API now returns:
      - `collectedExamples = 1367`
      - `effectiveExamples = 1367`
      - `evalExamples = 152`
      - `totalExamples = 1519`
      - `newSinceLastTraining = 1367`
    - if the browser still shows `1097`, treat it as stale cached UI and hard reload.

- MAGATAMA was repaired end-to-end to a clean operational baseline:
  - live guard host-audits for Erik, Mac Studio, and Proxmox were corrected and rerun.
  - open findings were reduced all the way to `0` in Postgres.

@@ -0,0 +1,152 @@
# 2026-05-06 - MAGATAMA RunPod / Attack Paths / Atlas Exposure Fixes

## Scope

This handoff captures the follow-up fixes after MAGATAMA had already been cleaned to zero findings earlier in the day, but three practical issues remained:

1. RunPod serverless training start was failing from MAGATAMA UI.
2. Attack Paths looked empty/broken to the operator.
3. Atlas exposure findings reopened as noisy internal LAN management alerts.

## What Was Actually Broken

### 1. RunPod training did not fail because of RunPod

User-facing message:

- `RunPod nicht erreichbar`

Real root cause on Erik:

- `/opt/magatama/package.json` already referenced `training:refresh-all` and `training:refresh-all:publish`
- but `/opt/magatama/scripts/training_full_refresh.ts` and related scripts were missing remotely

Additional follow-up break:

- `scripts/model_registry_build.ts` assumed `training-data/model-registry/external-sources.json` always existed remotely

### 2. Attack Paths page looked dead

The page was not broken, but it was misleading:

- selected system scope in the screenshot had `0 Assets in Scope`
- at the same time there were either:
  - no multi-step correlated live paths, or
  - no open correlated findings

Before the fix the empty canvas looked like a defect instead of an honest empty-state.

### 3. Atlas exposure reopened 28 Guard findings

Live breakdown before the final policy fix:

- `guard | atlas-exposure | high | 9`
- `guard | atlas-exposure | low | 19`

Examples:

- `Exposure: Open ports on 192.168.178.213`
- `Exposure: Open ports on 192.168.178.2`
- `Exposure: Open ports on 192.168.178.5`

These were not “internet exposed” incidents in the meaningful operational sense; they were generic LAN/internal management ports discovered by Atlas.

## Changes Made

### RunPod training pipeline

Synced to Erik:

- full local `/Users/renefichtmueller/Desktop/Claude Code/magatama/scripts/` tree into `/opt/magatama/scripts/`
- local `training-data/model-registry/` into `/opt/magatama/training-data/model-registry/`

Patched:

- `magatama/scripts/model_registry_build.ts`

Behavior change:

- missing external metadata files now fall back safely instead of crashing the refresh step

Verified on Erik:

- `pnpm training:refresh-all` now succeeds

Fresh effective dataset totals:

- `magatamallm`: `92,742 raw -> 17,356 effective`
- `fo_blogllm`: `32 total`
- `tip_llm`: `40 total`

Important note:

- Codex did **not** perform the final external Hugging Face publish step in this chat.
- Local refresh/build path is fixed.
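
The safe fallback listed under "Behavior change" above could look roughly like the sketch below. It assumes the registry build reads `external-sources.json` through an injected reader; `loadExternalSources`, `ExternalSources`, and the `sources` field are hypothetical names chosen for illustration, and the real `model_registry_build.ts` may be shaped differently.

```typescript
// Hypothetical shape of the optional metadata file; the real schema is not shown in this handoff.
interface ExternalSources {
  sources: string[];
}

// Injected reader so the sketch does not depend on any particular file API.
type ReadText = (path: string) => string;

const EMPTY_EXTERNAL_SOURCES: ExternalSources = { sources: [] };

// Returns the parsed file, or an empty default when the file is missing or malformed.
function loadExternalSources(readText: ReadText, path: string): ExternalSources {
  let raw: string;
  try {
    raw = readText(path);
  } catch {
    // File absent on the remote host: fall back instead of crashing the refresh step.
    return EMPTY_EXTERNAL_SOURCES;
  }
  try {
    const parsed = JSON.parse(raw);
    if (parsed && Array.isArray(parsed.sources)) {
      return { sources: parsed.sources.map(String) };
    }
  } catch {
    // Malformed JSON is treated like a missing file in this sketch.
  }
  return EMPTY_EXTERNAL_SOURCES;
}
```

Returning an empty default keeps `training:refresh-all` running when optional metadata has not been synced, at the cost of silently skipping external sources; logging the fallback would be a reasonable addition.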

### Attack Paths UI

Patched:

- `magatama/packages/core/src/routes/attack-paths.ts`
- `magatama/packages/dashboard/public/index-v2.html`

Behavior change:

- if no live paths exist, MAGATAMA can still show historical correlated paths when available
- if the user-selected scope contains `0` assets, the graph now says so explicitly
- if there are simply no open multi-step correlations, the page says that honestly

Live strings now present on Erik:

- `Im aktuellen Scope liegen 0 Assets.`
- `Erweitere Standort oder Datacenter / Rack, damit MAGATAMA korrelierbare Assets und Pfade darstellen kann.`
- `Ohne offene mehrstufige Korrelationen bleibt die Graph-Sicht bewusst leer.`
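
The three cases in the behavior change above can be read as a small decision function. The sketch below is one plausible arrangement, not the code in `attack-paths.ts`: the input shape, the state names, and the choice to check the empty scope first are assumptions. The two German strings are the ones quoted above, with English glosses in comments.

```typescript
// Hypothetical input: counts the route is assumed to have computed already.
interface AttackPathsInput {
  scopedAssets: number;
  livePaths: number;
  historicalPaths: number;
}

type AttackPathsState = "live" | "historical_fallback" | "empty_scope" | "no_correlations";

// Assumed precedence: an empty scope wins, then live paths, then historical fallback.
function classifyAttackPaths(input: AttackPathsInput): AttackPathsState {
  if (input.scopedAssets === 0) return "empty_scope";
  if (input.livePaths > 0) return "live";
  if (input.historicalPaths > 0) return "historical_fallback";
  return "no_correlations";
}

// Operator-facing text per state; null means a graph is rendered instead of a message.
function emptyStateMessage(state: AttackPathsState): string | null {
  switch (state) {
    case "empty_scope":
      // "The current scope contains 0 assets."
      return "Im aktuellen Scope liegen 0 Assets.";
    case "no_correlations":
      // "Without open multi-step correlations the graph view stays intentionally empty."
      return "Ohne offene mehrstufige Korrelationen bleibt die Graph-Sicht bewusst leer.";
    default:
      return null;
  }
}
```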

### trust_remote_code hardening

Patched:

- `magatama/scripts/test_runpod_adapter.py`
- `magatama/scripts/ollama_adapter_bridge.py`

Behavior change:

- local adapter/tokenizer/model loading no longer uses `trust_remote_code=True`

Reason:

- this was causing a live MAGATAMA CODE finding on Erik:
  - `HuggingFace trust_remote_code`

### Atlas exposure policy

Patched:

- `magatama/packages/core/src/routes/health-atlas.ts`

Behavior change:

- generic Atlas portscan findings on RFC1918/internal assets are no longer automatically promoted into open Guard findings unless the exposure is critical enough to deserve operational tracking
- host-audit remains the authoritative place for explicit posture on Erik / Proxmox / Mac Studio

This removed the noisy LAN exposure findings without simply faking closure; the policy itself was corrected.
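
A minimal sketch of the corrected policy, under stated assumptions: the finding shape and the function names are hypothetical, and "critical enough" is modeled as severity `critical` only, which is consistent with the `high` and `low` findings above no longer staying open. The actual threshold in `health-atlas.ts` is not shown in this handoff.

```typescript
type Severity = "low" | "medium" | "high" | "critical";

// Hypothetical, simplified finding shape.
interface ExposureFinding {
  source: string; // e.g. "atlas-exposure"
  ip: string;
  severity: Severity;
}

// True for RFC 1918 private IPv4 addresses: 10/8, 172.16/12, 192.168/16.
function isRfc1918(ip: string): boolean {
  const parts = ip.split(".");
  if (parts.length !== 4) return false;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return false;
    const value = parseInt(part, 10);
    if (value > 255) return false;
    octets.push(value);
  }
  const first = octets[0];
  const second = octets[1];
  if (first === 10) return true;
  if (first === 172 && second >= 16 && second <= 31) return true;
  return first === 192 && second === 168;
}

// Generic Atlas findings on internal addresses stay closed unless critical (assumed threshold).
function shouldOpenGuardFinding(finding: ExposureFinding): boolean {
  if (finding.source !== "atlas-exposure") return true;
  if (!isRfc1918(finding.ip)) return true;
  return finding.severity === "critical";
}
```

With this rule, `Exposure: Open ports on 192.168.178.213` at severity `high` would not be promoted, while the same finding on a public address would still open.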

## Live Verification

After rebuild, deploy, restart, and health-triggered sync:

- `open findings = 0` in Postgres on Erik
- `scripts/test_runpod_adapter.py` on Erik no longer contains `trust_remote_code=True`
- dashboard empty-state strings for Attack Paths are present in the live HTML path

## Operational Meaning

- MAGATAMA is no longer reopening Guard noise for normal internal management ports discovered by the broad Atlas scan
- Attack Paths no longer looks “broken” when scope or data legitimately yields no graph
- RunPod dataset refresh/build is back to a working state on Erik

## TIP Policy Reminder

- TIPLLM only for robot/crawler planning
- Erik controller/light only
- heavy crawlers on Proxmox / Pis
@@ -0,0 +1,65 @@
# 2026-05-06 - MAGATAMA RunPod serverless materialization check

## Summary

MAGATAMA's RunPod submit path was hardened and re-tested against the queue-based Axolotl serverless endpoint `dheii186pfcuq7`.

## What changed

- Payload alignment was tightened toward the official Axolotl serverless schema:
  - added `model_type=AutoModelForCausalLM`
  - added `tokenizer_type=AutoTokenizer`
  - switched dataset split declaration to `split: train`
  - switched optimizer from `adamw_8bit` to `adamw_torch_fused`
- Both submit paths now distinguish between:
  - `/run` accepted
  - `/status/{job}` actually exists
- Updated files:
  - `magatama/packages/dashboard/src/server.ts`
  - `magatama/scripts/submit_runpod_training.ts`
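
The distinction between "`/run` accepted" and "`/status/{job}` actually exists" can be sketched as a pure classification step. The response shapes and the names `classifySubmit`, `submit_rejected`, and `materialized` are hypothetical; only `not_found_after_submit` appears in this note. HTTP calls are left out so the logic stays easy to test.

```typescript
// Result of POST /run as seen by the caller (hypothetical, simplified shape).
interface RunResponse {
  httpStatus: number;
  jobId: string | null;
}

// Result of GET /status/{jobId} (hypothetical, simplified shape).
interface StatusResponse {
  httpStatus: number;
  runpodStatus: string | null; // e.g. "IN_QUEUE", "IN_PROGRESS"
}

type SubmitOutcome = "submit_rejected" | "not_found_after_submit" | "materialized";

// A job only counts as materialized when the status lookup confirms it exists.
function classifySubmit(run: RunResponse, lookup: StatusResponse): SubmitOutcome {
  const accepted = run.httpStatus >= 200 && run.httpStatus < 300 && !!run.jobId;
  if (!accepted) return "submit_rejected";
  if (lookup.httpStatus === 404 || !lookup.runpodStatus) return "not_found_after_submit";
  return "materialized";
}
```

Running the same check again on a later reconcile pass is what would surface the "queued, then gone" pattern described under "Verified behavior".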

## Verified behavior

### Full run attempt

- Submitted `magatamallm` 500-step run.
- Returned job id: `9bc4b16b-755b-465b-aadf-b46f2fe467a3-e2`
- Reconcile result shortly after:
  - `not_found_after_submit`
  - HTTP `404`
  - `job not found`

### Canary run after payload/schema fix

- Submitted `magatamallm` seed-only canary.
- Returned job id: `a4ac6951-7ed7-43cb-80d8-5ab61533c2da-e2`
- Immediate submit-side verification saw real queue materialization:
  - `runpod_status: IN_QUEUE`
- Reconcile roughly 45 seconds later still observed:
  - `not_found_after_submit`
  - HTTP `404`
  - `job not found`

## Conclusion

The old MAGATAMA bug (blindly trusting `/run`) is fixed.

The remaining problem is now narrower and likely external to MAGATAMA itself:

- RunPod serverless currently accepts the submit and briefly materializes the job as `IN_QUEUE`,
- but the job disappears before a durable status/progress/completion lifecycle can be observed.

This means the endpoint/release is still not trustworthy enough for a full production training launch until it can keep a job alive beyond the initial queue stage.

## Operational rule

Do **not** treat `submitted` or even a brief `IN_QUEUE` as proof of a usable serverless training run.
A MAGATAMA serverless training run is only trustworthy when at least one of these is true:

- status progresses to `IN_PROGRESS`, or
- a durable terminal state is observed with artifact evidence.
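
The rule above reduces to a small predicate. This is a sketch, not MAGATAMA's code: the set of terminal states is an assumption, since this note names only `IN_QUEUE`, `IN_PROGRESS`, and `COMPLETED`, and how "artifact evidence" is detected is left to the caller.

```typescript
// Terminal states assumed for illustration; adjust to the statuses the endpoint actually reports.
const DURABLE_TERMINAL_STATES: string[] = ["COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"];

// `hasArtifactEvidence` is supplied by the caller, e.g. after checking for uploaded adapter files.
function isTrustworthyRun(runpodStatus: string, hasArtifactEvidence: boolean): boolean {
  if (runpodStatus === "IN_PROGRESS") return true;
  const isTerminal = DURABLE_TERMINAL_STATES.indexOf(runpodStatus) !== -1;
  return isTerminal && hasArtifactEvidence;
}
```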

## Open next step

- Inspect the actual RunPod serverless endpoint/release configuration and worker-side logs in RunPod UI.
- Only launch the full MagatamaLLM run after a canary survives beyond queue materialization.
@@ -0,0 +1,50 @@
# 2026-05-06 - MAGATAMA RunPod Status Truthfulness

## Why this was needed

After the script/registry repair, MAGATAMA could refresh the local RunPod datasets again, but the operator-facing status flow was still too coarse:

- failures in local dataset preparation
- failures in optional Hugging Face publish
- and actual RunPod availability

were too easy to confuse.

This produced the impression that “RunPod is broken” even when the real problem was just dataset preparation on Erik.

## Changes

Patched:

- `magatama/packages/dashboard/src/server.ts`

Behavior now:

- dataset source is normalized to either:
  - `huggingface`
  - `url`
- local dataset refresh (`training:refresh-all`) is wrapped with a dedicated error:
  - `Dataset-Refresh fehlgeschlagen: ...`
- Hugging Face publish is wrapped with a dedicated error:
  - `HuggingFace-Publish fehlgeschlagen: ...`
- if Hugging Face mode is selected but `HF_TOKEN` is missing, this is reported directly
- after successful preparation, the SSE stream now explicitly states:
  - Hugging Face dataset source in use
  - or URL-bundle dataset source in use, with no external publish required
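
The stage separation above could be sketched as follows. This is an illustration under assumptions, not the code in `server.ts`: the function names, the synchronous callbacks, the fallback of unknown source values to `url`, and the exact wording of the missing-token message are all hypothetical. The two German error prefixes are the ones quoted above ("dataset refresh failed", "Hugging Face publish failed").

```typescript
type DatasetSource = "huggingface" | "url";

// Maps free-form input to one of the two modes. Treating unknown values as "url" is an assumption.
function normalizeDatasetSource(raw: string | undefined): DatasetSource {
  const value = (raw || "").trim().toLowerCase();
  return value === "huggingface" ? "huggingface" : "url";
}

// Runs one preparation step and prefixes any failure with a stage-specific label.
function runStage<T>(label: string, step: () => T): T {
  try {
    return step();
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    throw new Error(label + ": " + detail);
  }
}

// Returns the line the SSE stream would report after successful preparation.
function prepareDataset(
  rawSource: string | undefined,
  hfToken: string | undefined,
  refresh: () => void,
  publish: () => void
): string {
  const source = normalizeDatasetSource(rawSource);
  runStage("Dataset-Refresh fehlgeschlagen", refresh);
  if (source === "huggingface") {
    // Hypothetical wording; the note only says the missing token is reported directly.
    if (!hfToken) throw new Error("HF_TOKEN is missing");
    runStage("HuggingFace-Publish fehlgeschlagen", publish);
    return "Hugging Face dataset source in use";
  }
  return "URL-bundle dataset source in use, no external publish required";
}
```

Because each stage carries its own prefix, a failed refresh can no longer surface as a generic "RunPod not reachable" message.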

## Live effect

The dashboard process was rebuilt and restarted on Erik after this change.

Result:

- RunPod preparation status is more honest
- operators can distinguish:
  - data refresh problem
  - optional external publish problem
  - actual RunPod training job submission/polling problem

## Notes

- This does not itself force a Hugging Face publish.
- It only makes the control plane truthful about what step is happening and what actually failed.
@@ -0,0 +1,40 @@
# 2026-05-06 - MAGATAMA training count source fix

## Summary

MAGATAMA training UI was still showing `1097` because the dashboard counted the legacy deduplicated fix corpus instead of the current lane-specific RunPod export.

## Root cause

- Dashboard training summary read `getTrainingCorpusStats()` from `gitea-learning-pool/magatamallm/fixes.jsonl`.
- Live Erik state still had a huge raw `fixes.jsonl` and an old dedupe-derived effective count path.
- The actual current training source for RunPod is the lane export under:
  - `training-data/runpod/magatamallm/magatamallm-sft-train.jsonl`
  - `training-data/runpod/magatamallm/magatamallm-sft-eval.jsonl`
  - `training-data/runpod/magatamallm/manifest.json`

## Fix

- `packages/dashboard/src/server.ts` now prefers the lane manifest for `magatamallm` training counts.
- Live summary now uses:
  - `train = 1367`
  - `eval = 152`
  - `totalAfterDedupe = 1519`
  - `duplicatesRemoved = 1368`
- Synced the current local `training-data/runpod/magatamallm/` directory to Erik.
- Restarted `magatama-dashboard`.
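
The "prefer the lane manifest" rule can be sketched as below. The manifest field names `train` and `eval` mirror the summary values quoted above but are assumptions about the real `manifest.json` schema, and `trainingCounts` is a hypothetical helper rather than the function in `server.ts`.

```typescript
// Hypothetical manifest fields, named after the summary values quoted above.
interface LaneManifest {
  train: number;
  eval: number;
}

interface TrainingCounts {
  collectedExamples: number;
  effectiveExamples: number;
  evalExamples: number;
  totalExamples: number;
}

// Prefers the lane manifest; uses the legacy corpus count only when no manifest is available.
function trainingCounts(manifest: LaneManifest | null, legacyEffective: number): TrainingCounts {
  if (manifest !== null) {
    return {
      collectedExamples: manifest.train,
      effectiveExamples: manifest.train,
      evalExamples: manifest.eval,
      totalExamples: manifest.train + manifest.eval,
    };
  }
  return {
    collectedExamples: legacyEffective,
    effectiveExamples: legacyEffective,
    evalExamples: 0,
    totalExamples: legacyEffective,
  };
}
```

With `train = 1367` and `eval = 152` this yields `totalExamples = 1519`, matching the values reported by the public API in this note.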

## Verified live

Public API now returns:

- `training.collectedExamples = 1367`
- `training.effectiveExamples = 1367`
- `training.evalExamples = 152`
- `training.totalExamples = 1519`
- `training.newSinceLastTraining = 1367`
- `training.collectionsPath = /opt/magatama/training-data/runpod/magatamallm/manifest.json`

## Operator note

If the UI still shows `1097`, it is a browser cache/stale page issue. Hard reload the MAGATAMA dashboard.
@@ -0,0 +1,137 @@
# MAGATAMA UI / Cache / RunPod / Tooltip / Changelog Fix

Date: 2026-05-06
Author: Codex

## Scope

Addressed the current MAGATAMA operator complaints in one block:

- training UI still showed `1097`
- findings page looked blank
- attack paths looked empty/broken
- Switchblade port hover only showed a help cursor / question mark
- changelog looked stale

## What Was Fixed

### 1. Training truth source

`magatamallm` RunPod launches still logged the old legacy deduplicated `fixes.jsonl` count (`1097`) during SSE startup.

This was corrected so RunPod launches now:

- still dedupe the legacy fix corpus where needed
- but no longer present that count as the operator-facing training truth
- instead emit the lane-specific RunPod manifest totals after dataset refresh

Live verified via public MAGATAMA API:

- `collectedExamples = 15620`
- `effectiveExamples = 15620`
- `evalExamples = 1736`
- `totalExamples = 17356`
- `newSinceLastTraining = 15620`

### 2. RunPod completion truthfulness

RunPod worker jobs could return `COMPLETED` even though the logs contained real training failures.

MAGATAMA now inspects worker logs for markers such as:

- `Traceback`
- `SyntaxError`
- non-zero exit status
- explicit train/fine-tune failure text

If such evidence exists, the run is recorded as worker-failed instead of being treated as a clean success.
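
A minimal sketch of such a log scan, assuming the worker log is available as plain text. `Traceback` and `SyntaxError` come from the list above; the regular expressions for exit status and failure text are guesses at plausible log wording, and `findWorkerFailure` and `resolveRunState` are hypothetical names. The state `completed_with_worker_failure` is the one recorded in `sync/CURRENT.md`.

```typescript
// Failure markers; the last two patterns are assumptions about how workers phrase failures.
const WORKER_FAILURE_PATTERNS: RegExp[] = [
  /\bTraceback\b/,
  /\bSyntaxError\b/,
  /exit(?:ed)?(?: with)? (?:code|status) [1-9]\d*/i,
  /(?:training|fine-?tun(?:e|ing)) failed/i,
];

// Returns the first matching failure marker, or null when the log looks clean.
function findWorkerFailure(logText: string): string | null {
  for (const pattern of WORKER_FAILURE_PATTERNS) {
    const match = pattern.exec(logText);
    if (match) return match[0];
  }
  return null;
}

// Downgrades a reported COMPLETED when the log contradicts it.
function resolveRunState(runpodStatus: string, logText: string): string {
  if (runpodStatus === "COMPLETED" && findWorkerFailure(logText) !== null) {
    return "completed_with_worker_failure";
  }
  return runpodStatus.toLowerCase();
}
```

Pattern matching on logs is a heuristic: it can miss failures phrased differently and can flag benign lines that merely mention an error name, so the marker list is worth revisiting against real worker output.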

### 3. Findings page no longer looks broken when empty

The live findings API currently returns:

- `findings = []`
- `total = 0`

The UI now renders an explicit empty-state row when there are no open findings or when filters hide everything, instead of leaving the table visually blank.

### 4. Attack Paths empty-state clarified

Attack Paths previously looked broken when the selected scope had zero assets.

The UI now explicitly states:

- the current scope has `0 assets`
- operators should widen location/datacenter/rack scope
- the graph stays intentionally empty when no correlated multi-step paths exist

### 5. Frontend cache + scope hardening

Frontend cache handling was improved:

- cache version bumped to `2026-05-06b`
- stale legacy `magatama_api_cache:*` entries are cleared
- per-endpoint TTLs were introduced
- invalid scope selections are normalized
- empty scoped selections reset rather than silently trapping the UI in misleading empty views
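
The cache changes above can be illustrated with a versioned key scheme. Everything below is a sketch under assumptions: the key layout (`magatama_api_cache:<version>:<endpoint>`), the TTL values, and the function names are hypothetical, and the `KeyValueStore` interface is a stand-in that `window.localStorage` happens to satisfy. Only the prefix `magatama_api_cache:` and the version `2026-05-06b` come from this note.

```typescript
// Minimal storage interface so the sketch runs outside a browser.
interface KeyValueStore {
  readonly length: number;
  key(index: number): string | null;
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

interface CacheEntry {
  storedAt: number;
  body: string;
}

const API_CACHE_VERSION = "2026-05-06b";
const CACHE_PREFIX = "magatama_api_cache:";
const VERSIONED_PREFIX = CACHE_PREFIX + API_CACHE_VERSION + ":"; // assumed key layout

// Hypothetical per-endpoint TTLs in milliseconds; the real values are not given.
const ENDPOINT_TTL_MS: { [endpoint: string]: number } = {
  "/api/findings": 15000,
  "/api/llm/status": 60000,
};
const DEFAULT_TTL_MS = 30000;

// Removes entries written under the shared prefix by any other cache version.
function clearStaleEntries(store: KeyValueStore): number {
  const stale: string[] = [];
  for (let i = 0; i < store.length; i++) {
    const k = store.key(i);
    if (k !== null && k.indexOf(CACHE_PREFIX) === 0 && k.indexOf(VERSIONED_PREFIX) !== 0) {
      stale.push(k);
    }
  }
  for (const k of stale) store.removeItem(k);
  return stale.length;
}

function cachePut(store: KeyValueStore, endpoint: string, body: string, nowMs: number): void {
  const entry: CacheEntry = { storedAt: nowMs, body: body };
  store.setItem(VERSIONED_PREFIX + endpoint, JSON.stringify(entry));
}

// Returns the cached body, or null when absent, unreadable, or older than the endpoint TTL.
function cacheGet(store: KeyValueStore, endpoint: string, nowMs: number): string | null {
  const raw = store.getItem(VERSIONED_PREFIX + endpoint);
  if (raw === null) return null;
  let entry: CacheEntry | null = null;
  try {
    entry = JSON.parse(raw) as CacheEntry;
  } catch {
    entry = null;
  }
  if (entry === null) return null;
  const ttl = ENDPOINT_TTL_MS[endpoint] || DEFAULT_TTL_MS;
  return nowMs - entry.storedAt <= ttl ? entry.body : null;
}
```

Putting the version into the key means a version bump invalidates every older entry at once, which is one way to avoid the stale `1097` display surviving a deploy.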

### 6. Switchblade port hover improved

The old port chips relied only on browser-native `title` behavior.

Now:

- port chips carry `data-tooltip`
- custom tooltip CSS is shipped live
- usage/state text should appear as a real hover bubble

Live Erik file check confirmed:

- `data-tooltip` markers present
- tooltip CSS present

### 7. Changelog self-healing

The public changelog cache in MAGATAMA core previously returned cached data indefinitely if structurally valid.

Now:

- cached changelog older than 6 hours triggers a rebuild from git history

Live verified on Erik through dashboard proxy:

- `generatedAt = 2026-05-06T15:18:42.708Z`
- latest entries include fresh `2026-04-30` material again
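
The six-hour staleness rule can be sketched as below. The `CachedChangelog` shape and the function names are hypothetical; only the `generatedAt` field and the six-hour limit come from this note, and the rebuild from git history is represented by an injected callback.

```typescript
const CHANGELOG_MAX_AGE_MS = 6 * 60 * 60 * 1000; // 6 hours, as stated above

// Hypothetical cache record; only `generatedAt` is named in this note.
interface CachedChangelog {
  generatedAt: string; // ISO 8601 timestamp
  entries: string[];
}

// A missing cache, an unreadable timestamp, or an age above six hours all count as stale.
function isChangelogStale(cached: CachedChangelog | null, nowMs: number): boolean {
  if (cached === null) return true;
  const generatedMs = Date.parse(cached.generatedAt);
  if (isNaN(generatedMs)) return true;
  return nowMs - generatedMs > CHANGELOG_MAX_AGE_MS;
}

// Serves the cache while fresh; otherwise calls `rebuild`, which stands in for the git-history rebuild.
function getChangelog(
  cached: CachedChangelog | null,
  nowMs: number,
  rebuild: () => CachedChangelog
): CachedChangelog {
  if (cached === null || isChangelogStale(cached, nowMs)) return rebuild();
  return cached;
}
```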

## Files Touched In MAGATAMA

- `packages/dashboard/public/index-v2.html`
- `packages/dashboard/src/server.ts`
- `packages/core/src/routes/changelog.ts`

## Deployment Status

Built locally and redeployed to Erik:

- dashboard dist synced
- core dist synced
- `index-v2.html` synced
- PM2 restarted:
  - `magatama-dashboard`
  - `magatama`

## Important Live Evidence

- public `api/llm/status` shows lane-export counts, not `1097`
- public `api/findings?limit=1` returns empty findings cleanly
- Erik live dashboard file contains:
  - `API_CACHE_VERSION = '2026-05-06b'`
  - `data-tooltip`
  - `Im aktuellen Scope liegen 0 Assets.`
  - `Klicken für Details`

## Open Truths

- current live findings are genuinely `0`; this is not a hidden frontend-only failure
- Attack Paths can still be empty if there are truly no scoped assets or no correlated attack stories
- RunPod serverless still needs endpoint-side reliability; the MAGATAMA-side truthfulness improvements do not by themselves fix a broken RunPod release/worker pipeline