sync: record magatama ui cache runpod tooltip changelog fix
parent 77a4aab592
commit 830ab57c3c
@@ -1,6 +1,6 @@
# Current TIP Sync State

-Updated: 2026-05-06 12:21 UTC
+Updated: 2026-05-06 15:24 UTC

## Active Policy
@@ -27,6 +27,44 @@ When work touches TIP, Magatama, LLM Gateway, bridges, auth, or shared Erik infr
## Latest Work

- MAGATAMA frontend/runtime consistency was repaired again on 2026-05-06:
  - dashboard and core were rebuilt locally and redeployed to Erik.
  - live processes restarted successfully:
    - `magatama-dashboard`
    - `magatama`
  - public `api/llm/status` now shows the true lane-export totals for `magatamallm`:
    - `collectedExamples = 15620`
    - `effectiveExamples = 15620`
    - `evalExamples = 1736`
    - `totalExamples = 17356`
    - `newSinceLastTraining = 15620`
  - root cause for the stale `1097` display:
    - the RunPod start SSE path still logged the legacy deduplicated `fixes.jsonl` corpus.
    - this was changed so RunPod launches no longer present the legacy `1097` count as the active training truth.
    - after dataset refresh the UI now emits the lane manifest totals instead.
  - RunPod completion handling was hardened:
    - worker `COMPLETED` is no longer trusted blindly.
    - MAGATAMA now scans RunPod worker logs for real training failures (`Traceback`, `SyntaxError`, non-zero exit, etc.) before treating the run as successful.
    - if the worker logs show a hidden failure, MAGATAMA records this as `completed_with_worker_failure` instead of pretending the run succeeded.
  - public findings state is currently empty:
    - `GET /api/findings?limit=1` returned `{"findings":[],"total":0}`
    - this is now rendered with an explicit empty-state row instead of a visually blank table.
  - Attack Paths empty-state is now intentionally explicit rather than looking broken.
  - Frontend cache and scope handling were hardened:
    - cache version bumped to `2026-05-06b`
    - stale legacy `magatama_api_cache:*` entries are cleared
    - per-endpoint TTLs added
    - invalid or empty scope selections are normalized instead of silently leaving the UI in misleading empty views
  - Switchblade rack port hover was materially improved:
    - port chips now carry `data-tooltip`
    - custom tooltip CSS is live on Erik
    - the old browser-native “question mark only” behavior should be replaced by a readable hover bubble
  - Changelog self-healing was added in core:
    - stale cached changelog data older than 6h now forces a rebuild from git history
    - verified live via dashboard proxy on Erik:
      - `generatedAt = 2026-05-06T15:18:42.708Z`
      - latest visible entries include `2026-04-30` items again instead of appearing frozen at `30.05`

- MAGATAMA training + Attack Paths + Atlas exposure were corrected again on 2026-05-06:
  - the RunPod serverless training start failure was not a RunPod outage.
  - root cause was missing training scripts on Erik (`training_full_refresh.ts` and related helpers were absent under `/opt/magatama/scripts`).
@@ -0,0 +1,137 @@
# MAGATAMA UI / Cache / RunPod / Tooltip / Changelog Fix

Date: 2026-05-06
Author: Codex

## Scope

Addressed the current MAGATAMA operator complaints in one block:

- training UI still showed `1097`
- findings page looked blank
- attack paths looked empty/broken
- Switchblade port hover only showed a help cursor / question mark
- changelog looked stale

## What Was Fixed

### 1. Training truth source

`magatamallm` RunPod launches still logged the legacy deduplicated `fixes.jsonl` count (`1097`) during SSE startup.

This was corrected so RunPod launches now:

- still dedupe the legacy fix corpus where needed
- but no longer present that count as the operator-facing training truth
- instead emit the lane-specific RunPod manifest totals after dataset refresh

Live verified via public MAGATAMA API:

- `collectedExamples = 15620`
- `effectiveExamples = 15620`
- `evalExamples = 1736`
- `totalExamples = 17356`
- `newSinceLastTraining = 15620`

### 2. RunPod completion truthfulness

RunPod worker jobs could return `COMPLETED` even though the logs contained real training failures.

MAGATAMA now inspects worker logs for markers such as:

- `Traceback`
- `SyntaxError`
- non-zero exit status
- explicit train/fine-tune failure text

If such evidence exists, the run is recorded as worker-failed instead of being treated as a clean success.
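
The log inspection described above could look roughly like the following. This is a minimal sketch, not the code in `server.ts`: the regular expressions, the function names `findWorkerFailures` and `classifyRun`, and the `not_completed` label are assumptions made for illustration. Only the `COMPLETED` worker status and the `completed_with_worker_failure` outcome come from the notes.

```typescript
// Outcome labels. `completed_with_worker_failure` is taken from the sync notes;
// `completed` and `not_completed` are hypothetical names for this sketch.
type RunOutcome = "completed" | "completed_with_worker_failure" | "not_completed";

// Illustrative failure markers. The real marker list in MAGATAMA may differ.
const FAILURE_PATTERNS: RegExp[] = [
  /Traceback \(most recent call last\)/,
  /\bSyntaxError\b/,
  // Matches e.g. "exited with code 1" or "exit status 2", but not a zero exit code.
  /exit(?:ed)? (?:with )?(?:code|status) [1-9]\d*/i,
  /\b(?:training|fine-?tun(?:e|ing)) failed\b/i,
];

// Return every log line that matches at least one failure marker.
function findWorkerFailures(logLines: string[]): string[] {
  return logLines.filter((line) =>
    FAILURE_PATTERNS.some((pattern) => pattern.test(line))
  );
}

// A worker-reported COMPLETED only counts as a clean success when the logs are clean.
function classifyRun(workerStatus: string, logLines: string[]): RunOutcome {
  if (workerStatus !== "COMPLETED") {
    return "not_completed";
  }
  return findWorkerFailures(logLines).length > 0
    ? "completed_with_worker_failure"
    : "completed";
}
```

Returning the matching lines from `findWorkerFailures` rather than a boolean keeps the evidence available, so it can be stored next to the run record.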

### 3. Findings page no longer looks broken when empty

The live findings API currently returns:

- `findings = []`
- `total = 0`

The UI now renders an explicit empty-state row when there are no open findings or when filters hide everything, instead of leaving the table visually blank.
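
The two empty cases named above (nothing open at all, or everything hidden by filters) can be separated with a small helper. This is a sketch under assumptions: the response shape follows the `{"findings":[],"total":0}` payload quoted in the notes, while the function name `emptyStateMessage` and the message texts are hypothetical and not the strings shipped in `index-v2.html`.

```typescript
// Shape of the findings payload as quoted in the notes.
interface FindingsResponse {
  findings: unknown[];
  total: number;
}

// Decide which placeholder row to render, or null when the table has rows.
// `visibleCount` is the number of rows left after client-side filtering.
function emptyStateMessage(resp: FindingsResponse, visibleCount: number): string | null {
  if (resp.total === 0) {
    return "No open findings."; // hypothetical wording
  }
  if (visibleCount === 0) {
    return "No findings match the current filters."; // hypothetical wording
  }
  return null;
}
```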

### 4. Attack Paths empty-state clarified

Attack Paths previously looked broken when the selected scope had zero assets.

The UI now explicitly states:

- the current scope has `0 assets`
- operators should widen location/datacenter/rack scope
- the graph stays intentionally empty when no correlated multi-step paths exist

### 5. Frontend cache + scope hardening

Frontend cache handling was improved:

- cache version bumped to `2026-05-06b`
- stale legacy `magatama_api_cache:*` entries are cleared
- per-endpoint TTLs were introduced
- invalid scope selections are normalized
- empty scoped selections reset rather than silently trapping the UI in misleading empty views
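
One way the pieces in this list fit together is sketched below. It is an illustration only: the `KeyValueStore` interface, the `MemoryStore` stand-in for browser storage, the TTL values, the versioned key layout and the `"all"` fallback scope are all assumptions. Only the `magatama_api_cache:` prefix and the `API_CACHE_VERSION` value `2026-05-06b` come from the notes.

```typescript
// Small storage abstraction (an assumption) so the sketch runs outside a browser.
interface KeyValueStore {
  keys(): string[];
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// In-memory stand-in for browser storage, used for demonstration.
class MemoryStore implements KeyValueStore {
  private data: { [key: string]: string } = {};
  keys(): string[] { return Object.keys(this.data); }
  getItem(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.data, key) ? this.data[key] : null;
  }
  setItem(key: string, value: string): void { this.data[key] = value; }
  removeItem(key: string): void { delete this.data[key]; }
}

const API_CACHE_VERSION = "2026-05-06b";
const CACHE_PREFIX = "magatama_api_cache:";
const VERSIONED_PREFIX = CACHE_PREFIX + API_CACHE_VERSION + ":";

// Hypothetical per-endpoint TTLs in milliseconds.
const ENDPOINT_TTL_MS: { [endpoint: string]: number | undefined } = {
  "/api/llm/status": 30000,
  "/api/findings": 60000,
};
const DEFAULT_TTL_MS = 120000;

// Remove cache entries written under any other (older) cache version.
function clearLegacyEntries(store: KeyValueStore): number {
  const stale = store.keys().filter(
    (k) => k.indexOf(CACHE_PREFIX) === 0 && k.indexOf(VERSIONED_PREFIX) !== 0
  );
  stale.forEach((k) => store.removeItem(k));
  return stale.length;
}

function writeCache(store: KeyValueStore, endpoint: string, data: unknown, now: number = Date.now()): void {
  store.setItem(VERSIONED_PREFIX + endpoint, JSON.stringify({ storedAt: now, data: data }));
}

// Return the cached payload, or null when missing, unreadable, or past its TTL.
function readCache<T>(store: KeyValueStore, endpoint: string, now: number = Date.now()): T | null {
  const raw = store.getItem(VERSIONED_PREFIX + endpoint);
  if (raw === null) return null;
  try {
    const entry = JSON.parse(raw) as { storedAt: number; data: T };
    const ttl = ENDPOINT_TTL_MS[endpoint] ?? DEFAULT_TTL_MS;
    return now - entry.storedAt <= ttl ? entry.data : null;
  } catch (err) {
    return null; // treat corrupt entries as a cache miss
  }
}

// Fall back to a known-good scope when the stored selection is missing or invalid.
function normalizeScope(selected: string | null, validIds: string[]): string {
  return selected !== null && validIds.indexOf(selected) !== -1 ? selected : "all";
}
```

Putting the version into the key means a version bump invalidates old entries even before `clearLegacyEntries` runs. The cleanup pass then only reclaims storage.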

### 6. Switchblade port hover improved

The old port chips relied only on browser-native `title` behavior.

Now:

- port chips carry `data-tooltip`
- custom tooltip CSS is shipped live
- usage/state text should appear as a real hover bubble

Live Erik file check confirmed:

- `data-tooltip` markers present
- tooltip CSS present

### 7. Changelog self-healing

The public changelog cache in MAGATAMA core previously returned cached data indefinitely as long as it was structurally valid.

Now:

- cached changelog older than 6 hours triggers a rebuild from git history
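
The staleness rule above can be expressed as a small age check in front of the cache. The sketch below is an assumption about the shape of that logic, not the contents of `changelog.ts`: the `entries` field, the function names and the injected `rebuild` callback are hypothetical. Only the `generatedAt` timestamp and the 6-hour limit come from the notes.

```typescript
// Cached changelog payload. `generatedAt` appears in the notes; `entries` is assumed.
interface ChangelogCache {
  generatedAt: string; // ISO 8601 timestamp
  entries: unknown[];
}

const MAX_CHANGELOG_AGE_MS = 6 * 60 * 60 * 1000; // 6 hours

// A cache is stale when it is missing, malformed, or older than the age limit.
function isChangelogStale(cache: ChangelogCache | null, now: Date = new Date()): boolean {
  if (cache === null || !Array.isArray(cache.entries)) return true;
  const generated = Date.parse(cache.generatedAt);
  if (isNaN(generated)) return true;
  return now.getTime() - generated > MAX_CHANGELOG_AGE_MS;
}

// Serve the cache while it is fresh; otherwise call `rebuild`, which stands in
// for regenerating the changelog from git history.
function getChangelog(
  cache: ChangelogCache | null,
  rebuild: () => ChangelogCache,
  now: Date = new Date()
): ChangelogCache {
  if (cache !== null && !isChangelogStale(cache, now)) return cache;
  return rebuild();
}
```

Treating an unparseable `generatedAt` as stale keeps the cache from getting stuck on a structurally valid but undated payload.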

Live verified on Erik through dashboard proxy:

- `generatedAt = 2026-05-06T15:18:42.708Z`
- latest entries include fresh `2026-04-30` material again

## Files Touched In MAGATAMA

- `packages/dashboard/public/index-v2.html`
- `packages/dashboard/src/server.ts`
- `packages/core/src/routes/changelog.ts`

## Deployment Status

Built locally and redeployed to Erik:

- dashboard dist synced
- core dist synced
- `index-v2.html` synced
- PM2 restarted:
  - `magatama-dashboard`
  - `magatama`

## Important Live Evidence

- public `api/llm/status` shows lane-export counts, not `1097`
- public `api/findings?limit=1` returns empty findings cleanly
- Erik live dashboard file contains:
  - `API_CACHE_VERSION = '2026-05-06b'`
  - `data-tooltip`
  - `Im aktuellen Scope liegen 0 Assets.` (German UI string: "The current scope contains 0 assets.")
  - `Klicken für Details` (German UI string: "Click for details")

## Open Truths

- current live findings are genuinely `0`; this is not a hidden frontend-only failure
- Attack Paths can still be empty if there are truly no scoped assets or no correlated attack stories
- RunPod serverless still needs endpoint-side reliability; the MAGATAMA-side truthfulness improvements do not by themselves fix a broken RunPod release/worker pipeline