2026-05-06 — MAGATAMA RunPod Status Truthfulness
Why this was needed
After the script/registry repair, MAGATAMA could refresh the local RunPod datasets again, but the operator-facing status flow was still too coarse. Three things were too easy to confuse:
- failures in local dataset preparation
- failures in optional Hugging Face publish
- actual RunPod availability
This produced the impression that “RunPod is broken” even when the real problem was just dataset preparation on Erik.
Changes
Patched:
magatama/packages/dashboard/src/server.ts
Behavior now:
- dataset source is normalized to either huggingface or url
- local dataset refresh (training:refresh-all) is wrapped with a dedicated error:
Dataset-Refresh fehlgeschlagen: ...
- Hugging Face publish is wrapped with a dedicated error:
HuggingFace-Publish fehlgeschlagen: ...
- if Hugging Face mode is selected but HF_TOKEN is missing, this is reported directly
- after successful preparation, the SSE stream now explicitly states:
- Hugging Face dataset source in use
- or URL-bundle dataset source in use, with no external publish required
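The source normalization, the HF_TOKEN check, and the final status line could look roughly like the sketch below. This is not the actual code in server.ts: the function names (normalizeDatasetSource, checkPreconditions, sourceStatusLine), the exact message wording, and the choice to fall back to url for unrecognized input are all assumptions made for illustration.

```typescript
// Hypothetical helpers; names and message wording are assumptions, not the real server.ts.
type DatasetSource = "huggingface" | "url";

// Normalize arbitrary request input to one of the two supported sources.
// Assumption: anything that is not exactly "huggingface" (case-insensitive,
// after trimming) falls back to "url".
function normalizeDatasetSource(raw: unknown): DatasetSource {
  const value = typeof raw === "string" ? raw.trim().toLowerCase() : "";
  return value === "huggingface" ? "huggingface" : "url";
}

// Return a directly reportable problem, or null if preparation may proceed.
function checkPreconditions(
  source: DatasetSource,
  env: Record<string, string | undefined>,
): string | null {
  if (source === "huggingface" && !env.HF_TOKEN) {
    return "Hugging Face mode selected but HF_TOKEN is missing";
  }
  return null;
}

// Status line to push onto the SSE stream after successful preparation.
function sourceStatusLine(source: DatasetSource): string {
  return source === "huggingface"
    ? "Hugging Face dataset source in use"
    : "URL-bundle dataset source in use, no external publish required";
}
```

Passing the environment in as a parameter, instead of reading process.env inside the helper, keeps the check easy to exercise without touching the real environment on Erik.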
Live effect
The dashboard process was rebuilt and restarted on Erik after this change.
Result:
- RunPod preparation status is more honest
- operators can distinguish:
- data refresh problem
- optional external publish problem
- actual RunPod training job submission/polling problem
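A minimal sketch of how stage-specific error wrapping makes these three cases distinguishable. Only the two German error prefixes come from the change itself; the helper names (prefixError, runStage, classifyFailure), the stage labels, and the rule that an unprefixed failure is attributed to the RunPod job are assumptions for illustration.

```typescript
// Hypothetical stage labels; only the prefix strings are taken from the change notes.
type Stage = "dataset-refresh" | "hf-publish";

const STAGE_PREFIX: Record<Stage, string> = {
  // German: "dataset refresh failed"
  "dataset-refresh": "Dataset-Refresh fehlgeschlagen: ",
  // German: "Hugging Face publish failed"
  "hf-publish": "HuggingFace-Publish fehlgeschlagen: ",
};

// Build a new Error whose message names the stage that failed.
function prefixError(stage: Stage, err: unknown): Error {
  const detail = err instanceof Error ? err.message : String(err);
  return new Error(STAGE_PREFIX[stage] + detail);
}

// Run one preparation step; on failure, rethrow with the stage prefix attached.
async function runStage<T>(stage: Stage, step: () => Promise<T>): Promise<T> {
  try {
    return await step();
  } catch (err) {
    throw prefixError(stage, err);
  }
}

// Map a failure message back to its origin for display.
// Assumption: a message without a known prefix came from RunPod job
// submission or polling.
function classifyFailure(message: string): Stage | "runpod-job" {
  for (const stage of Object.keys(STAGE_PREFIX) as Stage[]) {
    if (message.startsWith(STAGE_PREFIX[stage])) return stage;
  }
  return "runpod-job";
}
```

In use, the refresh call would be wrapped as runStage("dataset-refresh", ...) and the optional publish as runStage("hf-publish", ...), so a failure in either surfaces with its own prefix and is not mistaken for a RunPod outage.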
Notes
- This does not itself force a Hugging Face publish.
- It only makes the control plane truthful about what step is happening and what actually failed.