From b5d9b4df0381fe7815dcfcc7d7e8d1e359d8c576 Mon Sep 17 00:00:00 2001
From: Rene Fichtmueller <renefichtmueller@MacStudio-von-Rene-8.local>
Date: Wed, 6 May 2026 12:18:17 +0200
Subject: [PATCH] sync: record runpod status truthfulness hardening

---
 sync/CURRENT.md                               | 12 ++++-
 ...-06-magatama-runpod-status-truthfulness.md | 50 +++++++++++++++++++
 2 files changed, 61 insertions(+), 1 deletion(-)
 create mode 100644 sync/history/2026-05-06-magatama-runpod-status-truthfulness.md

diff --git a/sync/CURRENT.md b/sync/CURRENT.md
index 5348cea..918afe1 100644
--- a/sync/CURRENT.md
+++ b/sync/CURRENT.md
@@ -1,6 +1,6 @@
 # Current TIP Sync State
 
-Updated: 2026-05-06 12:02 UTC
+Updated: 2026-05-06 12:21 UTC
 
 ## Active Policy
 
@@ -60,6 +60,16 @@ When work touches TIP, Magatama, LLM Gateway, bridges, auth, or shared Erik infr
   - host-specific posture for Proxmox / Erik / Mac Studio remains the job of explicit host-audit logic.
   - after rebuild + deploy + health sync:
     - live Postgres open findings returned to `0`.
+- Follow-up hardening on the same block:
+  - the earlier RunPod error path in the MAGATAMA dashboard now reports which step actually failed.
+  - dataset preparation now distinguishes:
+    - local `training:refresh-all` failure
+    - optional Hugging Face publish failure
+    - URL-based dataset mode with no external publish required
+  - the training SSE flow now explicitly tells the operator whether RunPod is using:
+    - Hugging Face dataset source
+    - or MAGATAMA URL-bundle dataset source
+  - this avoids misleading `RunPod not reachable` wording when the actual failure is in dataset preparation.
 
 - MAGATAMA was repaired end-to-end to a clean operational baseline:
   - live guard host-audits for Erik, Mac Studio, and Proxmox were corrected and rerun.
diff --git a/sync/history/2026-05-06-magatama-runpod-status-truthfulness.md b/sync/history/2026-05-06-magatama-runpod-status-truthfulness.md
new file mode 100644
index 0000000..3752e8c
--- /dev/null
+++ b/sync/history/2026-05-06-magatama-runpod-status-truthfulness.md
@@ -0,0 +1,50 @@
+# 2026-05-06 — MAGATAMA RunPod Status Truthfulness
+
+## Why this was needed
+
+After the script/registry repair, MAGATAMA could refresh the local RunPod datasets again, but the operator-facing status flow was still too coarse. Three distinct conditions:
+
+- failures in local dataset preparation
+- failures in optional Hugging Face publish
+- and actual RunPod availability
+
+were too easy to confuse.
+
+This produced the impression that “RunPod is broken” even when the real problem was just dataset preparation on Erik.
+
+## Changes
+
+Patched:
+
+- `magatama/packages/dashboard/src/server.ts`
+
+Behavior now:
+
+- dataset source is normalized to either:
+  - `huggingface`
+  - `url`
+- local dataset refresh (`training:refresh-all`) is wrapped with a dedicated error:
+  - `Dataset-Refresh fehlgeschlagen: ...` (German for "Dataset refresh failed: ...")
+- Hugging Face publish is wrapped with a dedicated error:
+  - `HuggingFace-Publish fehlgeschlagen: ...` (German for "Hugging Face publish failed: ...")
+- if Hugging Face mode is selected but `HF_TOKEN` is missing, this is reported directly
+- after successful preparation, the SSE stream now explicitly states:
+  - Hugging Face dataset source in use
+  - or URL-bundle dataset source in use, with no external publish required
+
+## Live effect
+
+The dashboard process was rebuilt and restarted on Erik after this change.
+
+Result:
+
+- RunPod preparation status now names the step that actually failed
+- operators can distinguish:
+  - data refresh problem
+  - optional external publish problem
+  - actual RunPod training job submission/polling problem
+
+## Notes
+
+- This change does not itself force a Hugging Face publish.
+- It only makes the control plane truthful about what step is happening and what actually failed.