MagatamaLLM RunPod Adoption Closure
Date: 2026-05-09 20:25 UTC
What Changed
- Completed the MagatamaLLM RunPod training closure without launching a new paid RunPod job.
- Recovered the local adoption path after the RunPod worker had already trained and uploaded the adapter successfully.
- Deployed a MAGATAMA dashboard server fix so the live training status reflects the final adopted model instead of stale completed_not_adopted metadata.
- Synced the adoption metadata back to Erik and verified the public MAGATAMA status endpoint.
Run Details
- Lane: magatamallm
- Endpoint: 0rmkf28w2g5gip
- Job: a46de2ef-96e0-4adf-bbf8-d7a890e06c6f-e2
- Run id: magatamallm-2026-05-09T19-22-53
- HF artifact: renefichtmueller/magatama-magatamallm-magatamallm-2026-05-09t19-22-53
- Worker summary: RunPod QLoRA complete · train=605 · valid=114
- Local candidate: magatamallm-runpod-magatamallm-2026-05-09t19-22-53
- Release alias: magatama-coder-r1
- Active alias: magatama-coder:latest
- Candidate smoke: 4/5 with required threshold 4
- Direct local smoke: exact MAGATAMA-R1-READY
Failure Recovery
- First adoption failed because Mac Studio had too little free disk for GGUF conversion after writing the merged model.
- Removed only safe temporary/import blockers:
  - failed MagatamaLLM merged model.safetensors
  - FO_BlogLLM/TIP_LLM source GGUF import files that were already registered in Ollama
  - old non-active Ollama test model test-qwen32b:latest
- Active aliases remained intact:
  - magatama-coder:latest
  - fo-blog-v7
  - tip-llm-v1
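
The disk-space failure above could be caught before the merge step with a preflight check. The following is a minimal sketch, not the actual adoption script: the function names and the 1.5x headroom factor are assumptions for illustration, since this note does not record how much space the GGUF conversion needed.

```python
import shutil


def enough_space(free_bytes: int, merged_bytes: int, headroom: float = 1.5) -> bool:
    """Return True if free space covers the merged model plus conversion headroom.

    The 1.5x headroom default is a hypothetical value, not a measured one.
    """
    return free_bytes >= int(merged_bytes * headroom)


def has_room_for_conversion(path: str, merged_bytes: int, headroom: float = 1.5) -> bool:
    """Check the volume containing `path` using the standard library."""
    free_bytes = shutil.disk_usage(path).free
    return enough_space(free_bytes, merged_bytes, headroom)
```

Running such a check before writing the merged model would fail the adoption early, instead of leaving a partial model.safetensors behind for manual cleanup.
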
Dashboard Fix
- Registry ordering now uses recorded_at with fallback to completed_at, adopted_at, and created_at.
- Successful adoption version selection now accepts top-level release_alias and candidate_model, not only nested adoption.* payloads.
- Legacy MagatamaLLM baseline mismatch protection no longer invalidates the RunPod lane export.
- Deployed rebuilt packages/dashboard/dist/server.js to Erik and restarted magatama-dashboard.
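
The ordering and selection rules above can be sketched as follows. This is an illustrative Python rendering, not the code in server.js: the `status` field name, the precedence of top-level fields over nested ones, and the assumption that timestamps are ISO-8601 strings in one consistent format (so they sort lexicographically) are not confirmed by this note.

```python
from typing import Dict, List, Optional

# Fallback order taken from the fix description above.
TIMESTAMP_KEYS = ("recorded_at", "completed_at", "adopted_at", "created_at")


def sort_key(entry: Dict) -> str:
    """Return the first non-empty timestamp, or "" if the entry has none."""
    for key in TIMESTAMP_KEYS:
        value = entry.get(key)
        if value:
            return value
    return ""


def latest_adopted_version(entries: List[Dict]) -> Optional[str]:
    """Pick the version of the newest adopted registry entry.

    Accepts top-level release_alias / candidate_model as well as the
    nested adoption.* payload. The "status" key is a hypothetical name.
    """
    for entry in sorted(entries, key=sort_key, reverse=True):
        if entry.get("status") != "completed_and_adopted":
            continue
        nested = entry.get("adoption") or {}
        version = (
            entry.get("release_alias")
            or entry.get("candidate_model")
            or nested.get("release_alias")
            or nested.get("candidate_model")
        )
        if version:
            return version
    return None
```

With this shape, an entry that carries only top-level `release_alias` is no longer skipped, which matches the stale-status symptom described under What Changed.
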
Live Verification
- MAGATAMA magatamallm status:
  - activeProvider=ollama:magatama-coder:latest
  - modelVersion=magatama-coder-r1
  - lastRegistryRunStatus=completed_and_adopted
  - activeRun=null
  - hasTrustedTrainingBaseline=true
  - newSinceLastTraining=0
  - collectedExamples=1367
  - evalExamples=152
  - totalExamples=1519
- FO_BlogLLM stayed healthy:
  - modelVersion=fo-blog-v7-r1
  - activeRun=null
  - newSinceLastTraining=0
- TIP_LLM stayed healthy:
  - modelVersion=tip-llm-v1-r1
  - activeRun=null
  - newSinceLastTraining=0
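
The checks above could be automated against an already-fetched status payload. This is a minimal sketch that assumes the endpoint returns a flat JSON object using the field names listed above; the function name and the choice of checks are illustrative, and fetching the payload is left out because the endpoint URL is not recorded here.

```python
from typing import Dict, List


def check_lane_status(status: Dict, expected_version: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means healthy."""
    problems = []
    if status.get("modelVersion") != expected_version:
        problems.append("unexpected modelVersion: %r" % status.get("modelVersion"))
    if status.get("activeRun") is not None:
        problems.append("a run is still active")
    if status.get("newSinceLastTraining") != 0:
        problems.append("untrained examples pending")
    # Only lanes that report example counts are checked for consistency.
    if "totalExamples" in status:
        counted = status.get("collectedExamples", 0) + status.get("evalExamples", 0)
        if counted != status["totalExamples"]:
            problems.append("example counts do not add up")
    return problems


# Values copied from the magatamallm status above (JSON null becomes None).
magatamallm = {
    "modelVersion": "magatama-coder-r1",
    "activeRun": None,
    "newSinceLastTraining": 0,
    "collectedExamples": 1367,
    "evalExamples": 152,
    "totalExamples": 1519,
}
problems = check_lane_status(magatamallm, "magatama-coder-r1")
```

For the values reported in this note, 1367 collected plus 152 eval examples equals the reported total of 1519, so the consistency check passes.
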
Open
- Add more explicit MagatamaLLM examples for the rule: insufficient evidence means escalate/manual review rather than passive monitoring.
- Complete dual-Gitea mirroring separately.