transceiver-db/sync/history/2026-05-09-fo-blogllm-runpod-adoption-closure.md
2026-05-09 20:10:27 +02:00

# FO_BlogLLM RunPod Adoption Closure
Date: 2026-05-09 18:07 UTC
## What Changed
- Completed the FO_BlogLLM RunPod full training and local adoption path.
- Recovered from the first adoption failure caused by Mac Studio disk exhaustion during F16 GGUF conversion.
- Hardened the local importer and Train API so future RunPod jobs fail only on real candidate/adoption problems, not on missing previous baseline aliases.
- Deployed the updated MAGATAMA dashboard bundle and training helper files to Erik.
- Synced the successful adoption metadata back into MAGATAMA's remote training registry.
## Run Details
- Lane: `fo_blogllm`
- Endpoint: `0rmkf28w2g5gip`
- Job: `99d08ef2-9016-4488-ac69-3585c8a09f38-e2`
- Run id: `fo_blogllm-2026-05-09T17-14-16`
- HF artifact: `renefichtmueller/magatama-fo-blogllm-fo-blogllm-2026-05-09t17-14-16`
- Local candidate: `fo-blogllm-runpod-fo_blogllm-2026-05-09t17-14-16`
- Release alias: `fo-blog-v7-r1`
- Active alias: `fo-blog-v7`
- Candidate smoke: `5/5`
- Direct local smoke: returned exactly `FO-BLOG-V7-READY`
## Live Verification
- MAGATAMA FO_BlogLLM status:
  - `activeProvider=ollama:fo-blog-v7`
  - `modelVersion=fo-blog-v7-r1`
  - `lastRegistryRunStatus=completed_and_adopted`
  - `activeRun=null`
  - `newSinceLastTraining=0`
  - `collectedExamples=17322`
  - `evalExamples=1926`
  - `totalExamples=19267`
- TIP_LLM stayed healthy:
  - `activeProvider=ollama:tip-llm-v1`
  - `modelVersion=tip-llm-v1-r1`
  - `activeRun=null`
  - `newSinceLastTraining=0`
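
The checks above could be automated roughly as follows. This is a minimal sketch, not the actual MAGATAMA code: the field names come from the status output above, but the assumption that the status arrives as a plain dict, and the helper `verify_lane_status` itself, are hypothetical.

```python
def verify_lane_status(status, expected_alias, expected_release, registry_status=None):
    """Return a list of mismatches; an empty list means the lane looks healthy.

    Hypothetical helper. `status` is assumed to be the lane status parsed
    into a dict that uses the field names shown in the verification notes.
    """
    problems = []

    # The active provider should point at the expected Ollama alias.
    expected_provider = f"ollama:{expected_alias}"
    if status.get("activeProvider") != expected_provider:
        problems.append(f"activeProvider is {status.get('activeProvider')!r}, expected {expected_provider!r}")

    # The model version should be the expected release alias.
    if status.get("modelVersion") != expected_release:
        problems.append(f"modelVersion is {status.get('modelVersion')!r}, expected {expected_release!r}")

    # The registry status is only checked when the caller asks for it,
    # because not every lane reports it.
    if registry_status is not None and status.get("lastRegistryRunStatus") != registry_status:
        problems.append(f"lastRegistryRunStatus is {status.get('lastRegistryRunStatus')!r}")

    # No run may still be in flight, and no new examples may be pending.
    if status.get("activeRun") is not None:
        problems.append("activeRun is not null")
    if status.get("newSinceLastTraining") != 0:
        problems.append("newSinceLastTraining is not 0")

    return problems
```

For the two lanes above, `verify_lane_status(fo_status, "fo-blog-v7", "fo-blog-v7-r1", "completed_and_adopted")` and `verify_lane_status(tip_status, "tip-llm-v1", "tip-llm-v1-r1")` would both be expected to return an empty list.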
## Decisions
- Baseline smoke is comparison-only and must not block first adoption if the old active alias is missing.
- Candidate smoke remains mandatory and blocks adoption if it fails.
- Importer must keep Mac Studio safe:
  - verify enough free disk before conversion
  - delete stale partial F16 files before retry
  - delete F16 temp files in all cases
  - clean merged safetensors/bin after successful registration unless `.keep-merged` exists
- A RunPod run is not considered successful in MAGATAMA until the active Ollama alias and remote registry both reflect the new release.
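
The smoke-gate and importer decisions above can be sketched as follows. This is an illustration under assumptions, not the real importer: `adoption_gate`, `convert_and_register`, the `convert`/`register` callables, the `required_bytes` threshold, and the location of `.keep-merged` inside the merged directory are all hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional, Tuple


def adoption_gate(candidate_smoke_ok: bool, baseline_smoke_ok: Optional[bool]) -> Tuple[bool, str]:
    """Decide adoption; baseline_smoke_ok is None when the old active alias is missing."""
    if baseline_smoke_ok is None:
        note = "baseline skipped: no previous active alias"
    else:
        note = "baseline passed" if baseline_smoke_ok else "baseline failed"
    # Only the candidate smoke can block adoption; the baseline note is informational.
    return candidate_smoke_ok, note


def convert_and_register(
    merged_dir: Path,
    f16_path: Path,
    required_bytes: int,
    convert: Callable[[Path, Path], None],
    register: Callable[[Path], None],
) -> None:
    """Run conversion and registration with the disk-safety rules from the decisions."""
    # Verify enough free disk before conversion (threshold is caller-supplied).
    free = shutil.disk_usage(f16_path.parent).free
    if free < required_bytes:
        raise RuntimeError(f"need {required_bytes} bytes free, have {free}")

    # Delete a stale partial F16 file left over from an earlier failed attempt.
    f16_path.unlink(missing_ok=True)

    try:
        convert(merged_dir, f16_path)
        register(f16_path)
    finally:
        # The F16 temp file is removed in all cases, success or failure.
        f16_path.unlink(missing_ok=True)

    # Reached only after successful registration.
    if (merged_dir / ".keep-merged").exists():
        return
    for pattern in ("*.safetensors", "*.bin"):
        for weights in merged_dir.glob(pattern):
            weights.unlink()
```

Putting the F16 cleanup in `finally` means a failed conversion cannot leave a large partial file behind, while the merged weights are only removed once registration has returned without raising.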
## Open
- Repeat the same custom-worker/adoption closure for `magatamallm`.
- Complete Gitea-to-Proxmox Gitea mirroring separately.