sync: record fo blogllm adoption closure
parent 56ed88ac8c
commit 3779de5b88
@@ -1,9 +1,53 @@
 # Current TIP Sync State
 
-Updated: 2026-05-09 16:20 UTC
+Updated: 2026-05-09 18:07 UTC
 
 ## Newest Work
 
+- MAGATAMA FO_BlogLLM RunPod training and adoption closure on 2026-05-09:
+  - operator requirement:
+    - training success must only count after artifact exists, local import works, smoke tests pass, Ollama alias/version switches, remote MAGATAMA registry is updated, and the live UI reports no active stale job
+    - no repeat of failed "COMPLETED but nothing adopted" serverless runs
+    - local Mac Studio training remains throttled by default to avoid saturating the workstation
+  - RunPod job completed:
+    - endpoint `0rmkf28w2g5gip`
+    - job `99d08ef2-9016-4488-ac69-3585c8a09f38-e2`
+    - run id `fo_blogllm-2026-05-09T17-14-16`
+    - target artifact `renefichtmueller/magatama-fo-blogllm-fo-blogllm-2026-05-09t17-14-16`
+    - worker summary `RunPod QLoRA complete · train=11473 · valid=1281`
+  - failure recovered:
+    - first local adoption failed because Mac Studio disk filled during F16 GGUF conversion
+    - removed stale partial F16 GGUF and obsolete merged safetensors to restore free space
+    - hardened importer to:
+      - require minimum free disk before conversion
+      - delete stale partial F16 before retry
+      - reuse existing GGUF when present
+      - delete temporary F16 in all cases
+      - remove merged safetensors/bin after successful Ollama registration unless `.keep-merged` exists
+  - adoption completed:
+    - local candidate `fo-blogllm-runpod-fo_blogllm-2026-05-09t17-14-16`
+    - release alias `fo-blog-v7-r1`
+    - active alias `fo-blog-v7`
+    - candidate smoke `5/5` passed
+    - direct local smoke returned exact `FO-BLOG-V7-READY`
+  - dashboard/server hardening:
+    - old baseline smoke is now non-blocking when the active alias does not exist yet; candidate smoke remains mandatory
+    - deployed updated dashboard bundle, fine-tuner API template, and RunPod-Ollama importer to Erik
+    - restarted `magatama-dashboard`
+    - copied `fo_blogllm-last_run.json` and adoption report to Erik
+    - appended remote training registry event `completed_and_adopted`
+  - live verification:
+    - `fo_blogllm` reports `activeProvider=ollama:fo-blog-v7`
+    - `modelVersion=fo-blog-v7-r1`
+    - `lastRegistryRunStatus=completed_and_adopted`
+    - `activeRun=null`
+    - `collectedExamples=17322`, `evalExamples=1926`, `totalExamples=19267`
+    - `newSinceLastTraining=0`
+    - `tip_llm` remains healthy with `tip-llm-v1-r1`, `activeRun=null`, `newSinceLastTraining=0`
+  - open:
+    - run the same end-to-end custom-worker/adoption path for `magatamallm`
+    - complete dual-Gitea mirroring as separate infrastructure closure item
+
 - Near-complete detail queue closed with lightweight vendor detail verifiers on 2026-05-09:
   - operator requirement:
     - keep Erik safe; no heavy browser crawler or Playwright wave
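The operator requirement in the FO_BlogLLM entry above names six conditions that must all hold before a run counts as adopted. The following is a minimal sketch of such a gate, assuming the six facts have already been gathered into one record. `AdoptionEvidence`, `closure_blockers`, and every field name are hypothetical and are not taken from the MAGATAMA code base. The baseline smoke result is carried for comparison only, in line with the hardening note that it is non-blocking.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AdoptionEvidence:
    """Hypothetical snapshot of one run's adoption state (illustrative field names)."""
    artifact_exists: bool
    local_import_ok: bool
    candidate_smoke_passed: int
    candidate_smoke_total: int
    baseline_smoke_ok: Optional[bool]  # None when the old active alias does not exist yet
    alias_switched: bool
    registry_updated: bool
    active_run: Optional[str]  # None mirrors `activeRun=null` in the live status


def closure_blockers(ev: AdoptionEvidence) -> List[str]:
    """Return the reasons a run may not be counted as adopted; an empty list means closed."""
    blockers = []
    if not ev.artifact_exists:
        blockers.append("artifact missing")
    if not ev.local_import_ok:
        blockers.append("local import failed")
    # Candidate smoke is mandatory: every case has to pass.
    if ev.candidate_smoke_total == 0 or ev.candidate_smoke_passed < ev.candidate_smoke_total:
        blockers.append("candidate smoke failed")
    # Baseline smoke (ev.baseline_smoke_ok) is deliberately not checked here:
    # it is treated as comparison-only and never blocks.
    if not ev.alias_switched:
        blockers.append("Ollama alias/version not switched")
    if not ev.registry_updated:
        blockers.append("remote registry not updated")
    if ev.active_run is not None:
        blockers.append("stale active job reported")
    return blockers


# A remote job that finished but was never adopted locally stays blocked.
completed_only = AdoptionEvidence(True, False, 0, 5, None, False, False, None)
print(len(closure_blockers(completed_only)))  # prints 4
```

Returning the list of blockers rather than a single boolean keeps the "COMPLETED but nothing adopted" case visible: the caller can log exactly which step is still missing.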
@@ -0,0 +1,57 @@
+# FO_BlogLLM RunPod Adoption Closure
+
+Date: 2026-05-09 18:07 UTC
+
+## What Changed
+
+- Completed the FO_BlogLLM RunPod full training and local adoption path.
+- Recovered from the first adoption failure caused by Mac Studio disk exhaustion during F16 GGUF conversion.
+- Hardened the local importer and Train API so future RunPod jobs fail only on real candidate/adoption problems, not on missing previous baseline aliases.
+- Deployed the updated MAGATAMA dashboard bundle and training helper files to Erik.
+- Synced the successful adoption metadata back into MAGATAMA's remote training registry.
+
+## Run Details
+
+- Lane: `fo_blogllm`
+- Endpoint: `0rmkf28w2g5gip`
+- Job: `99d08ef2-9016-4488-ac69-3585c8a09f38-e2`
+- Run id: `fo_blogllm-2026-05-09T17-14-16`
+- HF artifact: `renefichtmueller/magatama-fo-blogllm-fo-blogllm-2026-05-09t17-14-16`
+- Local candidate: `fo-blogllm-runpod-fo_blogllm-2026-05-09t17-14-16`
+- Release alias: `fo-blog-v7-r1`
+- Active alias: `fo-blog-v7`
+- Candidate smoke: `5/5`
+- Direct local smoke: exact `FO-BLOG-V7-READY`
+
+## Live Verification
+
+- MAGATAMA FO_BlogLLM status:
+  - `activeProvider=ollama:fo-blog-v7`
+  - `modelVersion=fo-blog-v7-r1`
+  - `lastRegistryRunStatus=completed_and_adopted`
+  - `activeRun=null`
+  - `newSinceLastTraining=0`
+  - `collectedExamples=17322`
+  - `evalExamples=1926`
+  - `totalExamples=19267`
+- TIP_LLM stayed healthy:
+  - `activeProvider=ollama:tip-llm-v1`
+  - `modelVersion=tip-llm-v1-r1`
+  - `activeRun=null`
+  - `newSinceLastTraining=0`
+
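The live verification above amounts to comparing a handful of status fields against the values expected after adoption. A minimal sketch of that comparison follows, assuming the lane status is available as a flat dictionary keyed by the field names listed above. The function `status_mismatches` and the dictionary shape are assumptions; how MAGATAMA actually exposes this status is not stated in the source.

```python
from typing import Any, Dict, List, Optional


def status_mismatches(
    status: Dict[str, Any],
    alias: str,
    release: str,
    registry_status: Optional[str] = None,
) -> List[str]:
    """Compare a lane status payload against the values expected after adoption."""
    expected: Dict[str, Any] = {
        "activeProvider": f"ollama:{alias}",
        "modelVersion": release,
        "activeRun": None,          # no active or stale job
        "newSinceLastTraining": 0,  # nothing collected since the adopted run
    }
    if registry_status is not None:
        expected["lastRegistryRunStatus"] = registry_status

    problems = []
    for key, want in expected.items():
        got = status.get(key, "<missing>")
        if got != want:
            problems.append(f"{key}: expected {want!r}, got {got!r}")
    return problems


# Values copied from the FO_BlogLLM status above.
fo_blog = {
    "activeProvider": "ollama:fo-blog-v7",
    "modelVersion": "fo-blog-v7-r1",
    "lastRegistryRunStatus": "completed_and_adopted",
    "activeRun": None,
    "newSinceLastTraining": 0,
}
print(status_mismatches(fo_blog, "fo-blog-v7", "fo-blog-v7-r1", "completed_and_adopted"))  # prints []
```

The registry status is optional in this sketch because the TIP_LLM entry above does not list a `lastRegistryRunStatus` value.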
+## Decisions
+
+- Baseline smoke is comparison-only and must not block first adoption if the old active alias is missing.
+- Candidate smoke remains mandatory and blocks adoption if it fails.
+- Importer must keep Mac Studio safe:
+  - verify enough free disk before conversion
+  - delete stale partial F16 files before retry
+  - delete F16 temp files in all cases
+  - clean merged safetensors/bin after successful registration unless `.keep-merged` exists
+- A RunPod run is not considered successful in MAGATAMA until the active Ollama alias and remote registry both reflect the new release.
+
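The importer safety rules in the decisions above can be sketched as a guard around the conversion step. This is an illustration under stated assumptions, not the actual importer: the file names, the `MIN_FREE_BYTES` threshold, and the two callables standing in for the real conversion and quantization commands are all hypothetical. It also assumes the F16 GGUF is an intermediate file for a final GGUF, and it additionally removes a partially written final GGUF on failure so that a retry cannot reuse it.

```python
import shutil
from pathlib import Path
from typing import Callable

MIN_FREE_BYTES = 50 * 1024**3  # placeholder; the real threshold is not stated in the source


def convert_safely(
    work_dir: Path,
    convert_to_f16: Callable[[Path], None],   # stand-in for the real F16 conversion
    quantize: Callable[[Path, Path], None],   # stand-in for the real F16 -> final GGUF step
    min_free_bytes: int = MIN_FREE_BYTES,
) -> Path:
    f16 = work_dir / "model-f16.gguf"  # hypothetical file names
    final = work_dir / "model.gguf"

    # Delete a stale partial F16 left behind by an earlier failed attempt.
    f16.unlink(missing_ok=True)

    # Reuse an existing GGUF instead of converting again.
    if final.exists():
        return final

    # Require a minimum amount of free disk before starting the conversion.
    free = shutil.disk_usage(work_dir).free
    if free < min_free_bytes:
        raise RuntimeError(f"only {free} bytes free, need {min_free_bytes}")

    try:
        convert_to_f16(f16)
        quantize(f16, final)
    except Exception:
        final.unlink(missing_ok=True)  # do not leave a partial GGUF for a retry to reuse
        raise
    finally:
        f16.unlink(missing_ok=True)    # the temporary F16 is deleted in all cases
    return final


def cleanup_merged(merged_dir: Path) -> None:
    """Remove merged weights after successful registration unless `.keep-merged` exists."""
    if (merged_dir / ".keep-merged").exists():
        return
    for pattern in ("*.safetensors", "*.bin"):
        for path in merged_dir.glob(pattern):
            path.unlink()
```

Putting the F16 deletion in `finally` is what makes "in all cases" hold: it runs on success, on a failed conversion, and on a failed quantization alike.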
+## Open
+
+- Repeat the same custom-worker/adoption closure for `magatamallm`.
+- Complete Gitea-to-Proxmox Gitea mirroring separately.