# MAGATAMA All-Lane RunPod Training Start

Date: 2026-05-09 23:09 UTC

## Scope

- Train all current MAGATAMA LLM lanes via RunPod:
  - `magatamallm`
  - `fo_blogllm`
  - `tip_llm`
  - `pulso_llm`
  - `contact_llm`

## Preflight

- MAGATAMA services were online on Erik.
- Active RunPod endpoint reported by MAGATAMA: `0rmkf28w2g5gip`.
- RunPod worker kind: `custom-magatama`.
- Dataset source: URL-based lane export.
- Previous successful/adopted runs existed for:
  - `magatamallm`
  - `fo_blogllm`
  - `tip_llm`
- No previous run existed yet for:
  - `pulso_llm`
  - `contact_llm`

## Runner Fix

- Fixed `scripts/trigger_lane_training_once.py` locally and on Erik.
- MAGATAMA Gitea commit: `76d4054`.
- The script previously used stale API keys:
  - `iterations`
  - `seedOnly`
- The MAGATAMA training API expects:
  - `iters`
  - `seed_only`
- Added `all` mode to run all lanes sequentially.
- Added streamed SSE logging so progress is visible during long RunPod runs.

## Live Run

- Started on Erik:
  - `python3 -u scripts/trigger_lane_training_once.py all 500 false`
- Log:
  - `/opt/magatama/logs/runpod-all-lanes-20260509T230549Z.log`
- First active lane:
  - `magatamallm`
- First RunPod job:
  - `89627e7e-8533-45db-9fe8-eca994018aa6-e2`
- Initial `magatamallm` dataset:
  - `1375 train`
  - `153 eval`
  - `1528 total`

## Success Rule

- Do not treat RunPod `COMPLETED` as success by itself.
- A lane is only successful when:
  - the model artifact exists,
  - MAGATAMA imports/adopts it locally,
  - smoke checks pass,
  - the active alias/version is updated.
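
The success rule above can be sketched as a small gate that refuses to call a lane done on RunPod status alone. This is a minimal illustration, not the MAGATAMA implementation: `LaneResult`, `failed_checks`, `lane_succeeded`, and all field names are hypothetical, and requiring `COMPLETED` as one of the conditions is an assumption read into "by itself".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaneResult:
    """Hypothetical per-lane outcome record; field names are illustrative only."""
    lane: str
    runpod_status: str          # status string reported by RunPod, e.g. "COMPLETED"
    artifact_exists: bool       # the model artifact exists
    adopted_locally: bool       # MAGATAMA imported/adopted it locally
    smoke_checks_passed: bool   # smoke checks passed
    alias_updated: bool         # the active alias/version was updated


def failed_checks(result: LaneResult) -> list[str]:
    """Return the names of all conditions that are not met, in a fixed order."""
    checks = {
        # Assumption: COMPLETED is necessary but never sufficient.
        "runpod job completed": result.runpod_status == "COMPLETED",
        "model artifact exists": result.artifact_exists,
        "imported/adopted locally": result.adopted_locally,
        "smoke checks passed": result.smoke_checks_passed,
        "alias/version updated": result.alias_updated,
    }
    return [name for name, ok in checks.items() if not ok]


def lane_succeeded(result: LaneResult) -> bool:
    """A lane counts as successful only when every condition holds."""
    return not failed_checks(result)


if __name__ == "__main__":
    # RunPod reports COMPLETED, but the artifact was never adopted locally.
    partial = LaneResult(
        lane="pulso_llm",
        runpod_status="COMPLETED",
        artifact_exists=True,
        adopted_locally=False,
        smoke_checks_passed=False,
        alias_updated=False,
    )
    print(lane_succeeded(partial), failed_checks(partial))
```

Returning the list of unmet conditions, rather than a bare boolean, keeps the run log informative for lanes such as `pulso_llm` and `contact_llm` that have no previous adopted run to fall back on.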