MAGATAMA All-Lane RunPod Training Start
Date: 2026-05-09 23:09 UTC
Scope
- Train all current MAGATAMA LLM lanes via RunPod:
  - `magatamallm`
  - `fo_blogllm`
  - `tip_llm`
  - `pulso_llm`
  - `contact_llm`
Preflight
- MAGATAMA services were online on Erik.
- Active RunPod endpoint reported by MAGATAMA: `0rmkf28w2g5gip`.
- RunPod worker kind: `custom-magatama`.
- Dataset source: URL-based lane export.
- Previous successful/adopted runs existed for:
  - `magatamallm`
  - `fo_blogllm`
  - `tip_llm`
- No previous run existed yet for:
  - `pulso_llm`
  - `contact_llm`
Runner Fix
- Fixed `scripts/trigger_lane_training_once.py` locally and on Erik.
- The script previously sent stale request keys to the API:
  - `iterations`
  - `seedOnly`
- The MAGATAMA training API expects:
  - `iters`
  - `seed_only`
- Added `all` mode to run all lanes sequentially.
- Added streamed SSE logging so progress is visible during long RunPod runs.
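
A minimal sketch of the two fixes above, assuming the request body is a flat dict and the progress stream uses standard SSE framing. The function names `normalize_payload` and `iter_sse_data` are hypothetical and are not taken from `scripts/trigger_lane_training_once.py`; only the two key renames come from these notes.

```python
from typing import Iterable, Iterator

# Stale key -> key the training API expects (both pairs are from the notes).
KEY_RENAMES = {"iterations": "iters", "seedOnly": "seed_only"}


def normalize_payload(payload: dict) -> dict:
    """Return a copy of payload with stale keys renamed; other keys pass through."""
    return {KEY_RENAMES.get(key, key): value for key, value in payload.items()}


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data of each SSE event from an iterable of text lines.

    Basic SSE framing: `data:` lines accumulate and a blank line ends the
    event. Lines starting with ':' are comments (often keepalives).
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the accumulated event, if any.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Only a single leading space after the colon is stripped.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # Lenient choice for logging: flush a trailing event that never got its
    # closing blank line (a strict SSE client would drop it).
    if buffer:
        yield "\n".join(buffer)
```

Printing each yielded string with `flush=True` as it arrives is one way to keep progress visible in a log that is being tailed during a long run.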
Live Run
- Started on Erik: `python3 -u scripts/trigger_lane_training_once.py all 500 false`
- Log: `/opt/magatama/logs/runpod-all-lanes-20260509T230549Z.log`
- First active lane: `magatamallm`
- First RunPod job: `89627e7e-8533-45db-9fe8-eca994018aa6-e2`
- Initial `magatamallm` dataset:
  - 1375 train
  - 153 eval
  - 1528 total
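
The invocation and dataset counts above can be sketched as follows. This is an illustration only: it assumes the three positional arguments mean lane (or `all`), iteration count, and seed-only flag, which the notes do not state explicitly, and it assumes the lane names split as listed under Scope. `parse_args`, `lanes_to_run`, and `eval_fraction` are hypothetical names.

```python
import argparse

# Lane names as read from the Scope section (assumed split of the lane list).
LANES = ["magatamallm", "fo_blogllm", "tip_llm", "pulso_llm", "contact_llm"]


def parse_bool(text: str) -> bool:
    """Parse 'true'/'false' (case-insensitive) into a bool."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    raise argparse.ArgumentTypeError(f"expected true or false, got {text!r}")


def parse_args(argv: list) -> argparse.Namespace:
    """Parse `<lane|all> <iters> <seed_only>` (assumed argument meaning)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("lane", choices=LANES + ["all"])
    parser.add_argument("iters", type=int)
    parser.add_argument("seed_only", type=parse_bool)
    return parser.parse_args(argv)


def lanes_to_run(lane: str) -> list:
    """Expand 'all' into every lane, in order; otherwise run just the one."""
    return list(LANES) if lane == "all" else [lane]


def eval_fraction(train: int, eval_count: int, total: int) -> float:
    """Check that the split adds up and return the eval share of the total."""
    if train + eval_count != total:
        raise ValueError("train + eval does not equal total")
    return eval_count / total
```

With the counts from the log, 1375 + 153 = 1528, so the split is consistent and the eval share is roughly 10%.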
Success Rule
- Do not treat RunPod `COMPLETED` as success by itself.
- A lane is only successful when:
  - the model artifact exists,
  - MAGATAMA imports/adopts it locally,
  - smoke checks pass,
  - the active alias/version is updated.
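
One way to encode the rule above is a small gate that ignores the RunPod status and looks only at the four conditions. This is a sketch under assumptions: `LaneRunResult`, its field names, and both functions are hypothetical, and how each flag gets populated (artifact lookup, import, smoke checks, alias update) is left out.

```python
from dataclasses import dataclass


@dataclass
class LaneRunResult:
    runpod_status: str         # e.g. "COMPLETED"; recorded, never sufficient
    artifact_exists: bool      # the model artifact exists
    adopted_locally: bool      # MAGATAMA imported/adopted it locally
    smoke_checks_passed: bool  # smoke checks pass
    alias_updated: bool        # the active alias/version is updated


def failed_checks(result: LaneRunResult) -> list:
    """Return the names of the success conditions that are not met, in order."""
    checks = [
        ("model artifact", result.artifact_exists),
        ("local import/adoption", result.adopted_locally),
        ("smoke checks", result.smoke_checks_passed),
        ("alias/version update", result.alias_updated),
    ]
    return [name for name, ok in checks if not ok]


def lane_succeeded(result: LaneRunResult) -> bool:
    """A lane counts as successful only when all four conditions hold."""
    return not failed_checks(result)
```

A result with `runpod_status="COMPLETED"` but failing smoke checks is reported as unsuccessful, and `failed_checks` names what is still missing.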