# MAGATAMA Custom RunPod Worker Build/Publish Prep

Date: 2026-05-07

## What Changed

- committed and pushed the previously pending RunPod root-cause sync handoff:
  - `2a35761 sync: record runpod managed endpoint root cause`
- added a real custom-worker build/publish helper to MAGATAMA:
  - `magatama/scripts/runpod_worker_publish.sh`
- added package entrypoint:
  - `pnpm runpod:worker:publish`
- extended:
  - `magatama/packages/fine-tuner/RUNPOD.md`
  so the target end-to-end automation path is documented from lane pool through alias switch

## Erik Reality Check

- `docker` exists on Erik:
  - `/usr/bin/docker`
- `docker buildx` exists:
  - `github.com/docker/buildx v0.33.0`
- no preexisting docker registry login/config found:
  - `~/.docker/config.json` absent

Interpretation:

- Erik can act as a builder
- but cannot yet publish a worker image to GHCR/Docker Hub without credentials or a registry login
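
The checks above can be repeated with a small preflight sketch. The function names and the `DOCKER_CONFIG_FILE` override are illustrative and not part of MAGATAMA. A missing config file suggests there is no registry login, but a present file does not prove one exists, since credentials may live in a credential helper.

```shell
#!/usr/bin/env bash
# Preflight sketch: can this host build, and does it look ready to publish?
# DOCKER_CONFIG_FILE is an illustrative override, not a MAGATAMA setting.
DOCKER_CONFIG_FILE="${DOCKER_CONFIG_FILE:-$HOME/.docker/config.json}"

builder_status() {
  # Both the docker CLI and the buildx plugin must respond.
  if command -v docker >/dev/null 2>&1 && docker buildx version >/dev/null 2>&1; then
    echo "builder: ok"
  else
    echo "builder: missing docker or buildx"
  fi
}

publish_status() {
  # Only checks that the config file exists, not that a login is valid.
  if [ -f "$DOCKER_CONFIG_FILE" ]; then
    echo "publish: docker config present"
  else
    echo "publish: no docker config"
  fi
}

builder_status
publish_status
```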

## Live Remote Worker Build Attempt

Synced to Erik:

- `/opt/magatama/packages/fine-tuner/Dockerfile.runpod`
- `/opt/magatama/packages/fine-tuner/RUNPOD.md`

Then attempted:

- build image tag:
  - `magatama-runpod-worker:test`

Observed build behavior:

- base `runpod/pytorch:2.2.0-py3.10-cuda12.1.1-devel-ubuntu22.04` pulled successfully
- worker dependencies installed successfully
- build progressed through:
  - `COPY train_cuda.py runpod_handler.py ./`
  - `exporting to image`

But:

- after the build, the image did not appear in `docker images`
- so the build is still unverified and needs one clean rerun that confirms the image exists locally
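
A minimal sketch of that verification pass follows. One possible explanation, not confirmed here, is that buildx used a builder that does not export results to the local image store unless `--load` is passed. The sketch assumes the build context is the synced `fine-tuner` directory. The function names and the `DOCKER`, `IMAGE_TAG`, and `CONTEXT_DIR` variables are illustrative.

```shell
#!/usr/bin/env bash
# Sketch: rebuild the worker image and confirm it is in the local image store.
# Assumption: the build context is the synced fine-tuner directory.
DOCKER="${DOCKER:-docker}"
IMAGE_TAG="${IMAGE_TAG:-magatama-runpod-worker:test}"
CONTEXT_DIR="${CONTEXT_DIR:-/opt/magatama/packages/fine-tuner}"

build_worker() {
  # --load asks buildx to export the result into the local docker image store.
  "$DOCKER" buildx build --load \
    -f "$CONTEXT_DIR/Dockerfile.runpod" \
    -t "$IMAGE_TAG" \
    "$CONTEXT_DIR"
}

image_present() {
  # Succeeds only if the tag resolves to a local image.
  "$DOCKER" image inspect "$IMAGE_TAG" >/dev/null 2>&1
}

verify_build() {
  if image_present; then
    echo "verified: $IMAGE_TAG"
  else
    echo "missing: $IMAGE_TAG"
    return 1
  fi
}
```

Running `build_worker && verify_build` on Erik would then either confirm the image or fail loudly instead of leaving the result ambiguous.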

## Current Bottleneck

The remaining blocker is no longer MAGATAMA lane logic or adoption code.

It is now:

1. publish the custom worker image to a registry RunPod can consume
2. create a RunPod endpoint for that image, or switch the existing endpoint to it
3. set on Erik:
   - `RUNPOD_WORKER_KIND=custom-magatama`
   - `RUNPOD_ENDPOINT_ID=<custom endpoint id>`
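
Steps 1 and 3 can be sketched as below. This is not the content of `runpod_worker_publish.sh`, which is not reproduced here. The registry host, owner, version, and endpoint id arguments are placeholders, and the function names are hypothetical. Endpoint creation (step 2) is left out because it depends on how the RunPod endpoint is managed.

```shell
#!/usr/bin/env bash
# Sketch: tag and push the locally built worker image, then emit the env
# lines for Erik. All argument values are placeholders supplied by the caller.
DOCKER="${DOCKER:-docker}"
LOCAL_TAG="${LOCAL_TAG:-magatama-runpod-worker:test}"

remote_ref() {
  # Compose <registry>/<owner>/magatama-runpod-worker:<version>.
  local registry="$1" owner="$2" version="$3"
  echo "${registry}/${owner}/magatama-runpod-worker:${version}"
}

publish_worker() {
  # Requires a prior `docker login <registry>` on this host.
  local ref
  ref="$(remote_ref "$@")"
  "$DOCKER" tag "$LOCAL_TAG" "$ref" && "$DOCKER" push "$ref" && echo "$ref"
}

worker_env() {
  # Print the two settings named above for the given endpoint id.
  local endpoint_id="$1"
  echo "export RUNPOD_WORKER_KIND=custom-magatama"
  echo "export RUNPOD_ENDPOINT_ID=${endpoint_id}"
}
```

For example, `publish_worker ghcr.io <owner> <version>` would push the image and print the reference to give to RunPod, and `worker_env <custom endpoint id>` prints lines that can be added to Erik's environment.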

Only then can MAGATAMA complete the intended full automation:

- training pool refresh
- lane-specific dataset build
- RunPod fine-tune
- returned artifact reference
- local adoption/import
- smoke tests
- new release alias
- active alias switch
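
The ordering of these steps matters: the alias switch should not run if an earlier step fails. A fail-fast runner could look like the sketch below. The step names mirror the list above, but `run_step` is a hypothetical placeholder and does not call any real MAGATAMA command.

```shell
#!/usr/bin/env bash
# Sketch: run the automation steps in order and stop at the first failure.
# Step names mirror the list above; the handler is a placeholder.
PIPELINE_STEPS="pool_refresh dataset_build runpod_finetune artifact_reference local_adoption smoke_tests release_alias alias_switch"

# Hypothetical handler: replace with the real command for each step.
run_step() {
  echo "running: $1"
}

run_pipeline() {
  local step
  for step in $PIPELINE_STEPS; do
    if ! run_step "$step"; then
      # A failed step prevents every later step, including the alias switch.
      echo "stopped at: $step"
      return 1
    fi
  done
  echo "pipeline complete"
}
```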