From e6f98c89bda832ad126d7a546ec003a4c87a0778 Mon Sep 17 00:00:00 2001
From: Rene Fichtmueller
Date: Wed, 6 May 2026 20:23:53 +0200
Subject: [PATCH] sync: record magatama runpod adoption and lane truth

---
 sync/CURRENT.md                               |  81 +++++++++
 ...magatama-runpod-adoption-and-lane-truth.md | 161 ++++++++++++++++++
 2 files changed, 242 insertions(+)
 create mode 100644 sync/history/2026-05-06-magatama-runpod-adoption-and-lane-truth.md

diff --git a/sync/CURRENT.md b/sync/CURRENT.md
index df38802..012fb51 100644
--- a/sync/CURRENT.md
+++ b/sync/CURRENT.md
@@ -343,6 +343,87 @@ From 2026-04-29:
 - Last price observation: `2026-04-29 19:15:53 UTC`
 - Last stock observation: `2026-04-29 19:15:56 UTC`
 
+## Latest MAGATAMA Training / RunPod Truth
+
+Confirmed on `2026-05-06`:
+
+- Lane-specific training pools are now materially separated and no longer all fall back to `magatamallm`.
+- Live Erik dashboard API now reports:
+  - `magatamallm`
+    - `1367 train`
+    - `152 eval`
+    - `1519 total`
+    - `newSinceLastTraining = 1367`
+  - `fo_blogllm`
+    - `17353 train`
+    - `1929 eval`
+    - `19282 total`
+    - `newSinceLastTraining = 17353`
+    - active local model resolves to `fo-blog-v7`
+  - `tip_llm`
+    - `6482 train`
+    - `721 eval`
+    - `7203 total`
+    - `newSinceLastTraining = 6482`
+    - target active model is `tip-llm-v1`, but this model is not yet present locally in Ollama
+- Result:
+  - previous `1097` everywhere was stale / wrong.
+  - selected lane now controls its own manifest, model label, and training counts.
+
+### Gitea-backed Pool Materialization
+
+- `magatamallm` Gitea pool remains canonical and populated.
+- `fo_blogllm` and `tip_llm` Gitea-backed pool folders were previously almost empty; they are now materialized from the local RunPod lane exports.
+- Lane manifests and JSONL exports now exist under:
+  - `training-data/gitea-learning-pool/fo_blogllm/`
+  - `training-data/gitea-learning-pool/tip_llm/`
+
+### RunPod Completion Hardening
+
+- MAGATAMA dashboard code now treats RunPod `COMPLETED` as success only after:
+  1. target model artifact is referenced
+  2. local Mac training API adopts/imports the artifact
+  3. lane-specific smoke tests pass
+  4. active Ollama alias is updated
+- New local adoption endpoint is:
+  - `POST /adopt-runpod-model`
+
+### Mac Training API State
+
+- The old LaunchAgent on Mac Studio was still serving the legacy training API from:
+  - `~/magatama-llm/service/training_api.py`
+- It has now been upgraded in place so Erik sees the new adoption-capable API.
+- Verified from Erik:
+  - `http://192.168.178.213:3214/health` returns the new service
+  - it now exposes `register_script` pointing into the MAGATAMA repo
+  - `POST /adopt-runpod-model` exists and rejects unauthenticated requests with `401`, proving the route is live
+
+### Still Outstanding
+
+- A fully successful end-to-end RunPod fine-tune with:
+  - real worker success
+  - real artifact
+  - successful local Ollama import
+  - active alias switch
+  - smoke-test proof
+  has not yet been re-verified after the new adoption pipeline was wired in.
+- `tip-llm-v1` is still not installed locally in Ollama.
+
+### Pulso AI Recommendation
+
+- Keep a shared network/transceiver/switch core corpus with TIP.
+- Do not collapse `Pulso AI` into the same instruction lane as `TIP_LLM`.
+- Recommended split:
+  - `TIP_LLM`
+    - research
+    - crawler / scraper / robot planning
+    - vendor / firmware / issue extraction
+  - `Pulso AI`
+    - product responses
+    - support
+    - diagnostics
+    - operator explanation layer
+
 ## Safe Next Steps
 
 1. Clone or pull Gitea `origin` on laptop/Claude Code.
diff --git a/sync/history/2026-05-06-magatama-runpod-adoption-and-lane-truth.md b/sync/history/2026-05-06-magatama-runpod-adoption-and-lane-truth.md
new file mode 100644
index 0000000..b433f1e
--- /dev/null
+++ b/sync/history/2026-05-06-magatama-runpod-adoption-and-lane-truth.md
@@ -0,0 +1,161 @@
+# 2026-05-06 — MAGATAMA RunPod Adoption + Lane Truth
+
+## Scope
+
+Finalize the MAGATAMA training path so that:
+
+1. lane-specific pools are real and visible
+2. RunPod `COMPLETED` is not treated as success without a real adoptable artifact
+3. Mac Studio exposes a live adoption endpoint for post-RunPod import + smoke tests
+4. `fo_blogllm` / `tip_llm` stop inheriting stale `magatamallm` counts
+
+## What Changed
+
+### 1. Gitea-backed lane pools are now materialized
+
+The sync/build chain was extended so `fo_blogllm` and `tip_llm` are not “README-only” placeholders anymore.
+
+Current local lane export truth:
+
+- `magatamallm`
+  - `1367 train`
+  - `152 eval`
+  - `1519 total`
+- `fo_blogllm`
+  - `17353 train`
+  - `1929 eval`
+  - `19282 total`
+- `tip_llm`
+  - `6482 train`
+  - `721 eval`
+  - `7203 total`
+
+`sync_gitea_training_pool.ts` now writes lane-specific manifests and JSONL exports back into the Gitea-backed learning-pool tree.
+
+### 2. RunPod completion gating was hardened
+
+The dashboard/server path was updated so RunPod `COMPLETED` is no longer enough by itself.
+
+The intended success chain is now:
+
+1. RunPod reports terminal state
+2. target model artifact is identified
+3. Mac Studio `/adopt-runpod-model` is called
+4. local candidate model is imported into Ollama
+5. lane-specific smoke suite passes
+6. active alias is switched
+7. only then is the run treated as truly successful
+
+Registry status extensions added:
+
+- `completed_and_adopted`
+- `completed_seed_preparation`
+- `completed_not_adopted`
+
+### 3. Mac Studio training API was upgraded in place
+
+Critical discovery:
+
+- Erik already used `http://192.168.178.213:3214`
+- but the Mac LaunchAgent still served the **old** training API from:
+  - `~/magatama-llm/service/training_api.py`
+
+This old service had no `/adopt-runpod-model`.
+
+Action taken:
+
+- upgraded the LaunchAgent-targeted file in place
+- made the training API portable enough to find `register_runpod_ollama_model.py` from either:
+  - the MAGATAMA repo
+  - or fallback candidate paths
+
+Verified from Erik:
+
+- `GET /health` works
+- response now contains:
+  - `register_script: /Users/renefichtmueller/Desktop/Claude Code/magatama/scripts/register_runpod_ollama_model.py`
+- `POST /adopt-runpod-model` exists and returns `401` without auth, which proves the new route is live
+
+### 4. Lane status is now honest
+
+The live Erik dashboard API now reports lane-specific values instead of silently reusing `magatamallm`.
+
+Also fixed:
+
+- `fo_blogllm` and `tip_llm` no longer inherit a false “last successful run” from the global Mac training state
+- lane-specific active model labels are now used:
+  - `fo_blogllm` -> `fo-blog-v7`
+  - `tip_llm` -> `tip-llm-v1`
+
+## Verified Live State on Erik
+
+### `magatamallm`
+
+- available: `true`
+- activeProvider: `ollama:magatama-coder:latest`
+- `newSinceLastTraining = 1367`
+
+### `fo_blogllm`
+
+- available: `true`
+- activeProvider: `ollama:fo-blog-v7`
+- `newSinceLastTraining = 17353`
+- `lastTrainingAt = null`
+- `neverTrained = true`
+
+### `tip_llm`
+
+- available: `false`
+- activeProvider falls back to `claude-bridge`
+- target model shown as `tip-llm-v1`
+- `newSinceLastTraining = 6482`
+- `lastTrainingAt = null`
+- `neverTrained = true`
+
+Interpretation:
+
+- `tip_llm` corpus is real, but the active Ollama alias is not installed locally yet.
+
+## Pulso AI Decision
+
+Recommended architecture:
+
+- shared network / transceiver / switch knowledge core with TIP
+- separate behavior lane
+
+Meaning:
+
+- `TIP_LLM`
+  - research
+  - crawler planning
+  - issue extraction
+  - vendor / firmware / compatibility search
+- `Pulso AI`
+  - support
+  - diagnostics
+  - operational explanation
+  - customer/product answer layer
+
+Do **not** blindly reuse the exact same instruction lane for both.
+
+## Still Open
+
+1. A fresh real RunPod run still needs full end-to-end proof after the new adoption path:
+   - successful worker execution
+   - artifact exists
+   - local import succeeds
+   - smoke suite passes
+   - alias switches
+2. `tip-llm-v1` still needs local Ollama adoption/installation.
+3. Further corpus enrichment is still desirable from:
+   - local transceiver/TIP/blog pools
+   - Susan/Fearghas if accessible
+   - GitHub / universities / security research sources
+
+## Operator Notes
+
+- TIP policy remains:
+  - TIPLLM-only for robot/crawler planning
+  - Erik is light controller only
+  - heavy crawling runs on Proxmox / Pis
+- Push only `sync/` to Gitea from this handoff update.
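
Reviewer note (not part of the patch): the success chain listed under "RunPod completion gating was hardened" can be read as an ordered set of gates. The sketch below is only an illustration of that reading, not the dashboard's actual code. `RunEvidence`, `first_failed_gate`, and `classify_run` are hypothetical names; the two status strings come from the patch text. `completed_seed_preparation` is left out because the patch does not say when it applies, and raising on non-`COMPLETED` states is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Status names as they appear in the patch text.
COMPLETED_AND_ADOPTED = "completed_and_adopted"
COMPLETED_NOT_ADOPTED = "completed_not_adopted"


@dataclass
class RunEvidence:
    """Hypothetical record of what has been proven for one RunPod run."""
    runpod_state: str          # state reported by RunPod, e.g. "COMPLETED"
    artifact_identified: bool  # target model artifact is identified
    adopt_call_ok: bool        # Mac Studio /adopt-runpod-model call succeeded
    ollama_import_ok: bool     # candidate model imported into Ollama
    smoke_suite_passed: bool   # lane-specific smoke suite passed
    alias_switched: bool       # active alias was switched


def first_failed_gate(evidence: RunEvidence) -> Optional[str]:
    """Return the name of the first gate that is not satisfied, or None."""
    gates = [
        ("artifact_identified", evidence.artifact_identified),
        ("adopt_call_ok", evidence.adopt_call_ok),
        ("ollama_import_ok", evidence.ollama_import_ok),
        ("smoke_suite_passed", evidence.smoke_suite_passed),
        ("alias_switched", evidence.alias_switched),
    ]
    for name, passed in gates:
        if not passed:
            return name
    return None


def classify_run(evidence: RunEvidence) -> str:
    """Map a COMPLETED run to a registry status based on the gates above."""
    if evidence.runpod_state != "COMPLETED":
        # Assumption: other RunPod states are handled elsewhere.
        raise ValueError(f"not a COMPLETED run: {evidence.runpod_state}")
    if first_failed_gate(evidence) is None:
        return COMPLETED_AND_ADOPTED
    return COMPLETED_NOT_ADOPTED
```

Under this reading, a run whose smoke suite fails is classified as `completed_not_adopted` even though RunPod reported `COMPLETED`, and `first_failed_gate` names the step an operator would need to look at first.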