sync: record magatama runpod adoption and lane truth
parent b9a45f9f23
commit e6f98c89bd
@@ -343,6 +343,87 @@ From 2026-04-29:
- Last price observation: `2026-04-29 19:15:53 UTC`
- Last stock observation: `2026-04-29 19:15:56 UTC`

## Latest MAGATAMA Training / RunPod Truth

Confirmed on `2026-05-06`:

- Lane-specific training pools are now materially separated and no longer all fall back to `magatamallm`.
- Live Erik dashboard API now reports:
  - `magatamallm`
    - `1367 train`
    - `152 eval`
    - `1519 total`
    - `newSinceLastTraining = 1367`
  - `fo_blogllm`
    - `17353 train`
    - `1929 eval`
    - `19282 total`
    - `newSinceLastTraining = 17353`
    - active local model resolves to `fo-blog-v7`
  - `tip_llm`
    - `6482 train`
    - `721 eval`
    - `7203 total`
    - `newSinceLastTraining = 6482`
    - target active model is `tip-llm-v1`, but this model is not yet present locally in Ollama
- Result:
  - the previous count of `1097`, shown for every lane, was stale / wrong.
  - selected lane now controls its own manifest, model label, and training counts.

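The counts above can be sanity-checked mechanically. The following is a minimal sketch, not the dashboard's actual code: the dictionary layout and the function name `check_lane_counts` are assumptions made for illustration, and only the numbers come from the notes above. It verifies that `train + eval == total` for each lane and flags the earlier failure mode where every lane reported the same figure.

```python
# Lane counts copied from the notes above. The dict layout is an assumption,
# not the real dashboard API response shape.
LANE_COUNTS = {
    "magatamallm": {"train": 1367, "eval": 152, "total": 1519},
    "fo_blogllm": {"train": 17353, "eval": 1929, "total": 19282},
    "tip_llm": {"train": 6482, "eval": 721, "total": 7203},
}


def check_lane_counts(lanes):
    """Return a list of problems; an empty list means the counts look sane."""
    problems = []
    for lane, counts in lanes.items():
        if counts["train"] + counts["eval"] != counts["total"]:
            problems.append(f"{lane}: train + eval does not equal total")
    # If every lane reports the same total, the lanes are probably still
    # reading one shared pool instead of their own manifests.
    totals = {counts["total"] for counts in lanes.values()}
    if len(lanes) > 1 and len(totals) == 1:
        problems.append("all lanes report the same total (possible shared fallback)")
    return problems


if __name__ == "__main__":
    print(check_lane_counts(LANE_COUNTS) or "lane counts are consistent")
```

A check like this could run after each sync so that a regression to shared counts is caught before it reaches the dashboard.
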
### Gitea-backed Pool Materialization

- `magatamallm` Gitea pool remains canonical and populated.
- `fo_blogllm` and `tip_llm` Gitea-backed pool folders were previously almost empty; they are now materialized from the local RunPod lane exports.
- Lane manifests and JSONL exports now exist under:
  - `training-data/gitea-learning-pool/fo_blogllm/`
  - `training-data/gitea-learning-pool/tip_llm/`

### RunPod Completion Hardening

- MAGATAMA dashboard code now treats RunPod `COMPLETED` as success only after:
  1. target model artifact is referenced
  2. local Mac training API adopts/imports the artifact
  3. lane-specific smoke tests pass
  4. active Ollama alias is updated
- New local adoption endpoint is:
  - `POST /adopt-runpod-model`

### Mac Training API State

- The old LaunchAgent on Mac Studio was still serving the legacy training API from:
  - `~/magatama-llm/service/training_api.py`
- That file has now been upgraded in place so Erik sees the new adoption-capable API.
- Verified from Erik:
  - `http://192.168.178.213:3214/health` returns the new service
  - it now exposes `register_script` pointing into the MAGATAMA repo
  - `POST /adopt-runpod-model` exists and rejects unauthenticated requests with `401`, proving the route is live

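The `401` liveness check can be reproduced with a small probe. This is a sketch under stated assumptions: the helper names `classify_route_status` and `probe_adopt_route` are hypothetical, the empty JSON body is a placeholder, and the reasoning only holds if the service answers `404` for unknown paths before it checks authentication. Only standard-library `urllib` calls are used.

```python
import urllib.error
import urllib.request


def classify_route_status(status: int) -> str:
    """Map an HTTP status code to a coarse liveness verdict (hypothetical labels)."""
    if status in (401, 403):
        return "live_requires_auth"  # route exists, request was rejected for auth
    if status == 404:
        return "missing"  # route is not registered on this service
    if 200 <= status < 300:
        return "live_open"
    return "unknown"


def probe_adopt_route(base_url: str, timeout: float = 5.0) -> str:
    """POST an unauthenticated, empty JSON body and classify the response."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/adopt-runpod-model",
        data=b"{}",
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_route_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx; the status code is still what we want.
        return classify_route_status(err.code)


if __name__ == "__main__":
    # Base URL taken from the notes above; connection errors are not caught here.
    print(probe_adopt_route("http://192.168.178.213:3214"))
```

Against the legacy service, which had no such route, the same probe would be expected to report `missing` rather than `live_requires_auth`.
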
### Still Outstanding

- A fully successful end-to-end RunPod fine-tune has not yet been re-verified after the new adoption pipeline was wired in. That proof requires:
  - real worker success
  - real artifact
  - successful local Ollama import
  - active alias switch
  - smoke-test proof
- `tip-llm-v1` is still not installed locally in Ollama.

### Pulso AI Recommendation

- Keep a shared network/transceiver/switch core corpus with TIP.
- Do not collapse `Pulso AI` into the same instruction lane as `TIP_LLM`.
- Recommended split:
  - `TIP_LLM`
    - research
    - crawler / scraper / robot planning
    - vendor / firmware / issue extraction
  - `Pulso AI`
    - product responses
    - support
    - diagnostics
    - operator explanation layer

## Safe Next Steps

1. Clone or pull Gitea `origin` on laptop/Claude Code.

@@ -0,0 +1,161 @@
# 2026-05-06 — MAGATAMA RunPod Adoption + Lane Truth

## Scope

Finalize the MAGATAMA training path so that:

1. lane-specific pools are real and visible
2. RunPod `COMPLETED` is not treated as success without a real adoptable artifact
3. Mac Studio exposes a live adoption endpoint for post-RunPod import + smoke tests
4. `fo_blogllm` / `tip_llm` stop inheriting stale `magatamallm` counts

## What Changed

### 1. Gitea-backed lane pools are now materialized

The sync/build chain was extended so `fo_blogllm` and `tip_llm` are no longer “README-only” placeholders.

Current local lane export truth:

- `magatamallm`
  - `1367 train`
  - `152 eval`
  - `1519 total`
- `fo_blogllm`
  - `17353 train`
  - `1929 eval`
  - `19282 total`
- `tip_llm`
  - `6482 train`
  - `721 eval`
  - `7203 total`

`sync_gitea_training_pool.ts` now writes lane-specific manifests and JSONL exports back into the Gitea-backed learning-pool tree.

### 2. RunPod completion gating was hardened

The dashboard/server path was updated so RunPod `COMPLETED` is no longer enough by itself.

The intended success chain is now:

1. RunPod reports terminal state
2. target model artifact is identified
3. Mac Studio `/adopt-runpod-model` is called
4. local candidate model is imported into Ollama
5. lane-specific smoke suite passes
6. active alias is switched
7. only then is the run treated as truly successful

Registry status extensions added:

- `completed_and_adopted`
- `completed_seed_preparation`
- `completed_not_adopted`

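The success chain above amounts to an all-gates-must-pass check. The sketch below is illustrative only and is not the dashboard's implementation: the `RunOutcome` fields and the `registry_status` function are hypothetical names. It maps a `COMPLETED` run to `completed_and_adopted` or `completed_not_adopted`; `completed_seed_preparation` is deliberately left out because these notes do not define when it applies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunOutcome:
    """Hypothetical record of one RunPod run; field names are assumptions."""
    runpod_state: str            # state string reported by RunPod, e.g. "COMPLETED"
    artifact_identified: bool    # step 2: target model artifact is identified
    adopt_call_ok: bool          # step 3: /adopt-runpod-model call succeeded
    imported_into_ollama: bool   # step 4: candidate model imported locally
    smoke_suite_passed: bool     # step 5: lane-specific smoke suite passed
    alias_switched: bool         # step 6: active alias now points at the new model


def registry_status(run: RunOutcome) -> Optional[str]:
    """Return a registry status for COMPLETED runs, or None for any other state."""
    if run.runpod_state != "COMPLETED":
        return None  # other states are outside the scope of this sketch
    gates = (
        run.artifact_identified,
        run.adopt_call_ok,
        run.imported_into_ollama,
        run.smoke_suite_passed,
        run.alias_switched,
    )
    # A single failed gate is enough to withhold "adopted".
    return "completed_and_adopted" if all(gates) else "completed_not_adopted"


if __name__ == "__main__":
    run = RunOutcome("COMPLETED", True, True, True, False, False)
    print(registry_status(run))
```

Keeping the gates as explicit booleans makes it possible to record which step failed, which matters when the run itself completed but the smoke suite did not.
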
### 3. Mac Studio training API was upgraded in place

Critical discovery:

- Erik already used `http://192.168.178.213:3214`
- but the Mac LaunchAgent still served the **old** training API from:
  - `~/magatama-llm/service/training_api.py`

This old service had no `/adopt-runpod-model`.

Action taken:

- upgraded the LaunchAgent-targeted file in place
- made the training API portable enough to find `register_runpod_ollama_model.py` from either:
  - the MAGATAMA repo
  - or fallback candidate paths

Verified from Erik:

- `GET /health` works
- response now contains:
  - `register_script: /Users/renefichtmueller/Desktop/Claude Code/magatama/scripts/register_runpod_ollama_model.py`
- `POST /adopt-runpod-model` exists and returns `401` without auth, which proves the new route is live

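The script lookup described above can be pictured as a first-match search over candidate locations. This is a minimal sketch, not the code in `training_api.py`: the function name `find_register_script` and its parameters are hypothetical, and the `scripts/` subdirectory is inferred from the `register_script` path shown in the health response.

```python
from pathlib import Path
from typing import Iterable, Optional

SCRIPT_NAME = "register_runpod_ollama_model.py"


def find_register_script(
    repo_root: Path, fallback_dirs: Iterable[Path] = ()
) -> Optional[Path]:
    """Return the first existing copy of the register script, or None.

    The repo copy (under scripts/) is preferred; fallback directories are
    tried in the order given.
    """
    candidates = [Path(repo_root) / "scripts" / SCRIPT_NAME]
    candidates.extend(Path(directory) / SCRIPT_NAME for directory in fallback_dirs)
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Both paths below are placeholders, not the real deployment layout.
    found = find_register_script(Path.home() / "magatama", [Path.home() / "magatama-llm"])
    print(found if found else "register script not found")
```

Returning `None` instead of raising lets the health endpoint report a missing script without taking the whole service down.
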
### 4. Lane status is now honest

The live Erik dashboard API now reports lane-specific values instead of silently reusing `magatamallm`.

Also fixed:

- `fo_blogllm` and `tip_llm` no longer inherit a false “last successful run” from the global Mac training state
- lane-specific active model labels are now used:
  - `fo_blogllm` -> `fo-blog-v7`
  - `tip_llm` -> `tip-llm-v1`

## Verified Live State on Erik

### `magatamallm`

- available: `true`
- activeProvider: `ollama:magatama-coder:latest`
- `newSinceLastTraining = 1367`

### `fo_blogllm`

- available: `true`
- activeProvider: `ollama:fo-blog-v7`
- `newSinceLastTraining = 17353`
- `lastTrainingAt = null`
- `neverTrained = true`

### `tip_llm`

- available: `false`
- activeProvider falls back to `claude-bridge`
- target model shown as `tip-llm-v1`
- `newSinceLastTraining = 6482`
- `lastTrainingAt = null`
- `neverTrained = true`

Interpretation:

- `tip_llm` corpus is real, but the active Ollama alias is not installed locally yet.

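The availability and fallback behavior above can be sketched as a lookup of the lane's target model among the locally installed Ollama models. This is an assumption-laden illustration, not the dashboard's logic: `resolve_active_provider` and its return shape are hypothetical, and the list of installed names is assumed to come from something like `ollama list`. The one Ollama convention relied on is that a model name without a tag refers to `:latest`.

```python
from typing import Dict, List

FALLBACK_PROVIDER = "claude-bridge"  # fallback name taken from the notes above


def _with_tag(name: str) -> str:
    """Ollama treats an untagged model name as ':latest'."""
    return name if ":" in name else name + ":latest"


def resolve_active_provider(target_model: str, installed_models: List[str]) -> Dict[str, object]:
    """Pick the Ollama provider if the target model is installed, else fall back."""
    installed = {_with_tag(name) for name in installed_models}
    if _with_tag(target_model) in installed:
        return {"available": True, "activeProvider": "ollama:" + target_model}
    return {
        "available": False,
        "activeProvider": FALLBACK_PROVIDER,
        "targetModel": target_model,
    }


if __name__ == "__main__":
    installed = ["magatama-coder:latest", "fo-blog-v7:latest"]  # example data only
    for model in ("magatama-coder:latest", "fo-blog-v7", "tip-llm-v1"):
        print(model, resolve_active_provider(model, installed))
```

With that example data, `tip-llm-v1` resolves to the fallback while the other two resolve to their Ollama providers, which matches the live state recorded above.
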
## Pulso AI Decision

Recommended architecture:

- shared network / transceiver / switch knowledge core with TIP
- separate behavior lane

Meaning:

- `TIP_LLM`
  - research
  - crawler planning
  - issue extraction
  - vendor / firmware / compatibility search
- `Pulso AI`
  - support
  - diagnostics
  - operational explanation
  - customer/product answer layer

Do **not** blindly reuse the exact same instruction lane for both.

## Still Open

1. A fresh real RunPod run still needs full end-to-end proof after the new adoption path:
   - successful worker execution
   - artifact exists
   - local import succeeds
   - smoke suite passes
   - alias switches
2. `tip-llm-v1` still needs local Ollama adoption/installation.
3. Further corpus enrichment is still desirable from:
   - local transceiver/TIP/blog pools
   - Susan/Fearghas if accessible
   - GitHub / universities / security research sources

## Operator Notes

- TIP policy remains:
  - TIPLLM-only for robot/crawler planning
  - Erik is light controller only
  - heavy crawling runs on Proxmox / Pis
- Push only `sync/` to Gitea from this handoff update.