Compare commits
No commits in common. "01d0365fbf041e120bbfddee1e7f3b379f1593eb" and "a0ea4ccbae239cd1861f93703ae530f59bf8d589" have entirely different histories.
01d0365fbf
...
a0ea4ccbae
159
sync/CURRENT.md
@@ -1,6 +1,6 @@
# Current TIP Sync State

-Updated: 2026-05-07 02:58 UTC
+Updated: 2026-05-06 22:55 UTC

## Active Policy

@@ -27,163 +27,6 @@ When work touches TIP, Magatama, LLM Gateway, bridges, auth, or shared Erik infr

## Latest Work

- MAGATAMA live follow-up on 2026-05-07:
  - local Mac training API was rechecked after the lane-specific automation changes.
  - current live truth:
    - LaunchAgent `org.fichtmueller.magatama-train-api` is present and running
    - process listens on `*:3214`
    - localhost health now responds when checked outside sandbox restrictions:
      - `GET http://127.0.0.1:3214/health`
      - response:
        - `status = ok`
        - `service = magatama-train-api`
        - `running = false`
        - `pid = null`
        - `updated_at = 2026-05-07T04:14:23Z`
  - interpretation:
    - the training API itself is healthy and reachable
    - it is currently idle, not broken
    - the actual next proof point must come from a fresh lane run that writes lane-specific `*-last_run.json`
  - live Attack Paths UI bug was fixed and deployed to Erik:
    - root cause:
      - the `Open Fix Guidance` button inside the attack-path side panel only triggered a dummy toast and never opened a real finding/ticket detail
    - fix:
      - `magatama/packages/dashboard/public/index-v2.html`
      - new helper:
        - `openFixGuidanceForNode(nodeId)`
      - behavior:
        - if the clicked graph node maps to a real finding ID, MAGATAMA now opens the existing ticket/finding detail drawer via `openTicket(id)`
        - if the node is only a synthetic path node with no backing finding, MAGATAMA now shows an explicit warning instead of pretending to open guidance
    - live deployment:
      - updated `index-v2.html` was rsynced to:
        - `/opt/magatama/packages/dashboard/public/index-v2.html`
      - `pm2 restart magatama-dashboard` executed on Erik
      - deployed file on Erik verified with:
        - `openFixGuidanceForNode`
        - `Open Fix Guidance`
    - operator consequence:
      - Attack Paths no longer contain a placebo “Open Fix Guidance” action
      - clicking it should now open the actual MAGATAMA finding/ticket guidance path when the graph node represents a real finding

- MAGATAMA training automation was hardened locally on 2026-05-07 for all three lanes:
  - target lanes:
    - `magatamallm`
    - `fo_blogllm`
    - `tip_llm`
  - core root cause confirmed:
    - RunPod dataset refresh / lane export already worked
    - RunPod jobs often reached `COMPLETED`
    - but model adoption/version truth still depended on a single shared file:
      - `~/magatama-llm/fine-tuning/last_run.json`
    - this made lane status and successful return/adoption ambiguous across models
    - the training modal could also collapse late stream/adoption failures into a generic `network error`
  - local code fixes now in place:
    - `magatama/packages/fine-tuner/training_api.py`
      - lane-specific last-run files added:
        - `~/magatama-llm/fine-tuning/magatamallm-last_run.json`
        - `~/magatama-llm/fine-tuning/fo_blogllm-last_run.json`
        - `~/magatama-llm/fine-tuning/tip_llm-last_run.json`
      - legacy `last_run.json` remains only as a backward-compatible mirror for `magatamallm`
      - successful RunPod adoption now creates:
        - a release alias per lane, e.g. `<active-alias>-rN`
      - active alias switching sequence is now:
        - candidate model imported
        - smoke-tested
        - release alias created
        - stable active alias repointed to that release alias
      - adoption report now includes:
        - `version_counter`
        - `release_alias`
    - `magatama/packages/fine-tuner/train.py`
      - local metrics writing now also respects lane-specific last-run files via `TRAINING_LANE`
    - `magatama/packages/dashboard/src/server.ts`
      - `/api/llm/status` now reads lane-specific last-run metadata first
      - `release_alias` is preferred as the visible model version when present
      - RunPod SSE catch now distinguishes:
        - real generic training failure
        - `COMPLETED` but no artifact / failed adoption
      - the latter is now rendered as a truthful return/adoption failure, not a vague dataset/network issue
    - `magatama/packages/dashboard/public/index-v2.html`
      - training modal now suppresses a misleading late generic `network error` if the server already emitted a terminal training status
      - if the stream ends without a final terminal server event, the UI now explicitly says the registry/adoption state must be checked
      - if the backend reports:
        - completed without artifact
        - completed without HF model
        - completed but adoption failed

        the modal now shows that exact reason
  - local verification:
    - `python3 -m py_compile` passed for:
      - `training_api.py`
      - `train.py`
    - dashboard build passed:
      - `pnpm -C packages/dashboard build`
  - current operational blocker:
    - live deployment to Erik was **not yet completed in this step**
    - direct SSH checks returned:
      - `Connection refused`
      - then `Operation timed out`
    - because of that, the new lane-specific automation logic is locally ready, but not yet confirmed live on Erik for the currently running lanes:
      - `tip_llm`
      - `fo_blogllm`
  - practical consequence:
    - the code path is now prepared for full automation:
      - pull from lane-specific training pool
      - train on RunPod
      - verify artifact existence
      - adopt locally
      - create new release alias/version
      - repoint stable active alias
      - show truthful status in UI
    - but the current live Erik run still needs redeploy + verification once SSH is reachable again

- MAGATAMA local MagatamaLLM training state was re-verified on 2026-05-07:
  - result:
    - the lane export / dataset refresh worked
    - a new locally adopted MagatamaLLM model did **not** land
    - active MAGATAMA provider remains the older alias:
      - `ollama:magatama-coder:latest`
  - live/public evidence:
    - `GET https://magatama.fichtmueller.org/api/llm/status`
      - `activeProvider = ollama:magatama-coder:latest`
      - `autoFixProvider = ollama:magatama-coder:latest`
      - `training.lastTrainingAt = 2026-05-06T22:43:20Z`
      - `training.modelVersion = magatama-coder:latest`
      - `training.activeRun = null`
    - this means the UI timestamp currently reflects the latest dataset/training-state update, not proof of a newly adopted local model.
  - local Mac evidence:
    - `ollama list` still shows:
      - `magatama-coder:latest` → modified `3 weeks ago`
      - `magatama-llm-v2-0:latest` → modified `11 days ago`
    - no newer Magatama candidate/import alias appeared locally
  - registry/adoption evidence:
    - Erik lane manifest exists and is fresh:
      - `/opt/magatama/training-data/runpod/magatamallm/manifest.json`
      - `generatedAt = 2026-05-06T22:45:15.944Z`
      - `train = 15679`
      - `eval = 1743`
      - `total = 17422`
    - but Erik had no populated local adoption/registry state files in:
      - `/opt/magatama/training-data/model-registry/models.json`
      - `/opt/magatama/training-data/model-registry/runs.json`
      - `/opt/magatama/training-data/model-registry/active.json`
      - `/opt/magatama/data/llm-status.json`
    - local repo only had historical `training-data/model-registry/training-runs.json`
  - historical run evidence:
    - recent `magatamallm` training-run records still show:
      - `submitted`
      - then `not_found_after_submit`
      - or other non-adopted / worker-failure states
    - there is still no verified “completed_and_adopted” proof for a new MagatamaLLM local model.
  - operational conclusion:
    - current truth:
      - dataset/lane preparation works
      - local model adoption is still the missing step
      - MAGATAMA does **not** currently know more than the already active `magatama-coder:latest` alias
    - next fix block remains:
      - make RunPod/local completion count only when adoption succeeds
      - persist adoption report + model registry state
      - update active alias and version only after smoke-tested import succeeds

- MAGATAMA Switchblade port intelligence is now truly flowing end-to-end on 2026-05-06:
  - live root cause:
    - Switchblade itself already had the rich SG350 data (`description`, LLDP neighbor, peer port, octets), but MAGATAMA had still shown mostly flat port chips.

@@ -1,76 +0,0 @@
# MAGATAMA Attack-Path Fix Guidance Live Deploy

Date: 2026-05-07 UTC

## Scope

- MAGATAMA attack-path side panel
- local Mac training API reachability/truth check

## Findings

### 1. `Open Fix Guidance` was a placebo button

The Attack Paths detail sidebar rendered a real CTA labeled `Open Fix Guidance`, but the click handler only executed:

- `toast('Fix guidance opened','info')`

No real drawer, ticket, or finding guidance path opened from that action.

### 2. Local training API was not dead; it was just idle

The local training API service for MAGATAMA lane automation is managed by:

- `org.fichtmueller.magatama-train-api`

Live checks showed:

- LaunchAgent state: running
- port listener on `*:3214`
- health response on localhost when checked outside sandbox restrictions:
  - `status = ok`
  - `service = magatama-train-api`
  - `running = false`
  - `pid = null`

Interpretation:

- the API process is healthy and reachable
- it is currently idle between runs
- the remaining proof point for automation is a fresh lane training run that writes back lane-specific run metadata and completes local adoption/version switching

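The interpretation above can be sketched as a small classifier over the health payload. This is only a sketch: the field names (`status`, `running`, `pid`) come from the observed response, while the function name `classify_health`, the three labels, and the assumption that the payload is a flat JSON object are hypothetical.

```python
from typing import Any, Mapping


def classify_health(payload: Mapping[str, Any]) -> str:
    """Map a /health payload to 'unhealthy', 'training', or 'idle'.

    Assumption: the payload is a flat JSON object carrying the fields
    observed above. The labels are illustrative, not part of the API.
    """
    if payload.get("status") != "ok":
        return "unhealthy"
    # An active run is assumed to report running = true together with a pid.
    if payload.get("running") and payload.get("pid") is not None:
        return "training"
    # status = ok with running = false / pid = null means idle, not broken.
    return "idle"


observed = {
    "status": "ok",
    "service": "magatama-train-api",
    "running": False,
    "pid": None,
}
print(classify_health(observed))  # idle
```

Keeping "idle" distinct from "unhealthy" mirrors the finding: a reachable API with no active run is not a failure.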
## Fix Applied

File:

- `magatama/packages/dashboard/public/index-v2.html`

Changes:

- added `openFixGuidanceForNode(nodeId)`
- `showNodeDetail(n)` now wires the CTA to the new helper instead of a toast
- if the graph node maps to a real finding:
  - MAGATAMA opens the existing finding/ticket detail via `openTicket(id)`
- if the node is synthetic and has no backing finding:
  - MAGATAMA now shows a clear warning toast instead of pretending guidance opened

## Live Deployment

Updated file copied to Erik:

- `/opt/magatama/packages/dashboard/public/index-v2.html`

Dashboard restarted:

- `pm2 restart magatama-dashboard`

Remote file verification confirmed presence of:

- `openFixGuidanceForNode`
- `Open Fix Guidance`

## Operational Result

- Attack Paths no longer expose a fake remediation CTA
- the CTA now routes into the actual MAGATAMA guidance/detail path when the node represents a real finding
- local training API health is confirmed, but lane-specific successful return/adoption still needs validation with a fresh real training run

@@ -1,170 +0,0 @@
# MAGATAMA Lane-Specific RunPod Adoption + Versioning

Date: 2026-05-07

## Scope

Harden MAGATAMA training automation for:

- `magatamallm`
- `fo_blogllm`
- `tip_llm`

Goal:

- lane-specific training pools remain isolated
- RunPod `COMPLETED` counts only when model return/adoption is real
- active lane model gets a new release/version marker after successful adoption
- dashboard status and errors remain truthful

## Problem

The data/build side of training already worked:

- lane-specific RunPod datasets were built
- RunPod jobs were submitted
- registry often showed `IN_PROGRESS` / `COMPLETED`

But the end of the chain remained weak:

1. adoption/version truth still depended on one shared:
   - `~/magatama-llm/fine-tuning/last_run.json`
2. multiple lanes could therefore overwrite the same success marker
3. the modal could degrade late-stream adoption failures into a generic `network error`
4. the user requirement was stricter:
   - training pool -> RunPod -> artifact -> local import -> version bump -> active alias switch
   - all fully automatic

## Code changes made locally

### 1. Lane-specific last-run metadata

File:

- `magatama/packages/fine-tuner/training_api.py`

Added:

- `lane_last_run_file(lane)`

Resulting files:

- `~/magatama-llm/fine-tuning/magatamallm-last_run.json`
- `~/magatama-llm/fine-tuning/fo_blogllm-last_run.json`
- `~/magatama-llm/fine-tuning/tip_llm-last_run.json`

Compatibility:

- `magatamallm` still mirrors to legacy:
  - `~/magatama-llm/fine-tuning/last_run.json`

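A minimal sketch of how the path helper and the legacy mirror could fit together. Only the name `lane_last_run_file(lane)` and the file paths are taken from the notes; the `base_dir` parameter, the `write_last_run` helper, and the report shape are assumptions made for illustration and may not match `training_api.py`.

```python
import json
from pathlib import Path

# Assumption: base directory taken from the paths listed above.
FINE_TUNING_DIR = Path.home() / "magatama-llm" / "fine-tuning"
LEGACY_MIRROR_LANE = "magatamallm"


def lane_last_run_file(lane: str, base_dir: Path = FINE_TUNING_DIR) -> Path:
    """Return the lane-specific last-run path, e.g. tip_llm-last_run.json."""
    return base_dir / f"{lane}-last_run.json"


def write_last_run(lane: str, report: dict, base_dir: Path = FINE_TUNING_DIR) -> list[Path]:
    """Write the lane file; mirror to legacy last_run.json for magatamallm only.

    Hypothetical helper: the real write path is not shown in the notes.
    """
    base_dir.mkdir(parents=True, exist_ok=True)
    targets = [lane_last_run_file(lane, base_dir)]
    if lane == LEGACY_MIRROR_LANE:
        targets.append(base_dir / "last_run.json")
    for path in targets:
        path.write_text(json.dumps(report, indent=2))
    return targets
```

Because every lane writes to its own file, a `tip_llm` run can no longer overwrite the success marker of `magatamallm`.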
### 2. Automatic release alias / version step

File:

- `magatama/packages/fine-tuner/training_api.py`

Added:

- `next_release_metadata(lane, active_model)`
- release alias creation

New adoption sequence:

1. RunPod artifact imported to candidate model
2. candidate smoke tests pass
3. release alias is created:
   - example shape: `<active-alias>-rN`
4. stable active alias is repointed to that release alias

This means the lane now receives a concrete new release/version marker after successful adoption.

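One way the `<active-alias>-rN` shape could be computed, shown as a hedged sketch. The function name and the `version_counter` / `release_alias` keys come from the notes; the extra `previous_counter` parameter, the plain integer counter, and the decision to drop any `:tag` suffix before appending `-rN` are assumptions, not confirmed behavior.

```python
def next_release_metadata(lane: str, active_model: str, previous_counter: int = 0) -> dict:
    """Compute the next release alias of the shape <active-alias>-rN.

    Assumptions: the counter is a simple per-lane integer, and any
    ':tag' suffix on the active model name is dropped before '-rN'
    is appended. Neither detail is confirmed by the notes.
    """
    version_counter = previous_counter + 1
    alias_base = active_model.split(":", 1)[0]
    return {
        "lane": lane,
        "version_counter": version_counter,
        "release_alias": f"{alias_base}-r{version_counter}",
    }


meta = next_release_metadata("magatamallm", "magatama-coder:latest", previous_counter=2)
print(meta["release_alias"])  # magatama-coder-r3
```

The stable active alias would then be repointed to `meta["release_alias"]` only after the candidate has passed its smoke tests.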
### 3. Dashboard lane status truth

File:

- `magatama/packages/dashboard/src/server.ts`

Changed:

- `/api/llm/status` now reads lane-specific last-run metadata first
- `release_alias` is preferred as visible model version
- this prevents one lane from falsely inheriting another lane's last successful run marker

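The lookup order can be sketched as follows, expressed here in Python rather than the TypeScript of `server.ts`. The helper names and the rule that only `magatamallm` may fall back to the legacy shared file are assumptions derived from the compatibility note, not a transcription of the handler.

```python
import json
from pathlib import Path
from typing import Optional


def read_json(path: Path) -> Optional[dict]:
    """Return parsed JSON, or None if the file is missing or invalid."""
    try:
        return json.loads(path.read_text())
    except (OSError, ValueError):
        return None


def visible_model_version(lane: str, base_dir: Path, fallback: str) -> str:
    """Prefer the lane file's release_alias; never read another lane's file."""
    last_run = read_json(base_dir / f"{lane}-last_run.json")
    if last_run is None and lane == "magatamallm":
        # Assumption: only magatamallm may use the legacy shared mirror.
        last_run = read_json(base_dir / "last_run.json")
    if last_run and last_run.get("release_alias"):
        return last_run["release_alias"]
    # No lane-specific proof of a release: report the caller's fallback.
    return fallback
```

With this ordering, a lane that has never completed an adoption shows its fallback version instead of inheriting another lane's marker.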
### 4. Truthful RunPod terminal failure messaging

Files:

- `magatama/packages/dashboard/src/server.ts`
- `magatama/packages/dashboard/public/index-v2.html`

Changed:

- if RunPod says `COMPLETED` but:
  - no model artifact exists
  - no HF repo appears
  - adoption fails

  the UI now reports that exact reason instead of collapsing into a vague generic failure

Frontend hardening:

- avoid showing a misleading late `network error` after the server already emitted a terminal training event
- if the stream dies without a terminal event, the modal says so explicitly

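The distinction between a generic training failure and a `COMPLETED`-but-not-adopted run can be sketched as a pure function. The three reason strings follow the cases listed above; the function name, its boolean inputs, and the order in which the checks run are assumptions for illustration.

```python
from typing import Optional


def terminal_reason(
    runpod_status: str,
    artifact_found: bool,
    hf_repo_found: bool,
    adoption_ok: bool,
) -> Optional[str]:
    """Return a specific failure reason, or None when the run truly succeeded.

    Assumption: the checks run in this order; the notes only list the cases.
    """
    if runpod_status != "COMPLETED":
        return f"training failed with RunPod status {runpod_status}"
    if not artifact_found:
        return "completed without artifact"
    if not hf_repo_found:
        return "completed without HF model"
    if not adoption_ok:
        return "completed but adoption failed"
    return None


print(terminal_reason("COMPLETED", True, True, False))  # completed but adoption failed
```

Returning a distinct string per case gives the modal an exact reason to display instead of a generic `network error`.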
### 5. Local training metrics future-proofed

File:

- `magatama/packages/fine-tuner/train.py`

Changed:

- metrics now also respect lane-specific last-run files via `TRAINING_LANE`

## Local verification

Passed:

- `python3 -m py_compile .../training_api.py .../train.py`
- `pnpm -C .../packages/dashboard build`

## Live deployment state

Not yet completed in this step.

Reason:

- direct Erik access failed during this block:
  - `ssh: connect to host 82.165.222.127 port 22: Connection refused`
  - later also `Operation timed out`

Therefore:

- the automation fix is locally ready
- but not yet verified live against the currently running:
  - `tip_llm`
  - `fo_blogllm`

## Operational next step

Once Erik SSH is reachable again:

1. deploy updated:
   - `training_api.py`
   - `train.py`
   - dashboard build / server bundle
2. restart:
   - `magatama-dashboard`
   - Mac-side training API if used
3. verify lane-specific status:
   - `tip_llm`
   - `fo_blogllm`
   - `magatamallm`
4. verify that a successful RunPod training now results in:
   - artifact found
   - adoption report present
   - lane-specific `*-last_run.json`
   - release alias incremented
   - stable alias repointed

@@ -1,94 +0,0 @@
# 2026-05-07 – MagatamaLLM Local Training Verification

## Question

Did the recent local / MAGATAMA-side MagatamaLLM training actually succeed and increase the active model’s knowledge?

## Answer

No. The dataset refresh succeeded, but a newer locally adopted MagatamaLLM model was **not** verified.

## Evidence

### 1. Public MAGATAMA status

`GET https://magatama.fichtmueller.org/api/llm/status`

Observed:
- `activeProvider = ollama:magatama-coder:latest`
- `autoFixProvider = ollama:magatama-coder:latest`
- `training.lastTrainingAt = 2026-05-06T22:43:20Z`
- `training.modelVersion = magatama-coder:latest`
- `training.activeRun = null`

Interpretation:
- the dashboard timestamp reflects the latest dataset/training-state update
- it does **not** prove that a new local model was imported and activated

### 2. Local Ollama state on the Mac

`ollama list`

Relevant entries:
- `magatama-coder:latest` → modified `3 weeks ago`
- `magatama-llm-v2-0:latest` → modified `11 days ago`

Interpretation:
- no newly imported Magatama candidate/adopted model is visible locally
- the active alias still points to an older model image

### 3. Dataset/lane export did work

Fresh Erik manifest exists:
- `/opt/magatama/training-data/runpod/magatamallm/manifest.json`

Observed:
- `generatedAt = 2026-05-06T22:45:15.944Z`
- `train = 15679`
- `eval = 1743`
- `total = 17422`

Interpretation:
- the lane export / pool sync is healthy
- training input exists and was rebuilt

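The observed counts are internally consistent: 15679 + 1743 = 17422. A small sanity check along these lines could guard future exports. The function name is hypothetical, and the flat key layout (`train`, `eval`, `total` at the top level) is an assumption, since the notes show only the values and not the JSON nesting of `manifest.json`.

```python
def counts_consistent(manifest: dict) -> bool:
    """True when train + eval equals the reported total.

    Assumption: the three counts are top-level integer keys.
    """
    return manifest["train"] + manifest["eval"] == manifest["total"]


observed = {
    "generatedAt": "2026-05-06T22:45:15.944Z",
    "train": 15679,
    "eval": 1743,
    "total": 17422,
}
print(counts_consistent(observed))  # True
```

A passing check only confirms that the export is coherent; it says nothing about whether a model trained on it was adopted.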
### 4. Adoption/registry proof is missing

On Erik, these expected local state files were absent:
- `/opt/magatama/training-data/model-registry/models.json`
- `/opt/magatama/training-data/model-registry/runs.json`
- `/opt/magatama/training-data/model-registry/active.json`
- `/opt/magatama/data/llm-status.json`

Interpretation:
- no trustworthy proof that a new model artifact was imported, registered, and activated

### 5. Historical run records still show failed/non-adopted outcomes

Local `training-data/model-registry/training-runs.json` still contains recent `magatamallm` runs such as:
- `submitted`
- `not_found_after_submit`

There is still no verified “completed_and_adopted” proof for a new MagatamaLLM local model.

## Conclusion

Current state:
- pool refresh works
- lane export works
- active alias/version switching after training is still not proven

Therefore:
- MagatamaLLM did **not** yet gain a verified newer local knowledge state from the recent run attempts
- MAGATAMA is still operating on the older active alias `magatama-coder:latest`

## Next Required Fix

The remaining training-automation gap is still:

1. run completes
2. artifact existence is verified
3. artifact is adopted/imported locally
4. smoke tests pass
5. active alias + model version are updated
6. only then mark training as successful

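The six steps above can be sketched as a gate that reports success only after promotion. This is an illustrative skeleton, not the code in `training_api.py`: `finalize_run` and its callable parameters are hypothetical stand-ins for the real artifact check, import, smoke test, and alias switch.

```python
from typing import Callable, Optional


def finalize_run(
    runpod_status: str,
    artifact_exists: Callable[[], bool],
    import_candidate: Callable[[], Optional[str]],
    smoke_test: Callable[[str], bool],
    promote: Callable[[str], None],
) -> tuple[bool, str]:
    """Walk the six steps in order; report success only after promotion."""
    if runpod_status != "COMPLETED":        # 1. run completes
        return False, "run did not complete"
    if not artifact_exists():               # 2. artifact existence verified
        return False, "completed without artifact"
    candidate = import_candidate()          # 3. artifact imported locally
    if candidate is None:
        return False, "completed but adoption failed"
    if not smoke_test(candidate):           # 4. smoke tests pass
        return False, "candidate failed smoke tests"
    promote(candidate)                      # 5. alias + version updated
    return True, f"adopted {candidate}"     # 6. only now successful


promoted: list[str] = []
ok, reason = finalize_run(
    "COMPLETED",
    artifact_exists=lambda: True,
    import_candidate=lambda: "candidate-model",
    smoke_test=lambda name: False,  # simulate a failing smoke test
    promote=promoted.append,
)
print(ok, reason, promoted)  # False candidate failed smoke tests []
```

Because `promote` is the last call before the success return, a failed import or smoke test can never leave the active alias pointing at an unverified model.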