# 2026-05-09 — MAGATAMA live Atlas fallback and lane registry fix
## Summary
Two remaining truthfulness gaps were closed:
1. Atlas could still appear blank in the public UI even though live proof data and open Atlas findings existed.
2. The per-lane training registry on Erik still exposed `null` metadata, even though lane manifests and training-run state existed.
## Atlas
### Live truth after fix
- Public `protection-proof` now shows non-zero Atlas state again:
  - `knownAssets: 57`
  - `hostsWithTelemetry: 22`
  - `assetsWithoutTelemetry: 35`
  - `auditedHosts: 3`
  - `queueBlocked: 28`
- Public findings API again shows open `atlas-coverage-gap` findings.
### Technical fix
- `packages/dashboard/public/index-v2.html`
  - added `deriveProofFromAtlasSnapshot(...)`
  - if the live/cached proof is empty or stale while the Atlas snapshot is still useful, the UI now synthesizes a fallback proof model from the snapshot
  - result: Atlas top cards and sections no longer render as misleading blanks
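
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in `index-v2.html`: only the name `deriveProofFromAtlasSnapshot` and the five counter names come from this note. The `AtlasSnapshot` shape, the `selectProof` helper, the `synthesized` flag, and the 15-minute staleness threshold are all assumptions made for the example.

```typescript
// Counter names mirror the ones listed in this note; everything else is hypothetical.
interface ProofModel {
  knownAssets: number;
  hostsWithTelemetry: number;
  assetsWithoutTelemetry: number;
  auditedHosts: number;
  queueBlocked: number;
  generatedAt: number;   // epoch milliseconds (assumed field)
  synthesized?: boolean; // assumed marker for a snapshot-derived proof
}

// Assumed snapshot layout: one record per known asset.
interface AtlasAsset {
  hasTelemetry: boolean;
  audited: boolean;
}

interface AtlasSnapshot {
  assets: AtlasAsset[];
  queueBlocked: number;
  capturedAt: number; // epoch milliseconds
}

// Assumed threshold; the real UI may use a different rule.
const MAX_PROOF_AGE_MS = 15 * 60 * 1000;

function isEmptyOrStale(proof: ProofModel | null, now: number): boolean {
  if (proof === null) return true;
  if (proof.knownAssets === 0) return true;
  return now - proof.generatedAt > MAX_PROOF_AGE_MS;
}

// Synthesizes a proof model purely from the snapshot's asset records.
function deriveProofFromAtlasSnapshot(snapshot: AtlasSnapshot): ProofModel {
  const withTelemetry = snapshot.assets.filter((a) => a.hasTelemetry).length;
  return {
    knownAssets: snapshot.assets.length,
    hostsWithTelemetry: withTelemetry,
    assetsWithoutTelemetry: snapshot.assets.length - withTelemetry,
    auditedHosts: snapshot.assets.filter((a) => a.audited).length,
    queueBlocked: snapshot.queueBlocked,
    generatedAt: snapshot.capturedAt,
    synthesized: true,
  };
}

// Prefers the live/cached proof; falls back only when it is empty or stale
// and the snapshot still carries usable asset data.
function selectProof(
  proof: ProofModel | null,
  snapshot: AtlasSnapshot | null,
  now: number,
): ProofModel | null {
  if (!isEmptyOrStale(proof, now)) return proof;
  if (snapshot !== null && snapshot.assets.length > 0) {
    return deriveProofFromAtlasSnapshot(snapshot);
  }
  return proof;
}
```

Marking the derived model (here with `synthesized`) would let the UI distinguish fallback values from a fresh proof, which seems in keeping with the truthfulness goal of this change.
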
## Lane registry
### Technical fix
- `scripts/model_registry_build.ts`
  - now also reads:
    - `training-data/model-registry/training-runs.json`
    - `training-data/runpod/<lane>/manifest.json`
  - compiled lane output now includes:
    - `activeModel`
    - `version`
    - `lastTrainingAt`
    - `lastRunId`
    - `lastRunStatus`
    - `datasetSource`
    - `collectionsPath`
    - lane runpod counts
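
As a rough sketch of how the two inputs could be merged into one lane entry: the output field names below are the ones listed above, but the input shapes (`TrainingRun`, `LaneManifest`), the `buildLaneEntry` helper, and the rule that `collectionsPath` points at the manifest file are assumptions for illustration. The real schemas of `training-runs.json` and `manifest.json` are not shown in this note, and the lane runpod counts are left out.

```typescript
// Assumed shape of one entry in training-runs.json.
interface TrainingRun {
  runId: string;
  lane: string;
  status: string;
  finishedAt: string; // ISO 8601 UTC timestamp (assumed), so string order equals time order
}

// Assumed shape of a per-lane manifest.json.
interface LaneManifest {
  activeModel?: string;
  version?: string;
  datasetSource?: string;
}

// Output fields as named in this note; null means "genuinely unknown".
interface LaneEntry {
  activeModel: string | null;
  version: string | null;
  lastTrainingAt: string | null;
  lastRunId: string | null;
  lastRunStatus: string | null;
  datasetSource: string | null;
  collectionsPath: string | null;
}

// Hypothetical helper: combines manifest data and run history for one lane.
function buildLaneEntry(
  lane: string,
  manifest: LaneManifest | null,
  manifestPath: string,
  runs: TrainingRun[],
): LaneEntry {
  // filter() returns a new array, so sorting does not mutate the caller's list.
  const laneRuns = runs
    .filter((r) => r.lane === lane)
    .sort((a, b) => a.finishedAt.localeCompare(b.finishedAt));
  const last = laneRuns.length > 0 ? laneRuns[laneRuns.length - 1] : null;

  return {
    activeModel: manifest?.activeModel ?? null,
    version: manifest?.version ?? null,
    lastTrainingAt: last ? last.finishedAt : null,
    lastRunId: last ? last.runId : null,
    lastRunStatus: last ? last.status : null,
    datasetSource: manifest?.datasetSource ?? null,
    collectionsPath: manifest !== null ? manifestPath : null,
  };
}
```

With this structure a field is `null` only when neither source provides it, rather than because a source was never read.
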
### Live Erik verification
- `magatamallm`
  - `activeModel: magatama-coder:latest`
  - `lastRunStatus: completed_without_model_artifact`
  - `datasetSource: url`
  - `collectionsPath: /opt/magatama/training-data/runpod/magatamallm/manifest.json`
- `fo_blogllm`
  - `activeModel: fo-blog-v7`
  - `lastRunStatus: completed_without_model_artifact`
- `tip_llm`
  - `activeModel: tip-llm-v1`
  - `lastRunStatus: completed_without_model_artifact`
## Training reality
The managed RunPod Axolotl endpoint still does not return an adoptable model artifact. MAGATAMA is now honest about that:
- jobs can reach `COMPLETED`
- but the resulting lane registry and run registry record:
  - `completed_without_model_artifact`
- therefore:
  - no version bump is treated as successful
  - no Ollama alias switch is performed
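
The decision rule above can be expressed as a small pure function. This is a sketch under stated assumptions: `COMPLETED` and `completed_without_model_artifact` come from this note, while the `JobResult` shape, the `artifactUrl` field, the `classifyRun` name, and the other two status strings are hypothetical.

```typescript
// Only "completed_without_model_artifact" is taken from this note;
// "completed" and "not_completed" are placeholder names for the example.
type RunStatus = "completed" | "completed_without_model_artifact" | "not_completed";

// Assumed minimal view of what the training endpoint reports back.
interface JobResult {
  state: string;               // e.g. "COMPLETED"
  artifactUrl?: string | null; // hypothetical field: where an adoptable model would be
}

interface AdoptionDecision {
  status: RunStatus;
  bumpVersion: boolean;
  switchAlias: boolean;
}

// A job only counts as adoptable when it both finished and returned an artifact.
function classifyRun(job: JobResult): AdoptionDecision {
  if (job.state !== "COMPLETED") {
    return { status: "not_completed", bumpVersion: false, switchAlias: false };
  }
  const hasArtifact = typeof job.artifactUrl === "string" && job.artifactUrl.length > 0;
  if (!hasArtifact) {
    // Finished, but nothing to adopt: record it honestly and change nothing.
    return { status: "completed_without_model_artifact", bumpVersion: false, switchAlias: false };
  }
  return { status: "completed", bumpVersion: true, switchAlias: true };
}
```

Keeping the version bump and the alias switch tied to the same artifact check means neither can happen for a run that produced no model.
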
## Meaning
The frontend is now much harder to fool:
- Atlas no longer looks empty when only the proof route is stale
- lane registry metadata no longer collapses to all-`null`
- training state remains explicit about the real remaining blocker:
  - custom RunPod worker still required for fully automatic artifact return and adoption