sync: record magatama atlas fallback and lane registry fix
parent 549b4430df
commit d588a20a54
@@ -54,8 +54,8 @@ Updated: 2026-05-09 07:34 UTC
 - HTML product-like rows: `626`
 - price verified: `626`
 - image verified: `622`
-- details verified: `624`
-- price+image+details verified: `620`
+- details verified: `626`
+- price+image+details verified: `622`
 - fully verified: `620`
 - filter/category rows with no verification: `108`
 - other non-product/generic rows with no verification: `10`
@@ -74,9 +74,9 @@ Updated: 2026-05-09 07:34 UTC
 - remaining truth:
   - active/product-like Flexoptix rows are much closer to complete
   - not all `744` Flexoptix rows can honestly be 100% verified because `118` are filter/category/generic/non-product URLs rather than concrete product pages
-  - remaining HTML product-like gaps observed before SSH became unavailable:
-    - `4` product-like rows without image verification
-    - `2` FLEXBOX/accessory-like rows without reach/details
+  - remaining HTML product-like gaps after final source check:
+    - `4` product-like rows without image verification because Flexoptix exposes only `placeholder-flexoptix.jpg` as `og:image`
+    - `2` FLEXBOX/accessory-like rows were classified as `Accessory`, `reach_label=N/A`, `details_verified=true`
 - operational note:
   - Erik SSH became unavailable with `connection refused` after the last verification checks
   - public TIP HTTPS still responded through Cloudflare
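The row counts quoted in these two hunks can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check on the figures as stated; the object shape and field names are illustrative and are not taken from the repository.

```typescript
// Figures copied from the notes; the object shape is hypothetical.
const flexoptix = {
  totalRows: 744,
  productLikeRows: 626,
  filterCategoryRows: 108,
  otherNonProductRows: 10,
  imageVerified: 622,
};

// Rows that are not concrete product pages.
const nonProductRows =
  flexoptix.filterCategoryRows + flexoptix.otherNonProductRows;

// Product-like rows plus non-product rows should account for every row.
const accountedRows = flexoptix.productLikeRows + nonProductRows;

// Product-like rows still missing a verified image.
const imageGaps = flexoptix.productLikeRows - flexoptix.imageVerified;

console.log({ nonProductRows, accountedRows, imageGaps });
```

With the stated inputs, the non-product rows sum to the `118` quoted in the notes, the two groups together cover all `744` rows, and the image gap matches the `4` rows that only expose a placeholder image.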
@@ -1314,6 +1314,74 @@ There are existing uncommitted changes outside `sync/`. Some are Codex work from
 
 - `6c42ca7 docs: add shared agent sync handoff`
 - `8e7c5aa docs: link llm-gateway sync handoff`
+- `bba48d3 sync: record magatama atlas rematerialization fix`
+- `fd29bee sync: record magatama atlas fallback and port detail live fixes`
+- `8b42077 sync: refresh cross-agent chat handoff`
 - Pending after this update:
   - watch whether any future guard exposure findings are genuine operational issues or new false positives.
   - if failures still appear inside `fixes.jsonl`, scrub historic pollution and backfill `errors.jsonl`.
+
+## 2026-05-09 Addendum — Live Atlas + Lane Registry Truth
+
+### Atlas / Findings
+
+- MAGATAMA Atlas was not actually empty; the public UI could still look blank while live proof data already showed:
+  - `knownAssets: 57`
+  - `hostsWithTelemetry: 22`
+  - `assetsWithoutTelemetry: 35`
+  - `auditedHosts: 3`
+  - `queueBlocked: 28`
+- Root causes fixed live:
+  1. `packages/core/src/routes/health-builders.ts`
+     - Atlas audits / exposure now rematerialize operational findings before proof rendering.
+  2. `packages/core/src/scheduler.ts`
+     - generic stale auto-resolve no longer auto-closes:
+       - `atlas-coverage-gap`
+       - `atlas-exposure`
+       - `atlas-host-audit`
+  3. `packages/dashboard/public/index-v2.html`
+     - if proof data is temporarily empty or stale, Atlas now derives a fallback proof model from the current snapshot so the top cards do not render as blank.
+- Live public verification after deploy:
+  - `/api/protection-proof` shows non-zero Atlas truth again.
+  - `/api/findings?limit=10` shows open `atlas-coverage-gap` findings again.
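One way to picture the `scheduler.ts` change is an exemption set that the generic stale auto-resolve pass consults before closing anything. This is a minimal sketch under assumptions: the `Finding` shape, the `staleAfterMs` threshold, and the function name `shouldAutoResolve` are hypothetical; only the three category names come from the notes.

```typescript
// Hypothetical finding shape; the real type in packages/core may differ.
interface Finding {
  id: string;
  category: string;
  lastSeenAt: number; // epoch milliseconds
  status: "open" | "resolved";
}

// Categories named in the notes as exempt from generic stale auto-resolve.
const STALE_RESOLVE_EXEMPT: ReadonlySet<string> = new Set([
  "atlas-coverage-gap",
  "atlas-exposure",
  "atlas-host-audit",
]);

// Returns true only for open, stale findings outside the exempt categories.
function shouldAutoResolve(
  finding: Finding,
  now: number,
  staleAfterMs: number,
): boolean {
  if (finding.status !== "open") return false;
  if (STALE_RESOLVE_EXEMPT.has(finding.category)) return false;
  return now - finding.lastSeenAt > staleAfterMs;
}
```

Keeping the exemption as data rather than as scattered conditionals would make it easy to see which finding types must be closed by their own rematerialization logic instead of by age alone.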
+
+### Training / Lane Registry
+
+- The public training status is now honest for the current live state:
+  - `magatamallm`
+    - `datasetSource: url`
+    - `collectionsPath: /opt/magatama/training-data/runpod/magatamallm/manifest.json`
+    - `15679 train`
+    - `1743 eval`
+    - `17422 total`
+    - `lastRegistryRunStatus: completed_without_model_artifact`
+  - `fo_blogllm`
+    - lane registry rebuilt on Erik
+    - `lastRunStatus: completed_without_model_artifact`
+  - `tip_llm`
+    - lane registry rebuilt on Erik
+    - `lastRunStatus: completed_without_model_artifact`
+- `scripts/model_registry_build.ts` now compiles per-lane metadata from:
+  - lane datasets
+  - lane RunPod manifests
+  - `training-runs.json`
+- Live compiled registry on Erik now no longer sits at all-`null`; it exposes:
+  - `activeModel`
+  - `version`
+  - `lastRunId`
+  - `lastRunStatus`
+  - `datasetSource`
+  - `collectionsPath`
+
+### Still Outstanding
+
+- Full automatic training is still blocked by the managed RunPod Axolotl endpoint:
+  - jobs reach `COMPLETED`
+  - but no adoptable artifact is returned
+  - therefore MAGATAMA correctly records:
+    - `completed_without_model_artifact`
+- That means:
+  - no new model version can be truthfully activated yet
+  - no Ollama alias switch should happen yet
+- Remaining real blocker:
+  - move to `custom-magatama` RunPod worker with explicit adapter/model artifact publication.
@@ -0,0 +1,79 @@
+# 2026-05-09 — MAGATAMA live Atlas fallback and lane registry fix
+
+## Summary
+
+Two remaining truthfulness gaps were closed:
+
+1. Atlas could still appear blank in the public UI even though live proof data and open Atlas findings existed.
+2. The per-lane training registry on Erik still exposed `null` metadata, even though lane manifests and training-run state existed.
+
+## Atlas
+
+### Live truth after fix
+
+- Public `protection-proof` now shows non-zero Atlas state again:
+  - `knownAssets: 57`
+  - `hostsWithTelemetry: 22`
+  - `assetsWithoutTelemetry: 35`
+  - `auditedHosts: 3`
+  - `queueBlocked: 28`
+- Public findings API again shows open `atlas-coverage-gap` findings.
+
+### Technical fix
+
+- `packages/dashboard/public/index-v2.html`
+  - added `deriveProofFromAtlasSnapshot(...)`
+  - if the live/cached proof is empty or stale while the Atlas snapshot is still useful, the UI now synthesizes a fallback proof model from the snapshot
+  - result: Atlas top cards and sections no longer render as misleading blanks
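The fallback idea can be illustrated with a small, self-contained sketch. It is not the actual `deriveProofFromAtlasSnapshot(...)`: the snapshot shape, the helper names `deriveFallbackProof` and `proofForRendering`, and the rule that treats every asset as a host are assumptions made for illustration. Only the five proof field names come from the notes, and staleness detection is left out for brevity.

```typescript
// Proof fields as listed in the notes; everything else here is assumed.
interface AtlasProof {
  knownAssets: number;
  hostsWithTelemetry: number;
  assetsWithoutTelemetry: number;
  auditedHosts: number;
  queueBlocked: number;
}

// Hypothetical snapshot shape used only for this illustration.
interface AtlasSnapshot {
  assets: { hasTelemetry: boolean; audited: boolean }[];
  queueBlocked: number;
}

// A proof counts as empty when it is missing or reports no known assets.
function isEmptyProof(proof: AtlasProof | null): boolean {
  return proof === null || proof.knownAssets === 0;
}

// Build a fallback proof by counting over the snapshot.
function deriveFallbackProof(snapshot: AtlasSnapshot): AtlasProof {
  const withTelemetry = snapshot.assets.filter((a) => a.hasTelemetry).length;
  return {
    knownAssets: snapshot.assets.length,
    hostsWithTelemetry: withTelemetry,
    assetsWithoutTelemetry: snapshot.assets.length - withTelemetry,
    auditedHosts: snapshot.assets.filter((a) => a.audited).length,
    queueBlocked: snapshot.queueBlocked,
  };
}

// Prefer the live proof; use the snapshot only when the proof is empty.
function proofForRendering(
  proof: AtlasProof | null,
  snapshot: AtlasSnapshot | null,
): AtlasProof | null {
  if (!isEmptyProof(proof)) return proof;
  if (snapshot !== null && snapshot.assets.length > 0) {
    return deriveFallbackProof(snapshot);
  }
  return proof;
}
```

The live proof stays authoritative whenever it has content, so the fallback can only fill in blanks and never overrides fresher data.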
+
+## Lane registry
+
+### Technical fix
+
+- `scripts/model_registry_build.ts`
+  - now also reads:
+    - `training-data/model-registry/training-runs.json`
+    - `training-data/runpod/<lane>/manifest.json`
+  - compiled lane output now includes:
+    - `activeModel`
+    - `version`
+    - `lastTrainingAt`
+    - `lastRunId`
+    - `lastRunStatus`
+    - `datasetSource`
+    - `collectionsPath`
+    - lane runpod counts
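A rough sketch of how such a per-lane entry might be assembled from the two inputs is shown here. The input shapes (`TrainingRun`, `LaneManifest`), the `compileLane` function, and the choice to take `lastTrainingAt` from the latest run's finish time are all assumptions; the real `scripts/model_registry_build.ts` and the JSON layouts may differ. Only the output field names follow the list in the notes.

```typescript
// Hypothetical input shapes; the real JSON files may be structured differently.
interface TrainingRun {
  runId: string;
  lane: string;
  status: string;
  finishedAt: string; // ISO 8601 timestamp, same format for all runs
}

interface LaneManifest {
  datasetSource: string;
  train: number;
  eval: number;
}

// Output fields follow the list in the notes.
interface LaneEntry {
  activeModel: string | null;
  version: string | null;
  lastTrainingAt: string | null;
  lastRunId: string | null;
  lastRunStatus: string | null;
  datasetSource: string | null;
  collectionsPath: string | null;
  counts: { train: number; eval: number; total: number } | null;
}

function compileLane(
  lane: string,
  runs: TrainingRun[],
  manifest: LaneManifest | null,
  manifestPath: string | null,
  active: { model: string; version: string } | null,
): LaneEntry {
  // Most recent run for this lane; filter() copies, so the input is not mutated.
  const latest: TrainingRun | undefined = runs
    .filter((r) => r.lane === lane)
    .sort((a, b) =>
      a.finishedAt < b.finishedAt ? 1 : a.finishedAt > b.finishedAt ? -1 : 0,
    )[0];
  return {
    activeModel: active ? active.model : null,
    version: active ? active.version : null,
    lastTrainingAt: latest ? latest.finishedAt : null,
    lastRunId: latest ? latest.runId : null,
    lastRunStatus: latest ? latest.status : null,
    datasetSource: manifest ? manifest.datasetSource : null,
    collectionsPath: manifest ? manifestPath : null,
    counts: manifest
      ? { train: manifest.train, eval: manifest.eval, total: manifest.train + manifest.eval }
      : null,
  };
}
```

Each field falls back to `null` independently, so a lane with a manifest but no recorded run still exposes its dataset metadata instead of collapsing to an all-`null` entry.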
+
+### Live Erik verification
+
+- `magatamallm`
+  - `activeModel: magatama-coder:latest`
+  - `lastRunStatus: completed_without_model_artifact`
+  - `datasetSource: url`
+  - `collectionsPath: /opt/magatama/training-data/runpod/magatamallm/manifest.json`
+- `fo_blogllm`
+  - `activeModel: fo-blog-v7`
+  - `lastRunStatus: completed_without_model_artifact`
+- `tip_llm`
+  - `activeModel: tip-llm-v1`
+  - `lastRunStatus: completed_without_model_artifact`
+
+## Training reality
+
+The managed RunPod Axolotl endpoint still does not return an adoptable model artifact. MAGATAMA is now honest about that:
+
+- jobs can reach `COMPLETED`
+- but the resulting lane registry and run registry record:
+  - `completed_without_model_artifact`
+- therefore:
+  - no version bump is treated as successful
+  - no Ollama alias switch is performed
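The rule described in this section, where a finished job only counts as a success if it returned an artifact, can be sketched as a small classifier. The `JobResult` shape, the `artifactUrl` field, and the `failed` and `running` outcomes are assumptions for illustration; only `COMPLETED` and `completed_without_model_artifact` appear in the notes.

```typescript
// Hypothetical job result shape; `artifactUrl` is an assumed field name.
interface JobResult {
  status: "COMPLETED" | "FAILED" | "IN_PROGRESS";
  artifactUrl?: string;
}

type RunOutcome =
  | "completed"
  | "completed_without_model_artifact"
  | "failed"
  | "running";

// Map a raw job result onto the status recorded in the registry.
function classifyRun(job: JobResult): RunOutcome {
  if (job.status === "FAILED") return "failed";
  if (job.status === "IN_PROGRESS") return "running";
  return job.artifactUrl ? "completed" : "completed_without_model_artifact";
}

// Only a run that returned an artifact may bump the version or switch the alias.
function mayActivate(outcome: RunOutcome): boolean {
  return outcome === "completed";
}
```

Separating classification from the activation decision keeps the registry honest: the status records what happened, and the version bump or alias switch is gated on that status rather than on the endpoint's own `COMPLETED` flag.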
+
+## Meaning
+
+The frontend is now much harder to fool:
+
+- Atlas no longer looks empty when only the proof route is stale
+- lane registry metadata no longer collapses to all-`null`
+- training state remains explicit about the real remaining blocker:
+  - custom RunPod worker still required for fully automatic artifact return and adoption