sync: record magatama all-lane training completion
parent 0599991431
commit cf30735ef1
@ -1,9 +1,36 @@
# Current TIP Sync State

Updated: 2026-05-10 02:58 UTC (previously 2026-05-09 23:38 UTC)

## Newest Work

- MAGATAMA all-lane RunPod training completion on 2026-05-10:
  - RunPod training/adoption is now verified end-to-end for all five active MAGATAMA LLM lanes:
    - `magatamallm`: active `magatama-coder:latest`, model version `magatama-coder-r2`, dataset `1375 train / 153 eval / 1528 total`
    - `fo_blogllm`: active `fo-blog-v8`, model version `fo-blog-v8-r2`, dataset `17342 train / 1929 eval / 19271 total`
    - `tip_llm`: active `tip-llm-v2`, model version `tip-llm-v2-r2`, dataset `276 train / 31 eval / 307 total`
    - `pulso_llm`: active `pulso-llm-v1`, model version `pulso-llm-v1-r1`, dataset `28 train / 5 eval / 33 total`
    - `contact_llm`: active `contact-llm-v1`, model version `contact-llm-v1-r1`, dataset `18 train / 4 eval / 22 total`
  - strict adoption rule is now validated in production:
    - RunPod `COMPLETED` alone does not count as success
    - success requires an uploaded adapter artifact, local Mac adoption, Ollama model registration, smoke tests, a registry write, a dashboard registry rebuild and an active alias switch
  - fixed/verified automation behavior:
    - local Mac adoption service exposes authenticated adoption reports per lane via `/adoption-report/{lane}`
    - dashboard adoption path can recover from transient network/fetch errors by reading the local adoption report
    - reconciler can adopt already-completed RunPod jobs when the live SSE path failed after artifact upload
    - registry events now include top-level `active_model`, `release_alias`, `model_version`, `version_counter` and `candidate_model`
  - resolved concrete failures:
    - `pulso_llm` training had succeeded, but an outdated local lane mapping caused `unknown lane: pulso_llm`; Pulso is now adopted and active
    - `tip_llm` training succeeded, but local adoption failed due to low Mac disk space before GGUF conversion; obsolete Ollama versions that were safe to delete and already imported intermediate GGUFs were removed, then TIP was reconciled successfully
    - `contact_llm` was still `neverTrained`; it is now trained, adopted and active
  - ContactLLM smoke test result:
    - `4/5` checks passed
    - remaining improvement: provenance prompt should always include source URL, timestamp, confidence and contact type; add this as a next training/eval item
  - public MAGATAMA `/api/llm/status?lane=...` checks after dashboard restart show all five lanes as `completed_and_adopted`
  - operational note:
    - keep enough free disk space on the Mac before another adoption; each new 7B adapter adoption needs workspace for the merge and the GGUF conversion
    - obsolete non-active Ollama versions can be removed after verifying that the active aliases and release aliases exist

- TIP price/source verification closure on 2026-05-10 local / 2026-05-09 UTC:
  - fixed SFPcables scraper to persist `product_page_url`
  - added product-page price fallback for SFPcables when listing pages omit price markup
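
The SFPcables price fallback noted above could look roughly like the following. This is a minimal sketch, not the scraper's actual code: `resolve_price` and the injected `fetch_product_price` callable are hypothetical names, and prices are treated as plain strings for illustration.

```python
from typing import Callable, Optional


def resolve_price(
    listing_price: Optional[str],
    product_page_url: Optional[str],
    fetch_product_price: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Prefer the listing-page price; fall back to the product page."""
    if listing_price:
        return listing_price
    if product_page_url:
        # The fallback is only possible when product_page_url was persisted.
        return fetch_product_price(product_page_url)
    # Neither a listing price nor a product page to consult.
    return None
```

Passing the fetcher in as a callable keeps the fallback decision testable without any network access.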

@ -0,0 +1,48 @
# MAGATAMA All-Lane RunPod Training Complete

Date: 2026-05-10 02:58 UTC

## Result

All five MAGATAMA trainable LLM lanes completed a real RunPod training/adoption cycle and are now visible as adopted in the public MAGATAMA status API.
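
A check of the public status API across all five lanes might be sketched as below. Only the `/api/llm/status?lane=...` path, the lane names and the `completed_and_adopted` value come from this handoff; the base URL passed in by the caller and the `status` key in the response payload are assumptions, since the response schema is not recorded here.

```python
from urllib.parse import urlencode

LANES = ["magatamallm", "fo_blogllm", "tip_llm", "pulso_llm", "contact_llm"]


def status_url(base_url: str, lane: str) -> str:
    """Build the status URL for one lane (base_url is supplied by the caller)."""
    return f"{base_url.rstrip('/')}/api/llm/status?{urlencode({'lane': lane})}"


def lanes_not_adopted(payloads: dict) -> list:
    """Return lanes whose payload does not report completed_and_adopted.

    `payloads` maps lane name to an already-parsed JSON response; the
    "status" key is a hypothetical field name.
    """
    return [
        lane
        for lane in LANES
        if payloads.get(lane, {}).get("status") != "completed_and_adopted"
    ]
```

An empty list from `lanes_not_adopted` would correspond to the result described above; a missing lane is treated the same as a lane that is not adopted.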

## Verified Lanes

- `magatamallm`: active `magatama-coder:latest`, model version `magatama-coder-r2`, `1375 train / 153 eval / 1528 total`
- `fo_blogllm`: active `fo-blog-v8`, model version `fo-blog-v8-r2`, `17342 train / 1929 eval / 19271 total`
- `tip_llm`: active `tip-llm-v2`, model version `tip-llm-v2-r2`, `276 train / 31 eval / 307 total`
- `pulso_llm`: active `pulso-llm-v1`, model version `pulso-llm-v1-r1`, `28 train / 5 eval / 33 total`
- `contact_llm`: active `contact-llm-v1`, model version `contact-llm-v1-r1`, `18 train / 4 eval / 22 total`
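
The dataset figures above can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers listed here; `check_splits` is an illustrative helper, not part of the MAGATAMA tooling.

```python
# (train, eval, total) per lane, copied from the list above.
DATASETS = {
    "magatamallm": (1375, 153, 1528),
    "fo_blogllm": (17342, 1929, 19271),
    "tip_llm": (276, 31, 307),
    "pulso_llm": (28, 5, 33),
    "contact_llm": (18, 4, 22),
}


def check_splits(datasets: dict) -> dict:
    """Verify train + eval == total and return the eval share per lane."""
    shares = {}
    for lane, (train, eval_count, total) in datasets.items():
        if train + eval_count != total:
            raise ValueError(f"{lane}: {train} + {eval_count} != {total}")
        shares[lane] = eval_count / total
    return shares
```

All five splits add up. The eval share is about 10% for the three larger lanes and about 15% and 18% for `pulso_llm` and `contact_llm`, whose eval sets contain only 5 and 4 examples.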

## Fixes Made

- Added/verified first-class local adoption support for `pulso_llm` and `contact_llm`.
- Added authenticated adoption-report recovery endpoint on the Mac training/adoption service.
- Hardened dashboard adoption flow so transient network/fetch errors can recover from local adoption reports.
- Hardened RunPod reconciler so completed jobs can be adopted after a failed live SSE/browser path.
- Registry success events now include explicit active model, release alias, model version, version counter and candidate model.
- Rebuilt the MAGATAMA model registry and restarted `magatama-dashboard` after successful TIP and Contact adoption.
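
The recovery behavior described in the fixes above can be illustrated as a small fallback wrapper. This is a sketch under assumptions: `run_live_adoption` and `read_adoption_report` are hypothetical callables standing in for the live dashboard path and for a read of the local adoption report, and the set of exceptions treated as transient is a guess.

```python
from typing import Callable, Optional

# Assumption: which errors count as transient is not specified in the handoff.
TRANSIENT_ERRORS = (ConnectionError, TimeoutError)


def adopt_with_recovery(
    lane: str,
    run_live_adoption: Callable[[str], dict],
    read_adoption_report: Callable[[str], Optional[dict]],
) -> dict:
    """Try the live adoption path; on a transient error, use the local report."""
    try:
        return run_live_adoption(lane)
    except TRANSIENT_ERRORS:
        report = read_adoption_report(lane)
        if report is None:
            # No local report to recover from, so surface the original error.
            raise
        return report
```

Re-raising when no report exists keeps a genuine adoption failure from being reported as a success.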

## Issues Resolved

- `pulso_llm` showed `unknown lane: pulso_llm` after RunPod finished; this was a local adoption mapping issue, not a training failure. Pulso is now active.
- `tip_llm` failed local adoption because Mac disk space dropped below the GGUF conversion threshold. Obsolete non-active Ollama versions and already imported intermediate GGUFs were removed, then TIP was reconciled successfully.
- `contact_llm` had never been trained before this block. It now has a first adopted version.
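
Because the `tip_llm` failure came down to free disk space, a pre-flight check before adoption may be worth having. The sketch below uses the standard-library `shutil.disk_usage`; the function name and the idea of passing the required workspace in bytes are assumptions, and the real conversion threshold is not recorded in this handoff.

```python
import shutil


def has_workspace(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has enough free space.

    `required_bytes` is whatever workspace the merge plus GGUF conversion
    needs; the caller has to supply that number.
    """
    return shutil.disk_usage(path).free >= required_bytes
```

Running this against the adoption working directory before starting a merge would turn a mid-adoption failure into an early, explicit refusal.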

## Evaluation Notes

- ContactLLM smoke test passed `4/5`.
- Open improvement: ContactLLM should consistently return provenance fields for public business contacts: source URL, timestamp, confidence and contact type.
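
The open provenance item could become an eval check along these lines. The four fields are the ones named above; the snake_case key names and the assumption that a ContactLLM answer is parsed into a dict are illustrative only.

```python
# Hypothetical key names for the four provenance fields named above.
PROVENANCE_FIELDS = ("source_url", "timestamp", "confidence", "contact_type")


def missing_provenance(record: dict) -> list:
    """Return the provenance fields that are absent or empty in a record."""
    return [
        field
        for field in PROVENANCE_FIELDS
        if record.get(field) in (None, "")
    ]
```

A confidence of `0.0` is treated as present, since only `None` and the empty string count as missing.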

## Operating Rule

Do not mark RunPod training successful on `COMPLETED` alone. A successful lane run must have:

- uploaded adapter artifact
- successful local Mac adoption
- Ollama candidate + release alias + active alias
- smoke tests meeting threshold
- registry entry with `completed_and_adopted`
- public MAGATAMA `/api/llm/status?lane=...` showing the new active model/version
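
The rule above amounts to an all-gates-must-pass check. A minimal sketch, assuming each requirement has already been reduced to a boolean by whatever performs the verification; `LaneRun` and its field names are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class LaneRun:
    runpod_completed: bool
    adapter_uploaded: bool
    local_adoption_ok: bool
    ollama_aliases_ok: bool  # candidate + release alias + active alias
    smoke_tests_ok: bool  # smoke tests met the threshold
    registry_completed_and_adopted: bool
    public_status_shows_new_model: bool


def failed_gates(run: LaneRun) -> list:
    """Return the names of all gates that did not pass."""
    return [f.name for f in fields(run) if not getattr(run, f.name)]


def is_successful(run: LaneRun) -> bool:
    """A run succeeds only when every gate passes, not on COMPLETED alone."""
    return not failed_gates(run)
```

Returning the failed gate names, rather than a bare boolean, makes it easier to see which step a reconciler would still need to repeat.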
No secrets, tokens or credentials are recorded in this handoff.