MAGATAMA Multi-LLM Training Lanes
Date: 2026-05-09
Decision
MAGATAMA now treats the core specialized models as separate training lanes with separate behavior pools:
- magatamallm: MAGATAMA operations, cybersecurity, AI security, infrastructure security, resolver/fix workflows.
- fo_blogllm: Rene/Flexoptix-style blog writing, technical storytelling, market/blog structure.
- tip_llm: crawler/scraper/robot planning, source discovery, parser/selectors, switch/transceiver issue research.
- pulso_llm: Flexoptix product/support/diagnostic lane for switches, transceivers, compatibility, product fit and offers.
- contact_llm: structured, lawful contact discovery/research with attribution.
TIP_LLM and PulsoLLM share a network/transceiver/switch knowledge core, but their instruction and behavior pools stay separate.
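The split between a shared knowledge core and lane-private behavior pools can be pictured with a small data structure. This is only a sketch: `LaneConfig`, `behavior_pool`, `knowledge_cores`, the `pools/...` paths and the `network-transceiver-switch` label are hypothetical names chosen for illustration, not the actual MAGATAMA schema.

```python
from dataclasses import dataclass

# All names below are illustrative assumptions, not the real MAGATAMA config.
NETWORK_CORE = "network-transceiver-switch"  # assumed label for the shared core


@dataclass(frozen=True)
class LaneConfig:
    name: str
    behavior_pool: str                      # lane-private instruction/behavior rows
    knowledge_cores: tuple[str, ...] = ()   # shared reference material, if any


LANES = {
    "magatamallm": LaneConfig("magatamallm", "pools/magatamallm"),
    "fo_blogllm": LaneConfig("fo_blogllm", "pools/fo_blogllm"),
    "tip_llm": LaneConfig("tip_llm", "pools/tip_llm", (NETWORK_CORE,)),
    "pulso_llm": LaneConfig("pulso_llm", "pools/pulso_llm", (NETWORK_CORE,)),
    "contact_llm": LaneConfig("contact_llm", "pools/contact_llm"),
}


def shared_cores(lane_a: str, lane_b: str) -> set[str]:
    """Knowledge cores that two lanes have in common."""
    return set(LANES[lane_a].knowledge_cores) & set(LANES[lane_b].knowledge_cores)


def pools_are_separate() -> bool:
    """True when no two lanes point at the same behavior pool."""
    pools = [lane.behavior_pool for lane in LANES.values()]
    return len(pools) == len(set(pools))
```

Under these assumptions, `shared_cores("tip_llm", "pulso_llm")` returns the one shared core while `pools_are_separate()` stays true, which is the invariant the decision describes.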
Implemented In MAGATAMA
- Added pulso_llm and contact_llm to:
  - RunPod dataset builder
  - Gitea training pool sync
  - model registry build
  - RunPod submit path
  - HuggingFace/RunPod dataset publishing config
  - Susan/NAS training scan
  - full training refresh
  - dashboard training API/status
  - training modal UI
  - fine-tuner lane config and smoke prompts
- Added new lane profiles:
  - PulsoLLM
  - ContactLLM
- Added source catalog:
  - training-data/model-registry/research-source-catalog-2026-05-09.json
  - training-data/model-registry/external-ingest/llm-lane-research-seeds-2026-05-09.jsonl
- Added Gitea-backed seed pools:
  - training-data/gitea-learning-pool/pulso_llm/
  - training-data/gitea-learning-pool/contact_llm/
Source Seeds Added
- CISA KEV / CISA Malcolm / CISA ScubaGear
- NVD CVE API
- MITRE ATT&CK STIX/TAXII
- OWASP LLM Top 10
- Microsoft PyRIT
- Microsoft Agent Governance Toolkit
- Cisco Transceiver Module Group matrix
- Juniper Hardware Compatibility Tool
- Arista transceiver/cable references
- Flexoptix product/support references
- RFC 9309 robots.txt
- schema.org ContactPoint
- RFC 6350 vCard
- PeeringDB API
- RIPE Database REST API
Verified Counts
RunPod lane exports rebuilt and deployed live on Erik:
- magatamallm: 1375 train, 153 eval, 1528 total
- fo_blogllm: 17342 train, 1929 eval, 19271 total
- tip_llm: 276 train, 31 eval, 307 total
- pulso_llm: 28 train, 5 eval, 33 total
- contact_llm: 18 train, 4 eval, 22 total
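The counts above can be checked mechanically. The following is a minimal sketch that copies the reported figures into a dictionary and confirms that train plus eval equals the stated total for each lane; the function names are illustrative and not part of the MAGATAMA tooling.

```python
# Counts copied from the list above: (train, eval, total) per lane.
COUNTS = {
    "magatamallm": (1375, 153, 1528),
    "fo_blogllm": (17342, 1929, 19271),
    "tip_llm": (276, 31, 307),
    "pulso_llm": (28, 5, 33),
    "contact_llm": (18, 4, 22),
}


def inconsistent_lanes(counts):
    """Return the lanes whose train + eval does not equal the reported total."""
    return [
        lane
        for lane, (train, eval_rows, total) in counts.items()
        if train + eval_rows != total
    ]


def eval_share(counts):
    """Fraction of each lane's rows held out for evaluation, rounded to 3 places."""
    return {
        lane: round(eval_rows / total, 3)
        for lane, (_train, eval_rows, total) in counts.items()
    }
```

On these figures `inconsistent_lanes(COUNTS)` is empty. The eval share is about 10% for the three existing lanes and roughly 15% and 18% for pulso_llm and contact_llm, whose seed pools are still very small.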
Live Verification
- pulso_llm and contact_llm appear in the MAGATAMA training modal.
- RunPod provider is online for both new lanes.
- contact_llm status correctly reports neverTrained: true.
- pulso_llm and contact_llm are trainable but not adopted yet because no local Ollama model tags exist for them.
Gitea / Privacy Closure
- Sync handoff commit: 3926a1e
- MAGATAMA implementation and training-pool commit: 8fb406b
- MAGATAMA pre-commit correctly blocked the first attempt because raw training rows contained private-network data.
- Export path is now hardened:
  - private IPs are replaced with placeholders
  - local /Users/... paths are replaced
  - emails, tokens, secrets and passwords are redacted
- The pushed MAGATAMA commit passed:
  - secrets scan
  - private data scan
  - config values scan
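The export hardening listed above could look roughly like the following. This is a sketch under stated assumptions, not the actual MAGATAMA exporter: the placeholder strings (`<PRIVATE_IP>`, `<LOCAL_PATH>`, `<EMAIL>`), the regular expressions and the function name `redact_row` are all hypothetical. Token, secret and password detection is deliberately left out because it depends on scanner rules this note does not describe.

```python
import ipaddress
import re

# Illustrative patterns; the real export path may use different rules.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
USER_PATH_RE = re.compile(r"/Users/[^\s\"']+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def _mask_private_ip(match: re.Match) -> str:
    """Replace an IPv4 literal only when it falls in a private range."""
    text = match.group(0)
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return text  # not a valid address, leave untouched
    return "<PRIVATE_IP>" if addr.is_private else text


def redact_row(text: str) -> str:
    """Apply IP, local-path and email redaction to one training row."""
    text = IPV4_RE.sub(_mask_private_ip, text)
    text = USER_PATH_RE.sub("<LOCAL_PATH>", text)
    text = EMAIL_RE.sub("<EMAIL>", text)
    return text
```

For example, `redact_row("host 192.168.1.10 reached 8.8.8.8")` keeps the public address and masks only the private one. Note that `ipaddress` also treats loopback and link-local ranges as private, which is usually the safer default for a training export.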
Operational Rule
Training success only counts when all of the following are true:
- RunPod reports completion.
- An artifact exists and is reachable.
- Local import succeeds.
- Smoke tests pass.
- The active alias/version is switched.
- The registry and dashboard show the new version.
If any part fails, the lane must stay in a non-adopted state.
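The rule can be expressed as an all-or-nothing gate. The sketch below assumes one boolean per condition; the class `TrainingRun`, its field names and the state strings are hypothetical and only mirror the six conditions listed above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TrainingRun:
    # One flag per condition in the operational rule (names are illustrative).
    runpod_completed: bool
    artifact_reachable: bool
    local_import_ok: bool
    smoke_tests_passed: bool
    alias_switched: bool
    registry_and_dashboard_updated: bool


def failed_checks(run: TrainingRun) -> list[str]:
    """Names of the conditions that are not satisfied, in declaration order."""
    return [f.name for f in fields(run) if not getattr(run, f.name)]


def lane_state(run: TrainingRun) -> str:
    """A lane counts as adopted only when every condition holds."""
    return "adopted" if not failed_checks(run) else "non-adopted"
```

Returning the list of failed checks alongside the state makes it easier to show in a dashboard why a lane is still non-adopted, rather than only that it is.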