# TIP Lane Detangling And Disk-Safe Refresh

Date: 2026-05-06 UTC

## Summary

`TIP_LLM` was still contaminated by blog/writer behavior even though lane-specific counts were already separated in MAGATAMA. The problem was not limited to UI-level status: the lane corpus feeding the RunPod export was itself contaminated.

The lane was rebuilt and revalidated locally, then synced to Erik and refreshed there. `TIP_LLM` now uses a much smaller but correctly aligned research/network corpus instead of silently inheriting FO_Blog-like behavior.

## Root Cause

- The canonical `training-data/gitea-learning-pool/tip_llm/*.jsonl` pool still contained many blog-shaped rows from shared transceiver corpora.
- The old TIP export sampled thousands of rows whose prompts/messages still looked like:
  - `You are an expert technical writer...`
  - publication-ready/blog instructions
- A direct local check on the pre-fix TIP export showed:
  - `6250` train rows
  - `6087` matched blog/writer patterns (about 97% of train rows)
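
A check of this kind can be sketched as a line-by-line scan of the exported JSONL. This is a minimal sketch, not the script that produced the numbers above: `countBlogShapedRows` and the pattern list are hypothetical, and the patterns are only guesses based on the phrases quoted in this section.

```typescript
// Hypothetical patterns, derived from the phrases quoted above.
// The project's real list may differ.
const BLOG_WRITER_PATTERNS: RegExp[] = [
  /expert technical writer/i,
  /publication-ready/i,
  /\bblog\b/i,
];

interface ContaminationReport {
  total: number;   // non-empty JSONL rows seen
  matched: number; // rows matching at least one pattern
  ratio: number;   // matched / total, or 0 for an empty file
}

// Scans raw JSONL text so that no particular row schema has to be assumed.
function countBlogShapedRows(jsonl: string): ContaminationReport {
  const rows = jsonl.split("\n").filter((line) => line.trim().length > 0);
  const matched = rows.filter((line) =>
    BLOG_WRITER_PATTERNS.some((pattern) => pattern.test(line)),
  ).length;
  const ratio = rows.length === 0 ? 0 : matched / rows.length;
  return { total: rows.length, matched, ratio };
}
```

Applied to the pre-fix counts, `matched / total` is `6087 / 6250`, or roughly `0.974`.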

## Changes Applied

### `scripts/runpod_dataset_builder.ts`

- Added a stricter `tipDatasetAllowed(...)` gate.
- Tightened `laneRecordIsCompatible(...)` for `tip_llm`.
- Tightened `lanePoolMessagesAlign(...)` for `tip_llm`:
  - reject:
    - `blog writer`
    - `publication-ready`
    - `technical writer specializing`
    - article-outline/founder/blog prompts
    - markdown-article assistant outputs
- TIP registry fallback now only considers lane-compatible datasets.
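
The reject rules above can be illustrated with a small predicate over chat-style rows. This is a sketch under assumptions, not the actual `lanePoolMessagesAlign(...)` implementation: the `ChatMessage` shape, the function name `tipMessagesAlign`, and the markdown-article heuristic are all hypothetical, and the article-outline/founder/blog prompt rule is left out because its exact wording is not given here.

```typescript
type Role = "system" | "user" | "assistant";

// Assumed row shape: a list of chat messages. The real pool schema may differ.
interface ChatMessage {
  role: Role;
  content: string;
}

// Phrases taken from the reject list above.
const REJECTED_PHRASES: string[] = [
  "blog writer",
  "publication-ready",
  "technical writer specializing",
];

// Heuristic (an assumption): text that opens with a markdown H1 and
// contains at least two H2 sections is treated as an article.
function looksLikeMarkdownArticle(content: string): boolean {
  const lines = content.trim().split("\n");
  const startsWithTitle = /^#\s+\S/.test(lines[0] ?? "");
  const sectionCount = lines.filter((line) => /^##\s+\S/.test(line)).length;
  return startsWithTitle && sectionCount >= 2;
}

// Returns true only if no message trips a reject rule.
function tipMessagesAlign(messages: ChatMessage[]): boolean {
  for (const message of messages) {
    const text = message.content.toLowerCase();
    if (REJECTED_PHRASES.some((phrase) => text.includes(phrase))) {
      return false;
    }
    if (message.role === "assistant" && looksLikeMarkdownArticle(message.content)) {
      return false;
    }
  }
  return true;
}
```

Substring matching is deliberately blunt here: for a lane this small, dropping a borderline row is likely cheaper than letting a blog-shaped one through.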

### `scripts/sync_gitea_training_pool.ts`

- Applied the same stricter TIP lane-alignment logic.
- Stopped rewriting redundant `merged.jsonl` copies for:
  - `fo_blogllm`
  - `tip_llm`
- This was necessary because the duplicated merged artifacts caused local disk exhaustion during refresh.
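
The skip rule can be sketched as a per-lane guard that runs before any merged file is written. The names `shouldWriteMergedCopy` and `mergedOutputPath` are hypothetical, and the sketch assumes that lanes outside the list above keep their `merged.jsonl`, which this note does not state.

```typescript
// Lanes whose merged.jsonl copy is redundant, per the list above.
const LANES_WITHOUT_MERGED_COPY: ReadonlySet<string> = new Set([
  "fo_blogllm",
  "tip_llm",
]);

function shouldWriteMergedCopy(lane: string): boolean {
  return !LANES_WITHOUT_MERGED_COPY.has(lane);
}

// Returns the merged.jsonl path to write, or null when the lane skips it.
// Assumption: every other lane still writes its merged copy.
function mergedOutputPath(poolRoot: string, lane: string): string | null {
  if (!shouldWriteMergedCopy(lane)) {
    return null;
  }
  return `${poolRoot}/${lane}/merged.jsonl`;
}
```

Returning `null` instead of an empty path forces the caller to handle the skip explicitly, so a duplicate cannot be recreated by accident.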

## Disk Incident

During the first rebuild after the lane hardening, refresh failed with:

- `ENOSPC: no space left on device`

The immediate cause was writing:

- `training-data/gitea-learning-pool/tip_llm/merged.jsonl`

Fix:

- truncated redundant `merged` artifacts for `fo_blogllm` and `tip_llm`
- changed sync logic so those duplicates are no longer recreated

Result:

- free disk space recovered from roughly `377Mi` to `17Gi`

## Verified Local Result

After rebuild:

- `TIP_LLM`
  - `train = 233`
  - `eval = 26`
  - `total = 259`
  - `blog/writer matches = 0`

First rows now use the intended TIP instruction style:

- `You are TIP_LLM, a research and market-intelligence analyst for transceivers, switches, and vendor ecosystems...`

This confirms the lane is no longer silently shaped like FO_Blog.

## Synced To Erik

Synced:

- updated scripts:
  - `runpod_dataset_builder.ts`
  - `sync_gitea_training_pool.ts`
  - `submit_runpod_training.ts`
- rebuilt lane exports:
  - `training-data/runpod/magatamallm/*`
  - `training-data/runpod/fo_blogllm/*`
  - `training-data/runpod/tip_llm/*`

Then reran on Erik:

- `pnpm training:refresh-all`

## Live Erik / Public API Result

### `magatamallm`

- `datasetSource = url`
- `collectedExamples = 15679`
- `evalExamples = 1743`
- `totalExamples = 17422`
- `newSinceLastTraining = 15679`

### `fo_blogllm`

- `datasetSource = url`
- `collectedExamples = 17322`
- `evalExamples = 1926`
- `totalExamples = 19254`
- `neverTrained = true`

### `tip_llm`

- `datasetSource = url`
- `collectedExamples = 231`
- `evalExamples = 26`
- `totalExamples = 257`
- `neverTrained = true`

## Remaining Work

Lane contamination is no longer the hard blocker.

The remaining blocker is:

- RunPod artifact validation/adoption

Desired next steps:

1. only accept RunPod `COMPLETED` as success if a real artifact exists
2. verify artifact importability
3. update/adopt local Ollama tag automatically
4. switch MAGATAMA only after successful adoption
5. run pre/post smoke prompts
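
Steps 1, 2, and 4 amount to a gate that must pass before anything is switched. A minimal sketch of that gate follows, assuming a hypothetical `RunResult` shape and a caller-supplied import check; nothing here reflects an existing implementation or the RunPod API.

```typescript
// Hypothetical summary of a finished run. Field names are assumptions.
interface RunResult {
  status: string;               // e.g. "COMPLETED"
  artifactBytes: number | null; // null when no artifact was found
}

type AdoptionDecision =
  | { adopt: true }
  | { adopt: false; reason: string };

// `importable` stands in for the result of step 2 (artifact importability),
// which would be computed elsewhere.
function decideAdoption(run: RunResult, importable: boolean): AdoptionDecision {
  if (run.status !== "COMPLETED") {
    return { adopt: false, reason: `status is ${run.status}` };
  }
  if (run.artifactBytes === null || run.artifactBytes <= 0) {
    return { adopt: false, reason: "COMPLETED without a real artifact" };
  }
  if (!importable) {
    return { adopt: false, reason: "artifact failed the import check" };
  }
  return { adopt: true };
}
```

Only an `{ adopt: true }` decision would allow the Ollama tag update and the MAGATAMA switch to proceed; every other outcome leaves the current model in place and records why.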