transceiver-db/sync/CURRENT.md
2026-05-06 22:53:41 +02:00

# Current TIP Sync State
Updated: 2026-05-06 20:52 UTC
## Active Policy
- Put coordination notes and handoffs in this `sync/` folder and push to Gitea.
- Check sibling project sync folders first when context may span repos.
- Use TIPLLM only for TIP crawler/robot planning and extraction feedback.
- Write robot/crawler experience into the Gitea-backed TIPLLM training pool.
- Keep Erik safe: no heavy crawler waves or uncontrolled Playwright/discovery jobs on Erik.
- Use Proxmox/Pi workers for crawl load.
## Cross-Repo Sync
Claude Code also created a Gitea sync handoff in the LLM Gateway repo:
- Repo: `rene/llm-gateway`
- Path: `sync/`
- Commit shown by Claude: `e272105 sync: add chat handoff + context scaffolding for Codex integration (2026-04-29)`
- Gitea path: `http://192.168.178.196:3000/rene/llm-gateway/src/main/sync/`
When work touches TIP, Magatama, LLM Gateway, bridges, auth, or shared Erik infrastructure, read both:
- `transceiver-db/sync/CURRENT.md`
- `llm-gateway/sync/CURRENT.md`
## Latest Work
- TIP/Blog lane separation was materially corrected on 2026-05-06:
- root cause:
- `TIP_LLM` was still ingesting blog-/writer-shaped rows from the canonical lane pool and shared transceiver corpora.
- local inspection showed the old TIP export had `6250` train rows, of which `6087` still matched blog/writer patterns.
- dataset builder and Gitea sync were hardened:
- `scripts/runpod_dataset_builder.ts`
- added strict `tipDatasetAllowed(...)`
- `TIP_LLM` now rejects blog-shaped source rows at dataset-build time
- `TIP_LLM` now rejects blog-like `system`, `user`, and markdown-article `assistant` patterns
- registry fallback for `TIP_LLM` now only uses lane-compatible datasets
- `scripts/sync_gitea_training_pool.ts`
- canonical TIP pool refresh now uses the stricter lane-alignment rules
- redundant `merged.jsonl` copies for `fo_blogllm` and `tip_llm` are no longer rewritten, to avoid local disk exhaustion from duplicate lane artifacts
- local disk issue encountered and fixed:
- full refresh failed with `ENOSPC` while writing `training-data/gitea-learning-pool/tip_llm/merged.jsonl`
- redundant lane `merged` artifacts for `fo_blogllm` and `tip_llm` were truncated and the sync script was changed to stop recreating them
- free disk space returned from `377Mi` to `17Gi`
- locally verified after rebuild:
- `TIP_LLM` RunPod export:
- `train = 233`
- `eval = 26`
- `total = 259`
- `blog/writer matches = 0`
- first TIP rows now use the correct TIP system prompt:
- `You are TIP_LLM, a research and market-intelligence analyst for transceivers, switches, and vendor ecosystems...`
- corrected artifacts and scripts were synced to Erik and `pnpm training:refresh-all` was rerun there.
- live verified on Erik/public API:
- `magatamallm`
- `datasetSource = url`
- `collectedExamples = 15679`
- `evalExamples = 1743`
- `totalExamples = 17422`
- `newSinceLastTraining = 15679`
- `fo_blogllm`
- `datasetSource = url`
- `collectedExamples = 17322`
- `evalExamples = 1926`
- `totalExamples = 19254`
- `neverTrained = true`
- `tip_llm`
- `datasetSource = url`
- `collectedExamples = 231`
- `evalExamples = 26`
- `totalExamples = 257`
- `neverTrained = true`
- operational conclusion:
- lane-specific dataset counts on Erik now reflect the actual lane exports.
- `TIP_LLM` is no longer silently borrowing the FO_Blog behavior lane.
- the remaining hard problem is RunPod artifact adoption/validation, not lane contamination.
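
The lane filter described above can be sketched roughly as follows. This is not the code in `scripts/runpod_dataset_builder.ts`: the row shape, the two regular expressions, and the function names `tipRowAllowedSketch` and `countBlogMatches` are assumptions made for illustration only.

```typescript
// Hypothetical row shape; the real builder's types are not shown in this handoff.
interface SftRow {
  system: string;
  user: string;
  assistant: string;
  source?: string;
}

// Illustrative patterns only. The actual rules live in tipDatasetAllowed(...).
const BLOG_LIKE = /blog|writer/i;
// A markdown article: an H1 title followed later by at least one H2 section.
const MARKDOWN_ARTICLE = /^# .+\n[\s\S]*\n## /;

// Returns true when a row looks lane-compatible with TIP_LLM.
function tipRowAllowedSketch(row: SftRow): boolean {
  if (row.source !== undefined && BLOG_LIKE.test(row.source)) return false;
  if (BLOG_LIKE.test(row.system)) return false;
  if (BLOG_LIKE.test(row.user)) return false;
  if (MARKDOWN_ARTICLE.test(row.assistant)) return false;
  return true;
}

// Counts rows that would be rejected, comparable to the "blog/writer matches" figure.
function countBlogMatches(rows: SftRow[]): number {
  return rows.filter((row) => !tipRowAllowedSketch(row)).length;
}
```

Running a check like `countBlogMatches` over an export before upload is one way to confirm the `blog/writer matches = 0` result independently of the builder.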
- MAGATAMA frontend/runtime consistency was repaired again on 2026-05-06:
- dashboard and core were rebuilt locally and redeployed to Erik.
- live processes restarted successfully:
- `magatama-dashboard`
- `magatama`
- public `api/llm/status` now shows the true lane-export totals for `magatamallm`:
- `collectedExamples = 15620`
- `effectiveExamples = 15620`
- `evalExamples = 1736`
- `totalExamples = 17356`
- `newSinceLastTraining = 15620`
- root cause for the stale `1097` display:
- the RunPod start SSE path still logged the legacy deduplicated `fixes.jsonl` corpus.
- this was changed so RunPod launches no longer present the legacy `1097` count as the active training truth.
- after dataset refresh the UI now emits the lane manifest totals instead.
- RunPod completion handling was hardened:
- worker `COMPLETED` is no longer trusted blindly.
- MAGATAMA now scans RunPod worker logs for real training failures (`Traceback`, `SyntaxError`, non-zero exit, etc.) before treating the run as successful.
- if the worker logs show a hidden failure, MAGATAMA records this as `completed_with_worker_failure` instead of pretending the run succeeded.
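
A minimal sketch of the log scan described above, assuming the worker log is available as one string. `Traceback` and `SyntaxError` are the markers named in these notes; the exit-code pattern assumes a particular log wording, and `classifyCompletedRun` is a hypothetical name.

```typescript
type WorkerRunOutcome = "completed" | "completed_with_worker_failure";

// Markers named in the notes, plus an assumed wording for non-zero exits.
const WORKER_FAILURE_MARKERS: RegExp[] = [
  /Traceback \(most recent call last\)/,
  /SyntaxError/,
  /exit(?:ed)? (?:with )?(?:code|status) [1-9]\d*/i,
];

// Called only after RunPod reports COMPLETED; the worker log decides the final label.
function classifyCompletedRun(workerLog: string): WorkerRunOutcome {
  const failed = WORKER_FAILURE_MARKERS.some((marker) => marker.test(workerLog));
  return failed ? "completed_with_worker_failure" : "completed";
}
```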
- public findings state is currently empty:
- `GET /api/findings?limit=1` returned `{"findings":[],"total":0}`
- this is now rendered with an explicit empty-state row instead of a visually blank table.
- Attack Paths empty-state is now intentionally explicit rather than looking broken.
- Frontend cache and scope handling were hardened:
- cache version bumped to `2026-05-06b`
- stale legacy `magatama_api_cache:*` entries are cleared
- per-endpoint TTLs added
- invalid or empty scope selections are normalized instead of silently leaving the UI in misleading empty views
- Switchblade rack port hover was materially improved:
- port chips now carry `data-tooltip`
- custom tooltip CSS is live on Erik
- the old browser-native “question mark only” behavior should be replaced by a readable hover bubble
- Changelog self-healing was added in core:
- cached changelog data older than 6h now forces a rebuild from git history
- verified live via dashboard proxy on Erik:
- `generatedAt = 2026-05-06T15:18:42.708Z`
- latest visible entries include `2026-04-30` items again instead of appearing frozen at `30.05`
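
The 6h staleness rule for the changelog cache can be expressed as a small pure check. This is a sketch under assumptions: the function name and the injectable `now` parameter are hypothetical, and treating an unreadable timestamp as stale is a design choice that the notes do not confirm.

```typescript
const CHANGELOG_MAX_AGE_MS = 6 * 60 * 60 * 1000; // 6h, as stated in the notes

// generatedAt is the ISO timestamp stored with the cached changelog.
function changelogNeedsRebuild(generatedAt: string, now: Date): boolean {
  const generatedMs = Date.parse(generatedAt);
  // Assumption: an unreadable cache timestamp triggers a rebuild instead of being trusted.
  if (isNaN(generatedMs)) return true;
  return now.getTime() - generatedMs > CHANGELOG_MAX_AGE_MS;
}
```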
- MAGATAMA lane-specific training pools and RunPod dataset automation were finished on 2026-05-06:
- root cause:
- the training modal always fetched `/api/llm/status` without a lane, so `FO_BlogLLM` and `TIP_LLM` still showed the `magatamallm` pool.
- dashboard/server were updated so `/api/llm/status?lane=...` is now truly lane-aware.
- the training modal now refreshes per selected lane and rewrites:
- title
- runtime label
- pool path
- counts
- dataset source
- MAGATAMA dashboard env on Erik was switched to URL dataset mode for all lanes via `ecosystem.config.cjs`:
- `RUNPOD_DATASET_SOURCE=url`
- `RUNPOD_DATASET_SOURCE_MAGATAMALLM=url`
- `RUNPOD_DATASET_SOURCE_FO_BLOGLLM=url`
- `RUNPOD_DATASET_SOURCE_TIP_LLM=url`
- live verified on Erik after restart:
- `fo_blogllm`
- `datasetSource = url`
- `collectionsPath = /opt/magatama/training-data/runpod/fo_blogllm/manifest.json`
- `train = 28`
- `eval = 4`
- `total = 32`
- `tip_llm`
- `datasetSource = url`
- `collectionsPath = /opt/magatama/training-data/runpod/tip_llm/manifest.json`
- `train = 36`
- `eval = 4`
- `total = 40`
- `magatamallm`
- remains on lane-export counts (`15620 / 1736 / 17356`)
- operator impact:
- no Hugging Face dataset publish is required anymore for MAGATAMA RunPod launches.
- every supported LLM lane now points to its own local/Gitea-backed lane export instead of reusing `magatamallm`.
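
How a lane could resolve its dataset source from the environment variables listed above, as a hedged sketch. The variable names and the manifest path layout come from these notes; the precedence (lane-specific before global), the `"huggingface"` label, and both function names are assumptions.

```typescript
type DatasetSource = "url" | "huggingface";

// env is passed in explicitly so the function stays pure; in Node this would be process.env.
function resolveDatasetSource(
  lane: string,
  env: { [key: string]: string | undefined },
): DatasetSource {
  const laneKey = "RUNPOD_DATASET_SOURCE_" + lane.toUpperCase();
  // Assumption: the lane-specific variable wins over the global one.
  const raw = env[laneKey] || env["RUNPOD_DATASET_SOURCE"] || "";
  // Assumption: anything other than "url" means the Hugging Face path.
  return raw.toLowerCase() === "url" ? "url" : "huggingface";
}

// Mirrors the collectionsPath values verified on Erik.
function manifestPathForLane(lane: string): string {
  return "/opt/magatama/training-data/runpod/" + lane + "/manifest.json";
}
```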
- MAGATAMA training + Attack Paths + Atlas exposure were corrected again on 2026-05-06:
- the RunPod serverless training start failure was not a RunPod outage.
- root cause was missing training scripts on Erik (`training_full_refresh.ts` and related helpers were absent under `/opt/magatama/scripts`).
- Codex synced the full local `magatama/scripts/` tree to Erik, added a safe fallback in `scripts/model_registry_build.ts`, and synced the local `training-data/model-registry/` directory.
- verified on Erik:
- `pnpm training:refresh-all` now succeeds.
- fresh dataset totals after dedupe:
- `magatamallm`: `92,742` raw → `17,356` effective (`15,620 train / 1,736 eval`)
- `fo_blogllm`: `32` total (`28 train / 4 eval`)
- `tip_llm`: `40` total (`36 train / 4 eval`)
- important nuance:
- Codex did **not** execute the final Hugging Face publish step from Erik in this chat.
- local/script/build failures are fixed; external dataset publish still depends on the selected dataset source and explicit publish intent.
- MAGATAMA Attack Paths UX is no longer a misleading blank panel:
- the page now distinguishes between:
- no live attack paths
- historical fallback paths
- empty selected scope (`0 assets in scope`)
- when a user narrows the scope to a rack/location with zero scoped assets, the graph explicitly says so instead of looking broken.
- live dashboard HTML on Erik now contains:
- `Im aktuellen Scope liegen 0 Assets.`
- `Erweitere Standort oder Datacenter / Rack, damit MAGATAMA korrelierbare Assets und Pfade darstellen kann.`
- `Ohne offene mehrstufige Korrelationen bleibt die Graph-Sicht bewusst leer.`
- MAGATAMA code/training hardening was extended:
- `scripts/test_runpod_adapter.py` no longer loads tokenizer/model with `trust_remote_code=True`.
- `scripts/ollama_adapter_bridge.py` no longer loads tokenizer/model with `trust_remote_code=True`.
- this removed the live CODE finding around `HuggingFace trust_remote_code` on Erik.
- Atlas exposure logic was tightened to stop reopening noisy LAN management findings:
- generic `atlas-exposure` findings now only stay operationally open for exposure that is meaningful enough to track as a finding.
- internal RFC1918 management/service ports discovered by the broad atlas scan are no longer promoted into open Guard findings just because they exist on the LAN.
- host-specific posture for Proxmox / Erik / Mac Studio remains the job of explicit host-audit logic.
- after rebuild + deploy + health sync:
- live Postgres open findings returned to `0`.
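
The Atlas exposure tightening above hinges on recognizing RFC 1918 addresses. The sketch below shows one way to do that check. The management-port list and the function `staysOpenAsGuardFinding` are hypothetical; the real rule set in MAGATAMA core is not reproduced here.

```typescript
// Parses a dotted-quad IPv4 address; returns null for anything else.
function parseIpv4(address: string): number[] | null {
  const parts = address.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const value = Number(part);
    if (value > 255) return null;
    octets.push(value);
  }
  return octets;
}

// RFC 1918 private ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16.
function isRfc1918(address: string): boolean {
  const octets = parseIpv4(address);
  if (octets === null) return false;
  const first = octets[0];
  const second = octets[1];
  return (
    first === 10 ||
    (first === 172 && second >= 16 && second <= 31) ||
    (first === 192 && second === 168)
  );
}

// Hypothetical management-port list, for illustration only.
const ASSUMED_MANAGEMENT_PORTS = [22, 111, 8006];

// Internal management ports found by the broad scan are not promoted to open findings.
function staysOpenAsGuardFinding(address: string, port: number): boolean {
  const internalManagement =
    isRfc1918(address) && ASSUMED_MANAGEMENT_PORTS.indexOf(port) !== -1;
  return !internalManagement;
}
```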
- Follow-up hardening on the same block:
- the earlier RunPod error path in MAGATAMA dashboard was made more truthful.
- dataset preparation now distinguishes:
- local `training:refresh-all` failure
- optional Hugging Face publish failure
- URL-based dataset mode with no external publish required
- the training SSE flow now explicitly tells the operator whether RunPod is using:
- Hugging Face dataset source
- or MAGATAMA URL-bundle dataset source
- this avoids misleading `RunPod not reachable` wording when the actual failure is in dataset preparation.
- follow-up serverless verification on 2026-05-06 narrowed the remaining fault further:
- MAGATAMA submit logic now verifies that a RunPod job really exists under `/status/{jobId}` instead of trusting `/run`.
- payloads were aligned more closely with the official Axolotl serverless schema:
- `model_type=AutoModelForCausalLM`
- `tokenizer_type=AutoTokenizer`
- dataset `split: train`
- optimizer `adamw_torch_fused`
- verified full run attempt:
- job id `9bc4b16b-755b-465b-aadf-b46f2fe467a3-e2`
- disappeared as `not_found_after_submit` (`404 job not found`)
- verified canary after payload fix:
- job id `a4ac6951-7ed7-43cb-80d8-5ab61533c2da-e2`
- immediately materialized as `IN_QUEUE`
- then still disappeared on later reconcile as `not_found_after_submit`
- current conclusion:
- the old MAGATAMA bug is fixed.
- the remaining problem is now likely on the RunPod endpoint/release side: jobs are accepted and briefly queued, but do not survive long enough to produce a durable serverless status lifecycle.
- operational rule:
- do not treat `submitted` or a brief `IN_QUEUE` as proof of a usable serverless training run.
- only trust the run once it reaches `IN_PROGRESS` or a durable terminal state with artifact evidence.
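
The operational rule above can be written down as a small decision function. This is a sketch, not MAGATAMA's reconcile code: `NOT_FOUND` stands in for a `404` from `/status/{jobId}`, and the type and function names are hypothetical.

```typescript
// IN_QUEUE, IN_PROGRESS, COMPLETED, FAILED are RunPod job states; NOT_FOUND models a 404.
type ObservedJobState = "IN_QUEUE" | "IN_PROGRESS" | "COMPLETED" | "FAILED" | "NOT_FOUND";

type RunTrust = "trusted" | "pending" | "not_found_after_submit";

// observed is the sequence of /status/{jobId} results seen so far, oldest first.
function assessRunTrust(observed: ObservedJobState[], hasArtifactEvidence: boolean): RunTrust {
  if (observed.length === 0) return "pending"; // only /run answered; nothing proven yet
  const latest = observed[observed.length - 1];
  if (latest === "NOT_FOUND") return "not_found_after_submit";
  if (latest === "IN_PROGRESS") return "trusted";
  // A terminal state counts only when artifact evidence backs it.
  if ((latest === "COMPLETED" || latest === "FAILED") && hasArtifactEvidence) return "trusted";
  return "pending";
}
```

Applied to the canary above, the sequence `IN_QUEUE` followed by a `404` yields `not_found_after_submit`, and a lone `IN_QUEUE` stays `pending`.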
- follow-up training count fix on 2026-05-06 corrected the Training UI source-of-truth:
- MAGATAMA had still shown `1097` because the dashboard was counting the legacy deduplicated fix corpus instead of the current lane-specific RunPod export.
- dashboard now prefers `training-data/runpod/magatamallm/manifest.json` for the visible MagatamaLLM training count.
- synced current lane export to Erik and restarted `magatama-dashboard`.
- verified public API now returns:
- `collectedExamples = 1367`
- `effectiveExamples = 1367`
- `evalExamples = 152`
- `totalExamples = 1519`
- `newSinceLastTraining = 1367`
- if the browser still shows `1097`, treat it as stale cached UI and hard-reload the page.
- MAGATAMA was repaired end-to-end to a clean operational baseline:
- live guard host-audits for Erik, Mac Studio, and Proxmox were corrected and rerun.
- open findings were reduced all the way to `0` in Postgres.
- false-positive Proxmox baseline findings were removed by teaching the audit to treat internal-only management ports and default-only rpcbind exposure as acceptable for this host.
- code scanner false positives from generated/report artifacts remain excluded.
- Live MAGATAMA protection/runtime state after the 2026-05-06 remediation:
- `open findings: 0`
- `queueExecuting: 0`
- `queueBlocked: 0`
- `queueFailed: 0`
- public `/api/health` returns `status: ok`
- public `/api/active-resolvers` returns:
- `MAGATAMA Core: working`
- `MagatamaLLM: working`
- `Claude (secondary): working`
- `Codex (secondary/manual): idle`
- `Copilot (secondary/manual): idle`
- Important resolver truth fix on 2026-05-06:
- live `codex_enabled=false` in MAGATAMA settings was causing Codex to show as a broken resolver.
- dashboard logic was updated so disabled Codex/Copilot now show truthfully as `idle` with `In MAGATAMA settings disabled`, instead of pretending there is a runtime outage.
- the local codex bridge on Erik is reachable but currently reports `auth_required`; do not treat that as a production outage while Codex is intentionally disabled in settings.
- Remaining real operational gap after findings hit zero:
- MAGATAMA still knows about more assets than it actively collects telemetry from.
- last public protection proof showed:
- `knownAssets: 79`
- `hostsWithTelemetry: 27`
- `assetsWithoutTelemetry: 52`
- these are currently inventory/discovery-only assets, not open findings, but they remain the next real coverage expansion area.
- MAGATAMA cross-repo state from the same chat is now synced into this handoff:
- Compliance framework cards in MAGATAMA are clickable and open per-framework requirement details.
- MAGATAMA training status was corrected so `New Since Last Training` no longer falsely shows `0`.
- Live verified/deduped MAGATAMA training state after the fix:
- `collectedExamples: 49`
- `rawExamples: 58`
- `duplicateExamples: 9`
- `effectiveExamples: 49`
- `newSinceLastTraining: 49`
- MAGATAMA now filters training metrics to verified/trainable examples only.
- Failed/escalated MAGATAMA remediation records should go to `errors.jsonl`, not the main `fixes.jsonl`, so the next MagatamaLLM run does not train on junk.
- Gitea-backed training pool remains the default target for training writes.
- MAGATAMA coverage-gap and training-integrity hardening on 2026-05-06:
- the earlier `49` medium `atlas-coverage-gap` findings were traced to Atlas treating inventory-only and discovery-only assets as operational protection failures.
- core logic was tightened so Atlas coverage findings now open only for managed operational assets:
- exposure-backed assets
- explicit non-auto owner
- configured telemetry expectation
- critical/high criticality
- infrastructure metadata or managed infra device types
- loopback and passive reference/inventory assets no longer reopen noisy guard findings.
- local build succeeded, the new core dist was deployed to Erik, and the first post-deploy guard scan resolved stale findings.
- live Postgres state after deploy: `open findings = 0`.
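
The coverage-finding gate listed above can be sketched as a predicate. The asset shape and field names are hypothetical, and the notes do not say whether the criteria are combined with AND or OR; this sketch assumes any single criterion is enough.

```typescript
// Hypothetical asset shape; field names are illustrative, not MAGATAMA's schema.
interface AtlasAssetSketch {
  hasExposure: boolean;
  owner: string; // "auto" means no explicit owner was assigned
  telemetryExpected: boolean;
  criticality: "low" | "medium" | "high" | "critical";
  isManagedInfra: boolean; // infrastructure metadata or a managed infra device type
  isLoopback: boolean;
  isPassiveReference: boolean; // inventory-only or discovery-only
}

// Assumption: any one "managed operational" criterion is sufficient.
function opensCoverageFinding(asset: AtlasAssetSketch): boolean {
  if (asset.isLoopback || asset.isPassiveReference) return false;
  return (
    asset.hasExposure ||
    asset.owner !== "auto" ||
    asset.telemetryExpected ||
    asset.criticality === "critical" ||
    asset.criticality === "high" ||
    asset.isManagedInfra
  );
}
```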
- training integrity bug was fixed in `packages/core/src/learning/fix-tracking.ts`:
- verified fixes now append to `training-data/gitea-learning-pool/magatamallm/fixes.jsonl`
- failed/escalated/report-only runs now belong in `errors.jsonl`
- two explicit Codex-written training entries were appended to the MAGATAMA Gitea-backed fixes corpus:
- atlas coverage scope hardening
- training path integrity fix
- corpus cleanup + dedupe was executed afterward:
- pre-dedupe backup kept locally as:
- `magatama/training-data/gitea-learning-pool/magatamallm/fixes-pre-dedupe-20260506.jsonl`
- resulting verified corpus:
- `fixes.jsonl = 1,368` unique verified training rows
- resulting failure corpus:
- `errors.jsonl = 4` tracked failed/escalated rows
- integrity report now exists at:
- `magatama/training-data/gitea-learning-pool/magatamallm/corpus-integrity-report.json`
- latest integrity totals:
- `scanned: 1368`
- `verified: 1368`
- `movedToErrors: 4`
- `parseErrors: 0`
- `invalidVerifiedFlag: 0`
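
The routing and dedupe rules above can be illustrated with a short sketch. The record shape, the outcome labels, and the dedupe key (prompt plus completion) are assumptions; the actual logic lives in `packages/core/src/learning/fix-tracking.ts` and the cleanup script.

```typescript
// Hypothetical record shape for illustration.
interface RemediationRecordSketch {
  outcome: "verified" | "failed" | "escalated" | "report-only";
  prompt: string;
  completion: string;
}

// Only verified fixes belong in the main training corpus.
function targetFileFor(record: RemediationRecordSketch): "fixes.jsonl" | "errors.jsonl" {
  return record.outcome === "verified" ? "fixes.jsonl" : "errors.jsonl";
}

// Drops rows whose prompt/completion pair was already seen, keeping the first occurrence.
function dedupeRecords(records: RemediationRecordSketch[]): RemediationRecordSketch[] {
  const seen: { [key: string]: true } = {};
  const unique: RemediationRecordSketch[] = [];
  for (const record of records) {
    const key = JSON.stringify([record.prompt, record.completion]);
    if (seen[key]) continue;
    seen[key] = true;
    unique.push(record);
  }
  return unique;
}
```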
- Complete Codex chat sync was added:
- `sync/history/2026-04-29-codex-complete-chat-sync.md`
- captures Ghost/blog updates, LinkedIn voice preferences, LPO/AI-fabric blog edits, Rest-Is-Not-Laziness scheduling replacement, and security notes.
- confirms no secrets were written into sync.
- confirms TIP crawler/robot planning remains TIPLLM-only.
- confirms Erik remains controller/light `erik-safe` only, with heavy crawler work assigned to Proxmox/Pi workers.
- Codex sync-start confirmation was added:
- `sync/history/2026-04-29-codex-sync-start-confirmation.md`
- confirms Codex read this TIP handoff, checked the sibling LLM Gateway handoff, and is treating `sync/` as binding.
- no code changes, crawler jobs, queue waves, PM2 restarts, or Erik load were initiated during this confirmation.
- Codex follow-up on 2026-04-29 clarified the active BlogLLM model:
- TIP shows `fo-blog-v7`, but this is not a normal Ollama GGUF manifest.
- It is a local Adapter Bridge / Mac Studio model backed by the RunPod-trained PEFT adapter:
- `/Users/renefichtmueller/Desktop/Claude Code/magatama/training-data/runpod/pod-runs/2026-04-25-fo-tip/final/adapters/fo_blogllm/final-adapter`
- Bridge definition:
- `/Users/renefichtmueller/Desktop/Claude Code/magatama/scripts/ollama_adapter_bridge.py`
- TIP API default:
- `packages/api/src/llm/client.ts` uses `OLLAMA_LLM_MODEL || "fo-blog-v7"`.
- `fo-blog-v8` remains the next training candidate, not the currently active TIP BlogLLM model.
- Full Codex session handoff was added:
- `sync/history/2026-04-29-codex-full-session-handoff.md`
- covers TIP verification, product image/detail crawling, Blog Engine Hot Topics, TIPLLM robots, training pool, Erik status, and cross-repo sync.
- Added a verification robot controller:
- `packages/scraper/src/robots/verification-robots.ts`
- command: `npm run robots:verification -w packages/scraper -- --status`
- Added TIPLLM robot experience writing:
- `packages/scraper/src/crawler-llm/training-data-writer.ts`
- writes raw robot audit rows and SFT records.
- Added Gitea training pool import to TIP learning-pool build:
- `scripts/tip-learning-pool-build.ts`
- imports `TIP_TRAINING_REPO/qa-pairs/*.jsonl` into the `tip_llm` lane.
- Added docs:
- `docs/TIP_SELFLEARNING_WORKFLOW.md`
- Added package script:
- `packages/scraper/package.json`
- `robots:verification`
## Gitea Training Pool
- Existing local clone: `/tmp/tip-training-data`
- Gitea repo: `rene/tip-training-data`
- Latest pushed training commit:
- `f1c83f8 crawl: add robot-status training records [2026-04-29T20:11:24.091Z]`
- First robot experience record was written to:
- `/tmp/tip-training-data/qa-pairs/robot-control-high.jsonl`
- `/tmp/tip-training-data/robot-experiences/2026-04-29.jsonl`
## MAGATAMA Training / Operations State
- Relevant local repo:
- `/Users/renefichtmueller/Desktop/Claude Code/magatama`
- Latest confirmed live MAGATAMA findings state:
- `open findings: 0` on `2026-05-06`
- Latest confirmed live resolver state:
- `Codex` and `Copilot` intentionally `idle/disabled`
- not a runtime outage, but a settings choice until gateway/bridge auth is intentionally re-enabled
- Latest confirmed live MAGATAMA training metric after dashboard fix:
- `newSinceLastTraining: 49`
- Meaning:
- the old `0` was incorrect.
- the currently visible trainable MAGATAMA corpus is based on verified and deduplicated examples only.
- Latest corpus integrity state after cleanup:
- operational Gitea-backed MAGATAMA training corpus is now much smaller but cleaner:
- `1368` unique verified rows
- `4` live failure/escalation rows in `errors.jsonl`
- do not confuse raw historical volume with real trainable signal.
- Important training integrity rule:
- report-only or failed/escalated records must not be treated as verified training fixes.
- keep them separated from the main verified training corpus.
## Erik Status
- Synced TIPLLM robot/training code to `/opt/tip`.
- Did not start crawler jobs.
- Did not enqueue robot waves.
- Did not restart PM2 services.
- Remote scraper TypeScript build is passing after removing two stale, misplaced duplicate files that existed only on the remote:
- `/opt/tip/packages/scraper/src/scrapers/scheduler.ts`
- `/opt/tip/packages/scraper/src/vendor-discovery-crawler.ts`
- `tip-api` and `tip-scraper-daemon` are online.
- Shared Erik note from the same chat:
- MAGATAMA dashboard/core were redeployed during compliance/training fixes.
- TIP crawler policy remains unchanged: Erik is controller/light runner only, not heavy crawl execution host.
## Last Live Verification Snapshot
From 2026-04-29:
- Total transceivers: `13,546`
- Price verified: `7,250`
- Image verified: `7,025`
- Details verified: `6,243`
- Fully verified: `5,812`
- Last price observation: `2026-04-29 19:15:53 UTC`
- Last stock observation: `2026-04-29 19:15:56 UTC`
## Latest MAGATAMA Training / RunPod Truth
Confirmed on `2026-05-06`:
- Lane-specific training pools are now materially separated and no longer all fall back to `magatamallm`.
- Live Erik dashboard API now reports:
- `magatamallm`
- `1367 train`
- `152 eval`
- `1519 total`
- `newSinceLastTraining = 1367`
- `fo_blogllm`
- `17353 train`
- `1929 eval`
- `19282 total`
- `newSinceLastTraining = 17353`
- active local model resolves to `fo-blog-v7`
- `tip_llm`
- `6482 train`
- `721 eval`
- `7203 total`
- `newSinceLastTraining = 6482`
- target active model is `tip-llm-v1`, but this model is not yet present locally in Ollama
- Result:
- the `1097` previously shown for every lane was stale and wrong.
- selected lane now controls its own manifest, model label, and training counts.
### Gitea-backed Pool Materialization
- `magatamallm` Gitea pool remains canonical and populated.
- `fo_blogllm` and `tip_llm` Gitea-backed pool folders were previously almost empty; they are now materialized from the local RunPod lane exports.
- Lane manifests and JSONL exports now exist under:
- `training-data/gitea-learning-pool/fo_blogllm/`
- `training-data/gitea-learning-pool/tip_llm/`
### RunPod Completion Hardening
- MAGATAMA dashboard code now treats RunPod `COMPLETED` as success only after:
1. target model artifact is referenced
2. local Mac training API adopts/imports the artifact
3. lane-specific smoke tests pass
4. active Ollama alias is updated
- New local adoption endpoint is:
- `POST /adopt-runpod-model`
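
The four gates above can be read as an ordered check. The sketch below is one possible shape, not the dashboard code: only `completed_without_model_artifact` is a label taken from these notes, and the other verdict names, the evidence fields, and the function name are hypothetical.

```typescript
interface AdoptionEvidence {
  artifactReferenced: boolean; // 1. target model artifact is referenced
  artifactAdopted: boolean;    // 2. local Mac training API adopted/imported it
  smokeTestsPassed: boolean;   // 3. lane-specific smoke tests pass
  aliasUpdated: boolean;       // 4. active Ollama alias is updated
}

// "completed_without_model_artifact" appears in the notes; the other labels are made up here.
type CompletionVerdict =
  | "success"
  | "completed_without_model_artifact"
  | "adoption_failed"
  | "smoke_tests_failed"
  | "alias_not_updated";

// Evaluates the gates in order and reports the first one that is not satisfied.
function judgeCompletedRun(evidence: AdoptionEvidence): CompletionVerdict {
  if (!evidence.artifactReferenced) return "completed_without_model_artifact";
  if (!evidence.artifactAdopted) return "adoption_failed";
  if (!evidence.smokeTestsPassed) return "smoke_tests_failed";
  if (!evidence.aliasUpdated) return "alias_not_updated";
  return "success";
}
```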
### Mac Training API State
- The old LaunchAgent on Mac Studio was still serving the legacy training API from:
- `~/magatama-llm/service/training_api.py`
- It has now been upgraded in place so Erik sees the new adoption-capable API.
- Verified from Erik:
- `http://192.168.178.213:3214/health` returns the new service
- it now exposes `register_script` pointing into the MAGATAMA repo
- `POST /adopt-runpod-model` exists and rejects unauthenticated requests with `401`, proving the route is live
### Still Outstanding
- A fully successful end-to-end RunPod fine-tune has not yet been re-verified after the new adoption pipeline was wired in. That proof requires:
- real worker success
- real artifact
- successful local Ollama import
- active alias switch
- smoke-test proof
- Latest live proof run on `2026-05-06`:
- job id: `2112a7ab-68c2-4411-a44f-6edb7ad377df-e1`
- materialized correctly
- reached `IN_PROGRESS`
- then `COMPLETED`
- but RunPod `status/{job}` returned no `output` object, no model artifact reference, and no Hugging Face repo result
- current MAGATAMA handling now correctly classifies this as `completed_without_model_artifact`, not as success
- `tip-llm-v1` is still not installed locally in Ollama.
### Pulso AI Recommendation
- Keep a shared network/transceiver/switch core corpus with TIP.
- Do not collapse `Pulso AI` into the same instruction lane as `TIP_LLM`.
- Recommended split:
- `TIP_LLM`
- research
- crawler / scraper / robot planning
- vendor / firmware / issue extraction
- `Pulso AI`
- product responses
- support
- diagnostics
- operator explanation layer
## Safe Next Steps
1. Clone or pull Gitea `origin` on laptop/Claude Code.
2. Read this folder first.
3. For BlogLLM work, treat `fo-blog-v7` as Adapter Bridge / PEFT adapter, not as a `~/.ollama` GGUF model.
4. Also read `llm-gateway/sync/CURRENT.md` when work touches shared Erik infrastructure, LLM routing, bridges, auth, TIPLLM, or crawler orchestration.
5. For TIP robot/crawler planning, use TIPLLM only. Do not route this lane through external AI providers.
6. When training pools or model stats look suspicious, prefer verified-only counts and check whether failed/escalated rows polluted the corpus.
7. For MAGATAMA-adjacent work, keep writing learnings back into the Gitea-backed pool and avoid training on report-only pseudo-fixes.
8. If testing robots, start with dry runs only:
```bash
npm run robots:verification -w packages/scraper -- --status
npm run robots:verification -w packages/scraper -- --tipllm-plan --limit=3
npm run robots:verification -w packages/scraper -- --enqueue=details-fast-lane --profile=erik-safe --dry-run
```
9. Only dispatch real crawl work after deciding the target host:
- Erik: `erik-safe`, tiny batches only.
- Pi: `pi-fetch`.
- Proxmox: `proxmox-heavy`.
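
The host-to-profile decision in step 9 could be captured in a small helper so the profile is never typed by hand. This is a sketch under assumptions: it reuses the `--enqueue`, `--profile`, and `--dry-run` flags from step 8, assumes `pi-fetch` and `proxmox-heavy` are passed through the same `--profile` flag, and the helper itself is hypothetical.

```typescript
type CrawlHost = "erik" | "pi" | "proxmox";

// Profiles as listed in step 9.
const PROFILE_BY_HOST: { [H in CrawlHost]: string } = {
  erik: "erik-safe",
  pi: "pi-fetch",
  proxmox: "proxmox-heavy",
};

// Builds the argument list that follows `npm run robots:verification -w packages/scraper --`.
function buildEnqueueArgs(host: CrawlHost, queueName: string, dryRun: boolean): string[] {
  const args = ["--enqueue=" + queueName, "--profile=" + PROFILE_BY_HOST[host]];
  if (dryRun) args.push("--dry-run");
  return args;
}
```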
## Dirty Worktree Note
There are existing uncommitted changes outside `sync/`. Some are Codex work from this session, some appear pre-existing or from earlier Claude/Codex work. Do not blindly revert them. Review `git status --short` before committing broader changes.
## Latest Sync Commits
- `6c42ca7 docs: add shared agent sync handoff`
- `8e7c5aa docs: link llm-gateway sync handoff`
- Pending after this update:
- watch whether any future guard exposure findings are genuine operational issues or new false positives.
- if failures still appear inside `fixes.jsonl`, scrub historic pollution and backfill `errors.jsonl`.