transceiver-db/sync/CURRENT.md

# Current TIP Sync State

Updated: 2026-04-29 21:15 UTC

## Active Policy

- Put coordination notes and handoffs in this `sync/` folder and push to Gitea.
- Check sibling project sync folders first when context may span repos.
- Use TIPLLM only for TIP crawler/robot planning and extraction feedback.
- Write robot/crawler experience into the Gitea-backed TIPLLM training pool.
- Keep Erik safe: no heavy crawler waves or uncontrolled Playwright/discovery jobs on Erik.
- Use Proxmox/Pi workers for crawl load.

## Cross-Repo Sync

Claude Code also created a Gitea sync handoff in the LLM Gateway repo:

- Repo: `rene/llm-gateway`
- Path: `sync/`
- Commit shown by Claude: `e272105 sync: add chat handoff + context scaffolding for Codex integration (2026-04-29)`
- Gitea path: `http://192.168.178.196:3000/rene/llm-gateway/src/main/sync/`

When work touches TIP, Magatama, LLM Gateway, bridges, auth, or shared Erik infrastructure, read both:

- `transceiver-db/sync/CURRENT.md`
- `llm-gateway/sync/CURRENT.md`

## Latest Work

- MAGATAMA cross-repo state from the same chat is now synced into this handoff:
  - Compliance framework cards in MAGATAMA are clickable and open per-framework requirement details.
  - MAGATAMA training status was corrected so `New Since Last Training` no longer falsely shows `0`.
  - Live verified/deduped MAGATAMA training state after the fix:
    - `collectedExamples: 49`
    - `rawExamples: 58`
    - `duplicateExamples: 9`
    - `effectiveExamples: 49`
    - `newSinceLastTraining: 49`
  - MAGATAMA now filters training metrics to verified/trainable examples only.
  - Failed/escalated MAGATAMA remediation records should go to `errors.jsonl`, not the main `fixes.jsonl`, so the next MagatamaLLM run does not train on junk.
  - Gitea-backed training pool remains the default target for training writes.
- Complete Codex chat sync was added:
  - `sync/history/2026-04-29-codex-complete-chat-sync.md`
  - captures Ghost/blog updates, LinkedIn voice preferences, LPO/AI-fabric blog edits, Rest-Is-Not-Laziness scheduling replacement, and security notes.
  - confirms no secrets were written into sync.
  - confirms TIP crawler/robot planning remains TIPLLM-only.
  - confirms Erik remains a controller/light runner only (`erik-safe` profile), with heavy crawler work assigned to Proxmox/Pi workers.
- Codex sync-start confirmation was added:
  - `sync/history/2026-04-29-codex-sync-start-confirmation.md`
  - confirms Codex read this TIP handoff, checked the sibling LLM Gateway handoff, and is treating `sync/` as binding.
  - no code changes, crawler jobs, queue waves, PM2 restarts, or Erik load were initiated during this confirmation.
- Codex follow-up on 2026-04-29 clarified the active BlogLLM model:
  - TIP shows `fo-blog-v7`, but this is not a normal Ollama GGUF manifest.
  - It is a local Adapter Bridge / Mac Studio model backed by the RunPod-trained PEFT adapter:
    `/Users/renefichtmueller/Desktop/Claude Code/magatama/training-data/runpod/pod-runs/2026-04-25-fo-tip/final/adapters/fo_blogllm/final-adapter`
  - Bridge definition:
    `/Users/renefichtmueller/Desktop/Claude Code/magatama/scripts/ollama_adapter_bridge.py`
  - TIP API default:
    `packages/api/src/llm/client.ts` uses `OLLAMA_LLM_MODEL || "fo-blog-v7"`.
  - `fo-blog-v8` remains the next training candidate, not the currently active TIP BlogLLM model.
- Full Codex session handoff was added:
  - `sync/history/2026-04-29-codex-full-session-handoff.md`
  - covers TIP verification, product image/detail crawling, Blog Engine Hot Topics, TIPLLM robots, training pool, Erik status, and cross-repo sync.
- Added a verification robot controller:
  - `packages/scraper/src/robots/verification-robots.ts`
  - command: `npm run robots:verification -w packages/scraper -- --status`
- Added TIPLLM robot experience writing:
  - `packages/scraper/src/crawler-llm/training-data-writer.ts`
  - writes raw robot audit rows and SFT records.
- Added Gitea training pool import to TIP learning-pool build:
  - `scripts/tip-learning-pool-build.ts`
  - imports `TIP_TRAINING_REPO/qa-pairs/*.jsonl` into the `tip_llm` lane.
- Added docs:
  - `docs/TIP_SELFLEARNING_WORKFLOW.md`
- Added package script:
  - `packages/scraper/package.json`
  - `robots:verification`
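
The BlogLLM default noted above (`OLLAMA_LLM_MODEL || "fo-blog-v7"`) can be sketched as a small pure function. This is a minimal illustration, not the actual contents of `packages/api/src/llm/client.ts`: the function name `resolveBlogModel` and the `Env` type are hypothetical, and it assumes the variable is read from the process environment.

```typescript
// Hypothetical sketch; the real logic lives in packages/api/src/llm/client.ts.
const DEFAULT_BLOG_MODEL = "fo-blog-v7";

// Assumption: the environment is passed in as a plain string map.
type Env = Record<string, string | undefined>;

function resolveBlogModel(env: Env): string {
  // `||` (not `??`) means an unset OR empty variable falls back to the default.
  return env["OLLAMA_LLM_MODEL"] || DEFAULT_BLOG_MODEL;
}
```

Under this sketch, `fo-blog-v8` would only become active if `OLLAMA_LLM_MODEL` were set to it explicitly, which matches its status as a training candidate rather than the active model.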

## Gitea Training Pool

- Existing local clone: `/tmp/tip-training-data`
- Gitea repo: `rene/tip-training-data`
- Latest pushed training commit:
  - `f1c83f8 crawl: add robot-status training records [2026-04-29T20:11:24.091Z]`
- First robot experience record was written to:
  - `/tmp/tip-training-data/qa-pairs/robot-control-high.jsonl`
  - `/tmp/tip-training-data/robot-experiences/2026-04-29.jsonl`
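
The training pool files listed above are JSONL, one JSON object per line. A minimal sketch of how such a file could be read into the `tip_llm` lane follows. The `LaneRecord` shape and the `parseJsonl` helper are hypothetical, since the real `qa-pairs` schema and the importer in `scripts/tip-learning-pool-build.ts` are not described in this handoff.

```typescript
// Hypothetical record wrapper; the actual qa-pairs schema is not documented here.
interface LaneRecord {
  lane: string;
  payload: unknown;
}

function parseJsonl(
  text: string,
  lane: string
): { records: LaneRecord[]; skipped: number } {
  const records: LaneRecord[] = [];
  let skipped = 0;
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // blank lines are ignored, not counted as errors
    try {
      records.push({ lane, payload: JSON.parse(trimmed) });
    } catch {
      skipped += 1; // malformed row: count it, do not import it
    }
  }
  return { records, skipped };
}
```

Counting skipped rows instead of throwing keeps one bad line from blocking an import while still making the damage visible.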

## MAGATAMA Training / Operations State

- Relevant local repo:
  - `/Users/renefichtmueller/Desktop/Claude Code/magatama`
- Latest confirmed live MAGATAMA training metric after dashboard fix:
  - `newSinceLastTraining: 49`
- Meaning:
  - the old `0` was incorrect.
  - the currently visible trainable MAGATAMA corpus is based on verified and deduplicated examples only.
- Important training integrity rule:
  - report-only or failed/escalated records must not be treated as verified training fixes.
  - keep them separated from the main verified training corpus.
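
The integrity rule and the metric names above can be sketched as follows. This is an illustration under stated assumptions, not MAGATAMA's implementation: the `RemediationRecord` shape, the `key` dedup field, and both function names are hypothetical, and it assumes `rawExamples` counts verified rows before deduplication.

```typescript
type Status = "verified" | "failed" | "escalated" | "report-only";

interface RemediationRecord {
  key: string; // hypothetical dedup key, e.g. a hash of the normalized fix
  status: Status;
}

// Split records so only verified fixes can reach the training corpus.
function splitRecords(records: RemediationRecord[]) {
  const fixes: RemediationRecord[] = []; // destined for fixes.jsonl
  const errors: RemediationRecord[] = []; // failed/escalated, destined for errors.jsonl
  const reportOnly: RemediationRecord[] = []; // kept out of the corpus; destination not specified here
  for (const r of records) {
    if (r.status === "verified") fixes.push(r);
    else if (r.status === "report-only") reportOnly.push(r);
    else errors.push(r);
  }
  return { fixes, errors, reportOnly };
}

// Count raw, duplicate, and effective examples among verified fixes.
function computeMetrics(fixes: RemediationRecord[]) {
  const seen: Record<string, true> = Object.create(null);
  let effectiveExamples = 0;
  for (const r of fixes) {
    if (seen[r.key]) continue; // repeat of an already counted key
    seen[r.key] = true;
    effectiveExamples += 1;
  }
  const rawExamples = fixes.length;
  return {
    rawExamples,
    duplicateExamples: rawExamples - effectiveExamples,
    effectiveExamples,
  };
}
```

With 58 verified rows of which 9 repeat an earlier key, this yields 49 effective examples, which is consistent with the live figures recorded in this handoff.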

## Erik Status

- Synced TIPLLM robot/training code to `/opt/tip`.
- Did not start crawler jobs.
- Did not enqueue robot waves.
- Did not restart PM2 services.
- The remote scraper TypeScript build passes after removing two stale, misplaced duplicate files that existed only on the remote:
  - `/opt/tip/packages/scraper/src/scrapers/scheduler.ts`
  - `/opt/tip/packages/scraper/src/vendor-discovery-crawler.ts`
- `tip-api` and `tip-scraper-daemon` are online.
- Shared Erik note from the same chat:
  - MAGATAMA dashboard/core were redeployed during compliance/training fixes.
  - TIP crawler policy remains unchanged: Erik is controller/light runner only, not heavy crawl execution host.

## Last Live Verification Snapshot

From 2026-04-29:

- Total transceivers: `13,546`
- Price verified: `7,250`
- Image verified: `7,025`
- Details verified: `6,243`
- Fully verified: `5,812`
- Last price observation: `2026-04-29 19:15:53 UTC`
- Last stock observation: `2026-04-29 19:15:56 UTC`
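
For quick comparison between snapshots, the counts above can be turned into coverage percentages. The helper below is a small sketch; the `snapshot` object and the `coverage` function are illustrative names, not part of the TIP codebase.

```typescript
// Counts copied from the 2026-04-29 snapshot above.
const snapshot = {
  total: 13546,
  priceVerified: 7250,
  imageVerified: 7025,
  detailsVerified: 6243,
  fullyVerified: 5812,
};

// Share of all transceivers, formatted with one decimal place.
function coverage(count: number, total: number): string {
  if (total <= 0) return "n/a"; // avoid dividing by zero on an empty catalog
  return ((count / total) * 100).toFixed(1) + "%";
}
```

Applied to this snapshot, price coverage is 53.5%, image 51.9%, details 46.1%, and fully verified 42.9%.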

## Safe Next Steps

1. Clone or pull Gitea `origin` on laptop/Claude Code.
2. Read this folder first.
3. For BlogLLM work, treat `fo-blog-v7` as an Adapter Bridge model backed by a PEFT adapter, not as a `~/.ollama` GGUF model.
4. Also read `llm-gateway/sync/CURRENT.md` when work touches shared Erik infrastructure, LLM routing, bridges, auth, TIPLLM, or crawler orchestration.
5. For TIP robot/crawler planning, use TIPLLM only. Do not route this lane through external AI providers.
6. When training pools or model stats look suspicious, prefer verified-only counts and check whether failed/escalated rows polluted the corpus.
7. For MAGATAMA-adjacent work, keep writing learnings back into the Gitea-backed pool and avoid training on report-only pseudo-fixes.
8. If testing robots, start with dry runs only:

```bash
npm run robots:verification -w packages/scraper -- --status
npm run robots:verification -w packages/scraper -- --tipllm-plan --limit=3
npm run robots:verification -w packages/scraper -- --enqueue=details-fast-lane --profile=erik-safe --dry-run
```

9. Only dispatch real crawl work after deciding the target host:
   - Erik: `erik-safe`, tiny batches only.
   - Pi: `pi-fetch`.
   - Proxmox: `proxmox-heavy`.
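
The host decision in step 9 can be made explicit before any command is built. The sketch below maps each host to its profile and assembles the `npm` arguments using only the flags shown in the dry-run commands above. The `Host` type, `PROFILE_BY_HOST`, and `buildEnqueueArgs` are hypothetical names, and it assumes `pi-fetch` and `proxmox-heavy` are accepted as `--profile` values in the same way as `erik-safe`.

```typescript
type Host = "erik" | "pi" | "proxmox";

// Profile names taken from the host list above.
const PROFILE_BY_HOST: Record<Host, string> = {
  erik: "erik-safe",
  pi: "pi-fetch",
  proxmox: "proxmox-heavy",
};

// Build the argument list for `npm`; nothing is executed here.
function buildEnqueueArgs(host: Host, lane: string, dryRun: boolean): string[] {
  const args = [
    "run",
    "robots:verification",
    "-w",
    "packages/scraper",
    "--",
    `--enqueue=${lane}`,
    `--profile=${PROFILE_BY_HOST[host]}`,
  ];
  if (dryRun) args.push("--dry-run"); // keep dry runs the default choice when testing
  return args;
}
```

Typing `host` as a closed union means an unknown host fails at compile time instead of silently dispatching with the wrong profile.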

## Dirty Worktree Note

There are existing uncommitted changes outside `sync/`. Some are Codex work from this session; others appear to be pre-existing or to come from earlier Claude/Codex work. Do not blindly revert them. Review `git status --short` before committing broader changes.

## Latest Sync Commits

- `6c42ca7 docs: add shared agent sync handoff`
- `8e7c5aa docs: link llm-gateway sync handoff`
- Pending after this update:
  - push the refreshed complete-chat sync including MAGATAMA training/compliance state.