transceiver-db/docs/TIP_SELFLEARNING_WORKFLOW.md

TIP Self-Learning Workflow

TIP now has two separate learning lanes:

  • TIP_LLM: research, crawler planning, vendor/market intelligence and data preparation.
  • Blog_LLM: FO_BlogLLM/founder content and practical technical blog generation.

Commands:

npm run learning-pool:build
npm run learning-pool:publish-hf

Dashboard/API:

  • GET /api/selflearning/status
  • POST /api/selflearning/build
  • POST /api/selflearning/publish-hf
  • POST /api/selflearning/train with a JSON body { "lane": "tip_llm"|"blog_llm", "provider": "runpod"|"local" }, where each field takes exactly one of the listed values
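
As an illustration of the train request shape, here is a minimal Python sketch that validates the two fields and builds the POST request without sending it. The base URL, the JSON content type and the absence of authentication are assumptions; the helper name build_train_request is hypothetical and not part of TIP.

```python
import json
import urllib.request

# Allowed values, taken from the request shape documented above.
LANES = ("tip_llm", "blog_llm")
PROVIDERS = ("runpod", "local")


def build_train_request(base_url: str, lane: str, provider: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/selflearning/train."""
    if lane not in LANES:
        raise ValueError(f"unknown lane: {lane!r}")
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    body = json.dumps({"lane": lane, "provider": provider}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/selflearning/train",
        data=body,
        # Assumption: the API expects a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical base URL; substitute the address of your TIP instance.
req = build_train_request("http://localhost:3000", "tip_llm", "runpod")
# To actually send it: urllib.request.urlopen(req)
```

Rejecting unknown lane or provider values on the client side keeps a typo from reaching the training backend.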

Secrets are read from environment variables or macOS Keychain, never from committed files:

  • RunPod: RUNPOD_API_KEY / TIP_RUNPOD_API_KEY, Keychain magatama.runpod.api / tip.runpod.api
  • Hugging Face: HF_TOKEN / HUGGINGFACE_TOKEN, Keychain magatama.huggingface.token / tip.huggingface.token
  • Endpoint: TIP_RUNPOD_ENDPOINT_ID or RUNPOD_ENDPOINT_ID
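
The lookup described above could be sketched as follows. This is not TIP's actual resolver: it assumes that environment variables are checked before the Keychain, that names are tried in the order listed, and that the Keychain names are service names of generic-password items readable with the macOS `security` CLI. The function names are hypothetical.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional, Sequence


def read_keychain(service: str) -> Optional[str]:
    """Read a generic password from the macOS Keychain, or None if unavailable."""
    try:
        result = subprocess.run(
            ["security", "find-generic-password", "-s", service, "-w"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        # The `security` CLI is missing, e.g. when not running on macOS.
        return None
    value = result.stdout.strip()
    return value if result.returncode == 0 and value else None


def resolve_secret(
    env_names: Sequence[str],
    keychain_services: Sequence[str],
    env: Optional[Mapping[str, str]] = None,
    keychain: Callable[[str], Optional[str]] = read_keychain,
) -> Optional[str]:
    """Return the first non-empty secret: environment first, then Keychain."""
    env = os.environ if env is None else env
    for name in env_names:
        value = env.get(name, "").strip()
        if value:
            return value
    for service in keychain_services:
        value = keychain(service)
        if value:
            return value
    return None


runpod_key = resolve_secret(
    ["RUNPOD_API_KEY", "TIP_RUNPOD_API_KEY"],
    ["magatama.runpod.api", "tip.runpod.api"],
)
```

Passing `env` and `keychain` as parameters makes the precedence easy to test without touching real credentials.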

Default private Hugging Face datasets:

  • renefichtmueller/tip-llm-sft
  • renefichtmueller/blog-llm-sft

Local training is enabled by setting TIP_LOCAL_TRAIN_COMMAND; the API appends the lane name to that command automatically.
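
A rough Python sketch of how the lane name might be appended, assuming it is passed as the final command-line argument and that the command is split with shell-style quoting. The helper name and the example command `python train.py --epochs 1` are hypothetical.

```python
import shlex
from typing import List, Mapping


def local_train_argv(lane: str, env: Mapping[str, str]) -> List[str]:
    """Return the argv for a local training run, with the lane appended last."""
    command = env.get("TIP_LOCAL_TRAIN_COMMAND", "").strip()
    if not command:
        raise RuntimeError("local training is disabled: TIP_LOCAL_TRAIN_COMMAND is not set")
    # Assumption: the lane name is appended as the final argument.
    return shlex.split(command) + [lane]


argv = local_train_argv("blog_llm", {"TIP_LOCAL_TRAIN_COMMAND": "python train.py --epochs 1"})
print(argv)  # ['python', 'train.py', '--epochs', '1', 'blog_llm']
```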

TIPLLM Robot Experience Pool

Crawler and verification robots must use TIPLLM only for planning/extraction feedback. Operational experience is written to the Gitea-backed TIP training pool:

  • Default local clone: /tmp/tip-training-data
  • Override: TIP_TRAINING_REPO=/path/to/tip-training-data
  • Gitea repo: rene/tip-training-data
  • SFT records: qa-pairs/robot-control-high.jsonl
  • Raw audit records: robot-experiences/YYYY-MM-DD.jsonl
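
To make the pool layout concrete, the following Python sketch resolves the clone location and appends one raw audit record to the dated JSONL file. The record fields are purely illustrative, since the real schema is not specified here, and both function names are hypothetical.

```python
import json
import os
from datetime import date
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_REPO = "/tmp/tip-training-data"  # default local clone named above


def pool_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the training pool path, honouring the TIP_TRAINING_REPO override."""
    env = os.environ if env is None else env
    return Path(env.get("TIP_TRAINING_REPO") or DEFAULT_REPO)


def append_experience(record: dict, day: date, root: Path) -> Path:
    """Append one record as a JSON line to robot-experiences/YYYY-MM-DD.jsonl."""
    path = root / "robot-experiences" / f"{day.isoformat()}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path
```

A caller would write something like `append_experience({"kind": "status"}, date.today(), pool_root())`; committing and pushing the change to the Gitea repository is left out of this sketch.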

Useful commands:

npm run robots:verification -w packages/scraper -- --status
npm run robots:verification -w packages/scraper -- --tipllm-plan --limit=5
npm run robots:verification -w packages/scraper -- --enqueue=details-fast-lane --profile=erik-safe --dry-run

Safety defaults:

  • erik-safe is the default profile; it caps work at 3 lightweight queues.
  • Playwright/discovery work belongs on Proxmox or Pi workers, not on Erik.
  • Every status snapshot, TIPLLM plan, dry-run plan, enqueue result and crawler result should become a TIPLLM training example.
  • learning-pool:build automatically imports Gitea pool SFT rows from qa-pairs/ into the tip_llm lane.
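
The import step in the last bullet can be pictured with a short Python sketch. It is not the implementation behind learning-pool:build: it assumes every *.jsonl file directly under qa-pairs/ holds one JSON object per line, and the `lane` field it adds to each row is a hypothetical way of tagging the target lane.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_pool_sft_rows(root: Path, lane: str = "tip_llm") -> List[Dict]:
    """Collect SFT rows from qa-pairs/*.jsonl and tag each with the target lane."""
    rows: List[Dict] = []
    for path in sorted((root / "qa-pairs").glob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue  # skip blank lines between records
                row = json.loads(line)
                row["lane"] = lane  # hypothetical field, for illustration only
                rows.append(row)
    return rows
```

Sorting the file list keeps the row order stable between builds, which makes diffs of the generated dataset easier to review.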