# TIP Selflearning Workflow

TIP now has two separate learning lanes:

- `TIP_LLM`: research, crawler planning, vendor/market intelligence and data preparation.
- `Blog_LLM`: FO_BlogLLM/founder content and practical technical blog generation.

Commands:

```bash
npm run learning-pool:build
npm run learning-pool:publish-hf
```

Dashboard/API:

- `GET /api/selflearning/status`
- `POST /api/selflearning/build`
- `POST /api/selflearning/publish-hf`
- `POST /api/selflearning/train` with `{ "lane": "tip_llm"|"blog_llm", "provider": "runpod"|"local" }`
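
As a minimal sketch of the train endpoint listed above, the helper below builds the JSON body and rejects lane or provider values outside the documented set. `train_payload` and `TIP_API_BASE` are hypothetical names introduced for illustration; the source does not state the host, port, or any authentication the API may require, so the `curl` call only runs when you supply a base URL yourself.

```bash
#!/usr/bin/env bash
# Sketch only: train_payload and TIP_API_BASE are illustrative assumptions,
# not names taken from the TIP codebase.

# Print the JSON body for POST /api/selflearning/train, or fail on bad input.
train_payload() {
  local lane="$1" provider="$2"
  case "$lane" in
    tip_llm|blog_llm) ;;
    *) echo "unknown lane: $lane" >&2; return 1 ;;
  esac
  case "$provider" in
    runpod|local) ;;
    *) echo "unknown provider: $provider" >&2; return 1 ;;
  esac
  printf '{ "lane": "%s", "provider": "%s" }\n' "$lane" "$provider"
}

# Send the request only when a base URL was provided (assumed variable).
if [ -n "${TIP_API_BASE:-}" ]; then
  curl -sS -X POST "$TIP_API_BASE/api/selflearning/train" \
    -H 'Content-Type: application/json' \
    -d "$(train_payload tip_llm runpod)"
fi
```
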

Secrets are read from environment variables or macOS Keychain, never from committed files:

- RunPod: `RUNPOD_API_KEY` / `TIP_RUNPOD_API_KEY`, Keychain `magatama.runpod.api` / `tip.runpod.api`
- Hugging Face: `HF_TOKEN` / `HUGGINGFACE_TOKEN`, Keychain `magatama.huggingface.token` / `tip.huggingface.token`
- Endpoint: `TIP_RUNPOD_ENDPOINT_ID` or `RUNPOD_ENDPOINT_ID`
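
The lookup described above might be sketched as follows for the RunPod key. Several details are assumptions rather than documented behavior: that environment variables win over Keychain, that `RUNPOD_API_KEY` is checked before `TIP_RUNPOD_API_KEY`, and that the Keychain names are generic-password service names. `resolve_runpod_key` is a hypothetical helper; `security find-generic-password -s <service> -w` is the standard macOS CLI call for printing a stored password.

```bash
#!/usr/bin/env bash
# Sketch only: lookup order and the meaning of the Keychain names are assumptions.

resolve_runpod_key() {
  # 1. Environment variables (assumed precedence: RUNPOD_API_KEY first).
  local value="${RUNPOD_API_KEY:-${TIP_RUNPOD_API_KEY:-}}"
  if [ -n "$value" ]; then
    printf '%s\n' "$value"
    return 0
  fi
  # 2. macOS Keychain, available only where the `security` CLI exists.
  if command -v security >/dev/null 2>&1; then
    local service
    for service in magatama.runpod.api tip.runpod.api; do
      if value="$(security find-generic-password -s "$service" -w 2>/dev/null)"; then
        printf '%s\n' "$value"
        return 0
      fi
    done
  fi
  echo "RunPod API key not found in environment or Keychain" >&2
  return 1
}
```
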

Default private Hugging Face datasets:

- `renefichtmueller/tip-llm-sft`
- `renefichtmueller/blog-llm-sft`

Local training is enabled by setting `TIP_LOCAL_TRAIN_COMMAND`; the API appends the lane name automatically.
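
To illustrate the local training hook, here is a rough sketch that treats the lane name as a final argument appended to the configured command. That interpretation is an assumption, since the source does not say how the API appends the lane. `run_local_training` is a hypothetical helper, and the `echo` command stands in for a real trainer script.

```bash
#!/usr/bin/env bash
# Sketch only: the placeholder command below stands in for a real trainer.
TIP_LOCAL_TRAIN_COMMAND='echo training lane:'

run_local_training() {
  local lane="$1"
  if [ -z "${TIP_LOCAL_TRAIN_COMMAND:-}" ]; then
    echo "local training disabled: TIP_LOCAL_TRAIN_COMMAND is not set" >&2
    return 1
  fi
  # Unquoted on purpose so the command can carry its own arguments;
  # the lane name is appended as the last argument (assumed behavior).
  # shellcheck disable=SC2086
  $TIP_LOCAL_TRAIN_COMMAND "$lane"
}

run_local_training tip_llm   # prints "training lane: tip_llm"
```
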
## TIPLLM Robot Experience Pool

Crawler and verification robots must use TIPLLM only for planning/extraction feedback. Operational experience is written to the Gitea-backed TIP training pool:

- Default local clone: `/tmp/tip-training-data`
- Override: `TIP_TRAINING_REPO=/path/to/tip-training-data`
- Gitea repo: `rene/tip-training-data`
- SFT records: `qa-pairs/robot-control-high.jsonl`
- Raw audit records: `robot-experiences/YYYY-MM-DD.jsonl`
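
The path layout above can be sketched as a few helpers that resolve the clone directory and append one raw audit record per line. The record fields (`kind`, `note`), the use of the UTC date for the file name, and the helper names are all assumptions made for illustration; the source does not define the record schema. The sketch does no JSON escaping, so values must not contain quotes or backslashes, and it does not commit or push to Gitea.

```bash
#!/usr/bin/env bash
# Sketch only: field names, UTC dating, and helper names are assumptions.

# Resolve the local clone, honoring the TIP_TRAINING_REPO override.
pool_dir() {
  printf '%s\n' "${TIP_TRAINING_REPO:-/tmp/tip-training-data}"
}

# Path of today's raw audit file: robot-experiences/YYYY-MM-DD.jsonl
audit_file() {
  printf '%s/robot-experiences/%s.jsonl\n' "$(pool_dir)" "$(date -u +%Y-%m-%d)"
}

# Append one JSON object as a single line (JSONL).
append_experience() {
  local kind="$1" note="$2" file
  file="$(audit_file)"
  mkdir -p "$(dirname "$file")"
  printf '{"kind":"%s","note":"%s"}\n' "$kind" "$note" >> "$file"
}
```
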

Useful commands:

```bash
npm run robots:verification -w packages/scraper -- --status
npm run robots:verification -w packages/scraper -- --tipllm-plan --limit=5
npm run robots:verification -w packages/scraper -- --enqueue=details-fast-lane --profile=erik-safe --dry-run
```

Safety defaults:

- `erik-safe` is the default profile; it caps work at 3 lightweight queues.
- Playwright/discovery work belongs on Proxmox or Pi workers, not on Erik.
- Every status snapshot, TIPLLM plan, dry-run plan, enqueue result, and crawler result should become a TIPLLM training example.
- `learning-pool:build` automatically imports Gitea pool SFT rows from `qa-pairs/` into the `tip_llm` lane.
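
Before running `learning-pool:build`, it can help to check how many SFT rows the local clone currently holds. The sketch below counts non-empty lines across `qa-pairs/*.jsonl`; this only approximates what the build imports, since the source does not say whether it reads every file or filters rows. `count_sft_rows` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch only: counts candidate rows; the real import may filter or deduplicate.

count_sft_rows() {
  local repo="${TIP_TRAINING_REPO:-/tmp/tip-training-data}"
  local total=0 file rows
  for file in "$repo"/qa-pairs/*.jsonl; do
    [ -f "$file" ] || continue            # glob matched nothing
    rows="$(grep -c . "$file" || true)"   # non-empty lines; grep exits 1 on zero
    total=$((total + rows))
  done
  printf '%s\n' "$total"
}
```
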