# TIP Crawlee Runtime

## Decision

TIP standardizes on Crawlee as the crawler runtime.

- Production TypeScript path: `packages/scraper` with `apify/crawlee` and Playwright.
- Optional Python worker path: `packages/crawlee-python` with `apify/crawlee-python`.

## TypeScript Core

The TypeScript scraper remains the canonical production path because TIP already uses it for DB writes, price observations, stock observations, image verification and detail verification.

Useful FS.com commands:

```bash
pnpm -C packages/scraper run scrape:fs:db-detail
pnpm -C packages/scraper run scrape:fs:url-discovery
```

Erik safety defaults:

- keep FS.com at browser concurrency `1`
- use bounded run caps
- treat no-text and max-retry URLs as retry/classification classes
- keep Crawlee storage isolated with `makeCrawleeConfig(...)`

## Python Worker

The Python worker is optional and should run first on Pi/Proxmox/residential nodes. It writes JSONL evidence and does not write directly into TIP DB.

Install:

```bash
cd packages/crawlee-python
python3 -m venv .venv
. .venv/bin/activate
python -m pip install -U pip
python -m pip install -e ".[beautifulsoup]"
```

Smoke:

```bash
python -m tip_crawlee_worker \
  --mode beautifulsoup \
  --url https://crawlee.dev \
  --out /tmp/tip-crawlee-python-smoke.jsonl \
  --max-requests 1
```

## Training Pool

Every crawler result, failure class, parser lesson and runtime safety lesson should be written to the TIPLLM training pool and synced through `sync/`.
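
A minimal sketch of how a result or failure could be appended as one JSONL record, the same evidence format the Python worker writes. The field names (`status`, `failure_class`, `retryable`, `lesson`, `observed_at`) and the helpers `make_record`, `append_jsonl` and `read_jsonl` are hypothetical and not taken from `tip_crawlee_worker`; only the `no-text` and `max-retry` classes come from the safety defaults above. Adjust the shape to whatever the training pool actually expects.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Failure classes treated as retry candidates (from the safety defaults).
# The exact string spellings are an assumption.
RETRY_CLASSES = {"no-text", "max-retry"}


def make_record(url, status, failure_class=None, lesson=None):
    """Build one evidence record. All field names here are illustrative."""
    return {
        "url": url,
        "status": status,
        "failure_class": failure_class,
        "retryable": failure_class in RETRY_CLASSES,
        "lesson": lesson,
        "observed_at": datetime.now(timezone.utc).isoformat(),
    }


def append_jsonl(path, record):
    """Append a record as a single JSON line, creating parent dirs if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read every non-empty line back as a dict."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Usage would look like `append_jsonl(out_path, make_record(url, "failed", failure_class="no-text"))`, where `out_path` is any writable file. Appending one line per event keeps the file safe to tail and to sync incrementally, and nothing in this sketch touches the TIP DB.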