sync: record immediate equivalence revalidation
parent 49f0871720
commit 7da78a999d
@@ -1,9 +1,68 @@
# Current TIP Sync State

Updated: 2026-05-09 11:59 UTC
Updated: 2026-05-09 12:16 UTC

## Newest Work

- Immediate full TIP equivalence revalidation on 2026-05-09:
  - operator requested that all open TIP validation be completed immediately and that all product matches be checked for true 1:1 equivalence
  - live preflight:
    - equivalence queue: `pending=0`, `approved=1986`, `auto_approved=32080`, `rejected=148367`, `due_research=0`
    - active matches scheduled for future 30-day recheck: `34066`
  - strict DB preflight over all active matches found:
    - recent-price gaps: `0`
    - hard technical mismatches: `0`
    - missing critical 1:1 evidence: `0`
    - hard criteria checked: form factor, speed, fiber type, reach ratio, primary wavelength and recent competitor price evidence
  - action:
    - marked all `34066` active `approved/auto_approved` equivalences as due immediately
    - queued `18` PgBoss jobs of the existing `maintenance:re-research-equivalences` type
    - used the existing DB-only TIP re-research worker; no browser crawler wave and no external AI
  - result:
    - all `18/18` jobs completed
    - `due_research=0`
    - `active_researched_today=34066`
    - no automated-research rejections in this immediate pass
    - final equivalence queue: `pending=0`, `approved=1986`, `auto_approved=32080`, `rejected=148367`
  - transceiver verification counters after the pass:
    - `competitor_verified=11470`
    - `price_verified=11557`
    - `image_verified=10711`
    - `details_verified=9929`
    - `fully_verified=9135`
    - total transceivers `17647`
  - TIP health after run:
    - status `healthy`
    - load status `ok`
    - memory used `13%`
    - API/DB connected
  - truth:
    - the manual equivalence queue is empty and all active matches have just been rechecked by deterministic 1:1 evidence rules
    - this does not mean every product row in TIP is complete; the largest product verification gaps remain vendor-specific crawler/enrichment work, especially ATGBICS, NADDOD, GAO Tek, Juniper/Cisco, Ascent/Eoptolink and other vendor/catalog rows

- Crawlee integration/binding on 2026-05-09:
  - operator asked to install, use and bind Crawlee/Crawlee-Python after priority evaluation
  - pushed TIP commits:
    - `60531b6 feat: add crawlee python worker integration`
    - `49f0871 chore: ignore crawlee python build artifacts`
  - TypeScript TIP core remains the production crawler core using `crawlee` and Playwright
  - added scraper scripts:
    - `pnpm -C packages/scraper scrape:fs:db-detail`
    - `pnpm -C packages/scraper scrape:fs:url-discovery`
  - added optional isolated Python worker:
    - `packages/crawlee-python/`
    - `scripts/setup-crawlee-python-worker.sh`
    - `docs/TIP_CRAWLEE_RUNTIME.md`
  - Python worker policy:
    - Crawlee-Python is for Pi/Proxmox/residential side workers and extraction experiments
    - writes JSONL evidence only
    - no direct DB writes
    - no replacement for the TypeScript TIP scraper core
  - smoke test:
    - installed `crawlee==1.6.3` into `/tmp/tip-crawlee-python-venv`
    - ran `tip_crawlee_worker` against `https://crawlee.dev`
    - JSONL evidence output succeeded

- Priority Crawlee evaluation + FS.com URL discovery on 2026-05-09:
  - operator asked whether these repos help:
    - `https://github.com/apify/crawlee`

@@ -0,0 +1,106 @@
# TIP Immediate Equivalence Revalidation + Crawlee Binding

Date: 2026-05-09
Actor: Codex

## Operator Request

The operator asked to immediately verify and validate all open TIP work and to check whether products really match 1:1. The operator also asked to install, use and bind Crawlee/Crawlee-Python, with all crawler/scraper/robot learning recorded for TIPLLM.

## Crawlee Binding

Pushed to Gitea:

- `60531b6 feat: add crawlee python worker integration`
- `49f0871 chore: ignore crawlee python build artifacts`

Added:

- `packages/crawlee-python/`
- `scripts/setup-crawlee-python-worker.sh`
- `docs/TIP_CRAWLEE_RUNTIME.md`
- scraper scripts:
  - `pnpm -C packages/scraper scrape:fs:db-detail`
  - `pnpm -C packages/scraper scrape:fs:url-discovery`

Policy:

- TypeScript Crawlee/Playwright remains the TIP production crawler core.
- Crawlee-Python is optional for Pi/Proxmox/residential workers and writes JSONL evidence only.
- Crawlee-Python does not write directly to TIP DB.
- No external AI was used.
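
The JSONL-only policy above could look roughly like the sketch below: the worker appends one JSON object per line to a file and never opens a database connection. The helper names `append_evidence` and `read_evidence` and the record fields (`url`, `vendor`, `fetched_at`, `fields`) are hypothetical; the real schema written by `tip_crawlee_worker` is not shown in this note.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_evidence(path: Path, url: str, vendor: str, fields: dict) -> dict:
    """Append one evidence record as a single JSON line (no DB access)."""
    record = {
        "url": url,
        "vendor": vendor,
        # Timestamp in UTC so records from different side workers sort consistently.
        "fetched_at": datetime.now(timezone.utc).isoformat(),
        "fields": fields,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False, sort_keys=True) + "\n")
    return record


def read_evidence(path: Path) -> list[dict]:
    """Load every non-empty line of a JSONL evidence file."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Appending line by line keeps partially written runs readable, and a separate importer on the TIP side can decide which evidence is allowed into the DB.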

Smoke test:

- Installed `crawlee==1.6.3` in `/tmp/tip-crawlee-python-venv`.
- Ran `tip_crawlee_worker` against `https://crawlee.dev`.
- JSONL evidence output succeeded.

## Equivalence Revalidation

Preflight:

- `pending=0`
- `approved=1986`
- `auto_approved=32080`
- `rejected=148367`
- `due_research=0`
- active approved/auto-approved matches: `34066`
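
As a quick cross-check of the preflight numbers, the active count should equal `approved` plus `auto_approved`. The sketch below only restates the counters listed above; the function name `active_matches` is illustrative and is not a TIP API.

```python
# Preflight counters copied from the list above.
preflight = {
    "pending": 0,
    "approved": 1986,
    "auto_approved": 32080,
    "rejected": 148367,
    "due_research": 0,
}


def active_matches(queue: dict) -> int:
    """Active matches are the approved plus the auto-approved equivalences."""
    return queue["approved"] + queue["auto_approved"]


assert active_matches(preflight) == 34066
```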

Strict DB preflight over all active matches:

- recent-price gaps: `0`
- hard technical mismatches: `0`
- missing critical 1:1 evidence: `0`

Hard criteria checked:

- recent competitor price evidence
- form factor
- speed
- fiber type
- reach ratio
- primary wavelength
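
A deterministic check over the criteria listed above might be sketched as follows. This is not the TIP worker's actual rule set: the `Spec` fields, the `max_reach_ratio` of 1.25 and the 30-day price window are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Spec:
    """Hypothetical subset of transceiver attributes used for 1:1 matching."""
    form_factor: str
    speed_gbps: float
    fiber_type: str
    reach_m: float
    wavelength_nm: int


def hard_mismatches(
    ours: Spec,
    theirs: Spec,
    last_price_seen: Optional[date],
    today: date,
    max_reach_ratio: float = 1.25,   # assumed tolerance
    max_price_age_days: int = 30,    # assumed recency window
) -> List[str]:
    """Return the names of all hard criteria that fail; empty means 1:1."""
    problems: List[str] = []
    if ours.form_factor != theirs.form_factor:
        problems.append("form factor")
    if ours.speed_gbps != theirs.speed_gbps:
        problems.append("speed")
    if ours.fiber_type != theirs.fiber_type:
        problems.append("fiber type")
    shorter, longer = sorted((ours.reach_m, theirs.reach_m))
    if shorter <= 0 or longer / shorter > max_reach_ratio:
        problems.append("reach ratio")
    if ours.wavelength_nm != theirs.wavelength_nm:
        problems.append("primary wavelength")
    if last_price_seen is None or (today - last_price_seen).days > max_price_age_days:
        problems.append("recent competitor price evidence")
    return problems
```

Returning the list of failed criteria, instead of a single boolean, keeps each rejection explainable to the next agent or operator.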

Action:

- Marked all `34066` active `approved/auto_approved` equivalences as immediately due.
- Queued `18` PgBoss jobs for `maintenance:re-research-equivalences`.
- Used the existing DB-only TIP research worker.
- No browser crawler wave was started.
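
How `34066` due equivalences turn into `18` jobs is not stated in this note. One plausible reading is fixed-size batching; the sketch below assumes a batch size of 2000, which is a guess that happens to reproduce the job count (any size from 1893 to 2003 would also give 18). It does not call PgBoss; it only shows the chunking arithmetic.

```python
from typing import Iterator, List, Sequence


def chunk_ids(ids: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive batches of at most batch_size ids."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        yield list(ids[start:start + batch_size])


# Stand-in ids for the 34066 active equivalences; real ids come from the DB.
due_ids = list(range(34066))
batches = list(chunk_ids(due_ids, batch_size=2000))  # batch size is an assumption
assert len(batches) == 18
```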

Result:

- `18/18` jobs completed.
- `due_research=0`
- `active_researched_today=34066`
- no automated-research rejections in this immediate pass
- final queue:
  - `pending=0`
  - `approved=1986`
  - `auto_approved=32080`
  - `rejected=148367`

Final product verification counters:

- `competitor_verified=11470`
- `price_verified=11557`
- `image_verified=10711`
- `details_verified=9929`
- `fully_verified=9135`
- total transceivers: `17647`
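
To put the counters above in proportion, the sketch below converts them into coverage percentages of the `17647` transceivers. The numbers are copied from the list; the helper name `coverage` is illustrative.

```python
# Verification counters copied from the list above.
counters = {
    "competitor_verified": 11470,
    "price_verified": 11557,
    "image_verified": 10711,
    "details_verified": 9929,
    "fully_verified": 9135,
}
TOTAL_TRANSCEIVERS = 17647


def coverage(counts: dict, total: int) -> dict:
    """Percentage of all transceivers covered by each counter, to one decimal."""
    return {name: round(100 * value / total, 1) for name, value in counts.items()}


pct = coverage(counters, TOTAL_TRANSCEIVERS)
not_fully_verified = TOTAL_TRANSCEIVERS - counters["fully_verified"]
```

With these figures, `fully_verified` covers about 51.8% of rows, leaving `8512` transceivers that are not yet fully verified.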

TIP health after run:

- status: `healthy`
- load status: `ok`
- memory used: `13%`
- API/DB connected

## Truth For Next Agent

The manual equivalence queue is empty and all active equivalence matches have just been rechecked by deterministic 1:1 rules.

This does not mean every product row in TIP is fully complete. The remaining product verification gaps are vendor-specific crawler/enrichment work. The largest gaps lie outside the already-focused Flexoptix and FS.com passes, especially ATGBICS, NADDOD, GAO Tek, Juniper/Cisco, Ascent Optics, Eoptolink and other vendor/catalog rows.

Do not start a broad browser crawler wave on Erik. Continue with vendor-targeted, low-concurrency jobs, or move heavier discovery to Pi/Proxmox workers.