TIP Equivalence Automated Research
Date: 2026-05-09
Goal
Remove manual equivalence validation as a required workflow for TIP product verification. Low-confidence matches should be researched and either confirmed or rejected automatically.
Findings
- The dashboard had a large `Approved + Re-Research` backlog.
- `approve-all` was marking low-confidence rows approved, then setting `re_research_due_at`.
- The re-research worker only checked whether the competitor still had a recent price; it did not re-check technical equivalence quality.
- Many low-confidence rows were objectively bad matches:
  - reach mismatches
  - wavelength mismatches
  - missing reach evidence
  - fiber mismatches
Code Changes
- `packages/api/src/routes/review.ts`
  - `approve-all` now approves only confidence >= `0.73`.
  - Weak rows stay pending and get queued for automated research.
  - `needs_research` includes pending research rows.
  - Added `POST /api/review/run-research`.
- `packages/scraper/src/scheduler.ts`
  - Added deterministic equivalence evaluator.
  - Confirms matches only when there is:
    - recent competitor price
    - matching form factor
    - matching speed
    - matching fiber type
    - matching wavelength
    - compatible reach
    - confidence >= `0.73`
  - Rejects stale, incomplete, contradictory, or low-confidence matches automatically.
  - Confirmed matches get a 30-day recheck.
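
The two changes above can be sketched roughly as follows. This is a minimal illustration, not the code in `review.ts` or `scheduler.ts`: every type, field, and function name is hypothetical, and the price-age limit, the reach tolerance, and the order of the checks are assumptions. Only the `0.73` threshold, the list of conditions, and the 30-day recheck come from this note.

```typescript
// Hypothetical shapes; field names are illustrative, not taken from the repo.
type RejectReason =
  | "stale_price" | "form_factor_mismatch" | "speed_mismatch" | "fiber_mismatch"
  | "wavelength_mismatch" | "missing_reach_evidence" | "reach_mismatch" | "low_confidence";

interface TransceiverSpec {
  formFactor: string;
  speedGbps: number;
  fiberType: string;
  wavelengthNm: number;
  reachMeters?: number; // undefined when no reach evidence was crawled
}

interface MatchRow {
  ours: TransceiverSpec;
  competitor: TransceiverSpec;
  confidence: number;     // 0..1
  lastPriceSeenAt?: Date; // latest competitor price observation, if any
}

type Verdict =
  | { status: "confirmed"; recheckAt: Date }
  | { status: "rejected"; reason: RejectReason };

const MIN_CONFIDENCE = 0.73;   // from this note
const RECHECK_DAYS = 30;       // from this note
const MAX_PRICE_AGE_DAYS = 30; // assumption: what counts as a "recent" price
const REACH_TOLERANCE = 0.1;   // assumption: reach within 10% counts as "compatible"
const DAY_MS = 24 * 60 * 60 * 1000;

// Deterministic: the same row and the same `now` always yield the same verdict.
function evaluateEquivalence(row: MatchRow, now: Date): Verdict {
  const reject = (reason: RejectReason): Verdict => ({ status: "rejected", reason });
  const { ours, competitor } = row;

  if (row.lastPriceSeenAt === undefined ||
      now.getTime() - row.lastPriceSeenAt.getTime() > MAX_PRICE_AGE_DAYS * DAY_MS) {
    return reject("stale_price");
  }
  if (ours.formFactor !== competitor.formFactor) return reject("form_factor_mismatch");
  if (ours.speedGbps !== competitor.speedGbps) return reject("speed_mismatch");
  if (ours.fiberType !== competitor.fiberType) return reject("fiber_mismatch");
  if (ours.wavelengthNm !== competitor.wavelengthNm) return reject("wavelength_mismatch");
  if (ours.reachMeters === undefined || competitor.reachMeters === undefined) {
    return reject("missing_reach_evidence");
  }
  const reachGap = Math.abs(ours.reachMeters - competitor.reachMeters);
  if (reachGap > REACH_TOLERANCE * Math.max(ours.reachMeters, competitor.reachMeters)) {
    return reject("reach_mismatch");
  }
  if (row.confidence < MIN_CONFIDENCE) return reject("low_confidence");

  return { status: "confirmed", recheckAt: new Date(now.getTime() + RECHECK_DAYS * DAY_MS) };
}

// approve-all: approve only rows at or above the threshold; everything else
// stays pending and is queued for automated research.
interface ReviewRow { id: number; confidence: number; }

function partitionApproveAll(rows: ReviewRow[]): { approve: ReviewRow[]; research: ReviewRow[] } {
  return {
    approve: rows.filter((r) => r.confidence >= MIN_CONFIDENCE),
    research: rows.filter((r) => !(r.confidence >= MIN_CONFIDENCE)),
  };
}
```

Running the spec checks before the confidence check means a row is rejected with the most specific reason available, and `low_confidence` is used only when nothing else is wrong with the match.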
Deployment
- Synced code to Erik `/opt/tip`.
- Built on Erik:
  - `pnpm -C packages/api build`
  - `pnpm -C packages/scraper build`
- Restarted:
  - `tip-api`
  - `tip-scraper-daemon`
- Both were online after restart.
Live Data Cleanup
No heavy crawler wave was started. Cleanup used existing crawled specs and price observations.
Processed pending + due re-research:
- total: 144103
- rejected fiber mismatch: 958
- rejected reach mismatch: 82128
- rejected missing reach evidence: 31151
- rejected wavelength mismatch: 29865
- rejected low confidence: 1
Processed old approved rows:
- confirmed: 1986
- rejected fiber mismatch: 184
- rejected reach mismatch: 1704
- rejected missing reach evidence: 1117
- rejected wavelength mismatch: 993
- rejected low confidence: 2
Processed old auto-approved rows:
- confirmed: 32080
- rejected reach mismatch: 260
Final State
- pending: 0
- approved: 1986
- auto_approved: 32080
- rejected: 148367
- due re-research now: 0
- scheduled 30-day rechecks: 34066
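
As a sanity check, the tallies in this note can be added up. The sketch below uses only the counts reported here; the variable names are made up for illustration.

```typescript
// Counts copied from the cleanup tallies in this note.
const sum = (values: number[]): number => values.reduce((a, b) => a + b, 0);

// Pending + due re-research pass: fiber, reach, missing reach, wavelength, low confidence.
const pendingRejected = sum([958, 82128, 31151, 29865, 1]);

// Old approved pass: same rejection buckets, plus confirmed rows.
const approvedConfirmed = 1986;
const approvedRejected = sum([184, 1704, 1117, 993, 2]);

// Old auto-approved pass.
const autoConfirmed = 32080;
const autoRejected = 260;

const scheduledRechecks = approvedConfirmed + autoConfirmed;
const rejectedInCleanup = pendingRejected + approvedRejected + autoRejected;

console.log({ pendingRejected, scheduledRechecks, rejectedInCleanup });
```

The pending pass buckets add up to the reported total of 144103, and the confirmed rows add up to the 34066 scheduled rechecks. The three passes account for 148363 rejections, 4 fewer than the final rejected count of 148367. The note does not say where the remaining 4 come from; rows already rejected before the cleanup would be one explanation.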
Product verification counters after reconcile:
- competitor_verified: 11137
- fully_verified: 290
- price_verified: 11549
- image_verified: 10629
- details_verified: 9538
Next Work
Products rejected for missing reach/details should be enriched by targeted vendor crawlers. Keep Erik light; use Proxmox/Pi workers for heavier crawl waves. TIPLLM-only policy remains active for crawler/robot research and learning records.