Rene Fichtmueller b43bdd3060 feat: TIP Phase 0+1 — monorepo, DB schema, API, scraper engine
Phase 0 - Foundation:
- Restructure into npm workspace monorepo (packages/core, api, scraper)
- PostgreSQL 17 + TimescaleDB schema (15 tables incl. hypertables)
- Docker Compose for local dev (PostgreSQL on 5433 + Qdrant)
- Express 5 API on port 3200 with 6 routes
- Seed script to migrate 159 transceivers + 42 standards from npm package
- Erik server setup script + PM2 ecosystem config

Phase 1 - Scraper Engine:
- Crawlee + Playwright framework with pg-boss scheduler
- FS.com scraper (PlaywrightCrawler, anti-bot workaround)
- Optcore.net scraper (WP REST API enumeration + PlaywrightCrawler)
  - Uses /wp-json/wp/v2/product to get 2000+ product URLs
  - Playwright renders individual product pages for price extraction
- Cisco TMG Matrix scraper (compatibility data)
- News RSS aggregator (optics.org, SPIE, Network World, Nature Photonics)
  - Keyword relevance scoring for transceiver/fiber topics
  - xml2js with malformed XML sanitization
- SHA-256 content hashing for change detection (skip unchanged records)
- pg-boss v10 with explicit queue creation before scheduling
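
A minimal sketch of the SHA-256 change-detection idea from the list above, assuming a flat record shape. `ScrapedRecord`, `canonicalize`, `contentHash`, and `hasChanged` are hypothetical names chosen for illustration; the real scraper schema and storage layer are not shown in this commit, so an in-memory `Map` stands in for the database here.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; the actual scraper schema is not part of this commit view.
interface ScrapedRecord {
  vendor: string;
  partNumber: string;
  priceUsd: number | null;
}

// Serialize with sorted keys so that property order does not affect the hash.
// An array replacer makes JSON.stringify emit the listed keys in array order
// (sufficient for flat records like this one).
function canonicalize(record: ScrapedRecord): string {
  const sortedKeys = Object.keys(record).sort();
  return JSON.stringify(record, sortedKeys);
}

// SHA-256 digest of the canonical form, as a 64-character hex string.
function contentHash(record: ScrapedRecord): string {
  return createHash("sha256").update(canonicalize(record)).digest("hex");
}

// Returns true when the record is new or its content changed, and stores the
// new hash. Returns false for unchanged records so the caller can skip the write.
function hasChanged(seen: Map<string, string>, record: ScrapedRecord): boolean {
  const key = `${record.vendor}:${record.partNumber}`;
  const hash = contentHash(record);
  if (seen.get(key) === hash) {
    return false;
  }
  seen.set(key, hash);
  return true;
}
```

In a real pipeline the stored hash would presumably live in a database column next to the record, so an unchanged scrape costs one comparison instead of a full row update.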
2026-03-27 16:27:31 +13:00

{
  "name": "@tip/core",
  "version": "1.0.0",
  "description": "Core optical transceiver database. 159 products, 42 IEEE/MSA standards, 16 form factors, 9 speed tiers.",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",
  "scripts": {
    "build": "tsc",
    "prepublishOnly": "npm run build"
  },
  "license": "MIT",
  "keywords": [
    "transceiver",
    "optics",
    "sfp",
    "qsfp",
    "networking",
    "fiber",
    "ieee",
    "telecom",
    "osfp",
    "qsfp-dd",
    "optical",
    "datacenter",
    "100g",
    "400g",
    "800g"
  ],
  "files": [
    "dist",
    "LICENSE",
    "README.md"
  ],
  "repository": {
    "type": "git",
    "url": "https://github.com/renefichtmueller/transceiver-db"
  },
  "author": "Rene Fichtmueller",
  "engines": {
    "node": ">=14"
  },
  "devDependencies": {
    "typescript": "^5.9.3"
  }
}