Phase 0 - Foundation:
- Restructure into npm workspace monorepo (packages/core, api, scraper)
- PostgreSQL 17 + TimescaleDB schema (15 tables incl. hypertables)
- Docker Compose for local dev (PostgreSQL on 5433 + Qdrant)
- Express 5 API on port 3200 with 6 routes
- Seed script to migrate 159 transceivers + 42 standards from npm package
- Erik server setup script + PM2 ecosystem config

Phase 1 - Scraper Engine:
- Crawlee + Playwright framework with pg-boss scheduler
- FS.com scraper (PlaywrightCrawler, anti-bot workaround)
- Optcore.net scraper (WP REST API enumeration + PlaywrightCrawler)
  - Uses /wp-json/wp/v2/product to get 2000+ product URLs
  - Playwright renders individual product pages for price extraction
- Cisco TMG Matrix scraper (compatibility data)
- News RSS aggregator (optics.org, SPIE, Network World, Nature Photonics)
  - Keyword relevance scoring for transceiver/fiber topics
  - xml2js with malformed XML sanitization
- SHA-256 content hashing for change detection (skip unchanged records)
- pg-boss v10 with explicit queue creation before scheduling
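
The SHA-256 change detection listed under Phase 1 could look roughly like the sketch below. This is an illustration only, not the code from this commit: the `ScrapedRecord` shape and the `canonicalize`, `contentHash`, and `hasChanged` helpers are hypothetical names, and the assumption is that a hex digest is stored alongside each record and compared on the next scrape.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; the real scraper schema is not shown in this commit.
interface ScrapedRecord {
  sku: string;
  price: number;
  currency: string;
}

// Serialize as sorted [key, value] pairs so that property order
// does not influence the resulting hash.
function canonicalize(record: ScrapedRecord): string {
  const keys = Object.keys(record).sort() as (keyof ScrapedRecord)[];
  return JSON.stringify(keys.map((key) => [key, record[key]]));
}

// SHA-256 digest of the canonical form, as a 64-character hex string.
function contentHash(record: ScrapedRecord): string {
  return createHash("sha256").update(canonicalize(record), "utf8").digest("hex");
}

// True when the record is new (no stored hash) or its content differs
// from what was stored on the previous run.
function hasChanged(record: ScrapedRecord, storedHash: string | undefined): boolean {
  return contentHash(record) !== storedHash;
}
```

A scraper following this approach would call `hasChanged` before writing and skip the database update when it returns `false`. Sorting the keys first is a design choice that keeps the hash stable when a site or parser emits the same fields in a different order.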
46 lines
851 B
JSON
{
  "name": "@tip/core",
  "version": "1.0.0",
  "description": "Core optical transceiver database. 159 products, 42 IEEE/MSA standards, 16 form factors, 9 speed tiers.",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",
  "scripts": {
    "build": "tsc",
    "prepublishOnly": "npm run build"
  },
  "license": "MIT",
  "keywords": [
    "transceiver",
    "optics",
    "sfp",
    "qsfp",
    "networking",
    "fiber",
    "ieee",
    "telecom",
    "osfp",
    "qsfp-dd",
    "optical",
    "datacenter",
    "100g",
    "400g",
    "800g"
  ],
  "files": [
    "dist",
    "LICENSE",
    "README.md"
  ],
  "repository": {
    "type": "git",
    "url": "https://github.com/renefichtmueller/transceiver-db"
  },
  "author": "Rene Fichtmueller",
  "engines": {
    "node": ">=14"
  },
  "devDependencies": {
    "typescript": "^5.9.3"
  }
}