transceiver-db/docker-compose.yml
Rene Fichtmueller e9fb50a248 feat: TIP Phase 0+1 — monorepo, DB schema, API, scraper engine
Phase 0 - Foundation:
- Restructure into npm workspace monorepo (packages/core, api, scraper)
- PostgreSQL 17 + TimescaleDB schema (15 tables incl. hypertables)
- Docker Compose for local dev (PostgreSQL on 5433 + Qdrant)
- Express 5 API on port 3200 with 6 routes
- Seed script to migrate 159 transceivers + 42 standards from npm package
- Erik server setup script + PM2 ecosystem config

Phase 1 - Scraper Engine:
- Crawlee + Playwright framework with pg-boss scheduler
- FS.com scraper (PlaywrightCrawler, anti-bot workaround)
- Optcore.net scraper (WP REST API enumeration + PlaywrightCrawler)
  - Uses /wp-json/wp/v2/product to get 2000+ product URLs
  - Playwright renders individual product pages for price extraction
- Cisco TMG Matrix scraper (compatibility data)
- News RSS aggregator (optics.org, SPIE, Network World, Nature Photonics)
  - Keyword relevance scoring for transceiver/fiber topics
  - xml2js with malformed XML sanitization
- SHA-256 content hashing for change detection (skip unchanged records)
- pg-boss v10 with explicit queue creation before scheduling
2026-03-27 16:27:31 +13:00
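
The commit message mentions SHA-256 content hashing to skip unchanged records. A minimal sketch of that idea follows, assuming a Node.js runtime. The `ScrapedRecord` shape and the names `contentHash` and `hasChanged` are hypothetical and are not taken from the repository. The in-memory `Map` stands in for whatever storage the scraper actually uses.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; the real scraper schema is not shown in this commit.
interface ScrapedRecord {
  url: string;
  title: string;
  price: string | null;
}

// Serialize with sorted keys so that property order does not affect the hash,
// then hash the canonical string with SHA-256.
function contentHash(record: ScrapedRecord): string {
  const keys = Object.keys(record).sort();
  const canonical = JSON.stringify(record, keys);
  return createHash("sha256").update(canonical).digest("hex");
}

// Returns true when the record is new or its content differs from the stored hash.
// The Map (keyed by URL) is an assumption standing in for a database column.
function hasChanged(record: ScrapedRecord, knownHashes: Map<string, string>): boolean {
  const hash = contentHash(record);
  if (knownHashes.get(record.url) === hash) {
    return false; // unchanged: the caller can skip the write
  }
  knownHashes.set(record.url, hash);
  return true;
}
```

A caller would run `hasChanged` on each scraped record and write to the database only when it returns `true`.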

services:
  postgres:
    image: timescale/timescaledb:latest-pg17
    container_name: tip-postgres
    environment:
      POSTGRES_DB: transceiver_db
      POSTGRES_USER: tip
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-tip_dev_2026}
    ports:
      - "5433:5432"
    volumes:
      - tip_pgdata:/var/lib/postgresql/data
      - ./sql:/docker-entrypoint-initdb.d
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U tip -d transceiver_db"]
      interval: 5s
      timeout: 5s
      retries: 5

  qdrant:
    image: qdrant/qdrant:latest
    container_name: tip-qdrant
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - tip_qdrant:/qdrant/storage
    environment:
      QDRANT__SERVICE__GRPC_PORT: 6334

volumes:
  tip_pgdata:
  tip_qdrant:
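
For local development, client code on the host needs the connection settings from the `postgres` service in this file. The sketch below shows one way to derive a connection URL from them. The function name `postgresUrl` and the `Env` type are illustrative and not part of the repository. The host `localhost` assumes the client runs on the same machine as Docker.

```typescript
// Environment passed in explicitly so the function stays pure and easy to test.
type Env = Record<string, string | undefined>;

// Builds a PostgreSQL connection URL matching the compose file:
// user "tip", database "transceiver_db", host port 5433 mapped to container port 5432.
function postgresUrl(env: Env = {}): string {
  const user = "tip";
  const database = "transceiver_db";
  const hostPort = 5433;
  // Mirrors ${POSTGRES_PASSWORD:-tip_dev_2026}: the default applies when the
  // variable is unset or empty, which is what `||` does here.
  const password = env.POSTGRES_PASSWORD || "tip_dev_2026";
  // Percent-encode the password so characters such as "@" or ":" stay valid in a URL.
  return `postgres://${user}:${encodeURIComponent(password)}@localhost:${hostPort}/${database}`;
}
```

In a Node.js process this could be called as `postgresUrl(process.env)`. Note that the official PostgreSQL image applies `POSTGRES_PASSWORD` only when the data directory is first initialized, so changing the variable later does not change the password stored in an existing `tip_pgdata` volume.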