transceiver-db/run-fs-scraper-mac.sh
Rene Fichtmueller f8809d999f feat(scraper+api): warehouse stock data pipeline — FS.com v2, SmartOptics v2, Stock API
Scraper changes:
- fs-com.ts v2: Playwright stealth patches + www.fs.com/de/ URL fix (de.fs.com DNS NXDOMAIN).
  Extracts DE-Lager (DE warehouse), Global-Lager (global warehouse), Nachlieferung (backorder),
  units_sold, compatible_brands, price_net.
  Mac-side runner (run-fs-scraper-mac.sh) scrapes from a residential IP and reaches the DB via SSH tunnel.
  Fast-fail connectivity check for datacenter IPs, which Cloudflare blocks.
- smartoptics.ts v2: WooCommerce REST API fallback + 8 catalog categories + relative URL fix.
  Previously found only 8 products; the multi-category crawl now discovers 18+.

DB layer:
- db.ts: add upsertStockObservation() — writes 10 new stock_observations columns
  (warehouse_de_qty, warehouse_global_qty, backorder_qty, units_sold, compatible_brands,
  price_net, product_url, delivery dates) with dedup check.

API:
- routes/stock.ts: GET /api/stock, /api/stock/summary, /api/stock/:id
  Warehouse breakdowns per transceiver/vendor with top-sellers and vendor summary.
- routes/review.ts: equivalence review queue (approve/reject/bulk-approve).
- index.ts: register /api/stock and /api/review routes.

Dashboard:
- index.html: 🏭 Stock tab with stat cards (DE-Lager, Global-Lager, Nachlieferung totals),
  top-sellers table, vendor breakdown, recently-restocked events, part-number lookup.

SQL migrations:
- 034: blog-review-tag, 035: price-observations is_anomalous, 036: transceiver-equivalences.
2026-04-17 10:45:59 +02:00
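
The stock routes named in the commit message could be exercised from a shell roughly as follows. This is a minimal sketch: the base URL `http://localhost:3000`, the `STOCK_API_BASE` variable, and the helper names `stock_url` and `stock_get` are assumptions for illustration. The source does not specify the response format, so the body is printed as received.

```shell
#!/bin/bash
# Hypothetical base URL; the commit message does not say where the API listens.
STOCK_API_BASE="${STOCK_API_BASE:-http://localhost:3000}"

# Build the URL for one of the three routes from the commit message:
#   stock_url           -> /api/stock
#   stock_url summary   -> /api/stock/summary
#   stock_url 42        -> /api/stock/42   (the :id form; 42 is a made-up id)
stock_url() {
    local suffix="${1:-}"
    if [ -n "$suffix" ]; then
        printf '%s/api/stock/%s\n' "$STOCK_API_BASE" "$suffix"
    else
        printf '%s/api/stock\n' "$STOCK_API_BASE"
    fi
}

# Fetch a route and print the raw body. --fail makes curl exit non-zero
# on HTTP 4xx/5xx responses instead of printing the error page.
stock_get() {
    curl --silent --show-error --fail "$(stock_url "${1:-}")"
}
```

With the API running, `stock_get summary` would fetch the summary route and `stock_get 42` a single record.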

67 lines
2.2 KiB
Bash
Executable File

#!/bin/bash
# FS.com Scraper — Mac-side runner
# Runs from this Mac (residential IP) so FS.com isn't blocked.
# Opens SSH tunnel to Erik's DB → runs scraper → closes tunnel.
#
# Schedule: launchd at 02:00, 10:00, 18:00 daily
# Log: ~/Library/Logs/tip-fs-scraper.log
set -euo pipefail
LOG="$HOME/Library/Logs/tip-fs-scraper.log"
REPO="/Users/renefichtmueller/Desktop/Claude Code/github-repos/transceiver-db"
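NODE="/opt/homebrew/bin/node"
NPX="/opt/homebrew/bin/npx"
# launchd starts jobs with a minimal PATH, so make Homebrew's node visible to npx.
export PATH="$(dirname "$NODE"):$PATH"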
TUNNEL_PID_FILE="/tmp/tip-db-tunnel.pid"
DB_LOCAL_PORT=5433
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" | tee -a "$LOG"; }
# ── Open SSH tunnel if not already running ────────────────────────────────────
open_tunnel() {
    local pid
    if [ -f "$TUNNEL_PID_FILE" ]; then
        pid=$(cat "$TUNNEL_PID_FILE")
        if kill -0 "$pid" 2>/dev/null; then
            log "Tunnel already running (PID $pid)"
            return 0
        fi
    fi
    log "Opening SSH tunnel → Erik PostgreSQL on port $DB_LOCAL_PORT"
    ssh -N -f -L "${DB_LOCAL_PORT}:localhost:${DB_LOCAL_PORT}" erik
    # -f forks ssh into the background, so $! is not set. Look the PID up with
    # pgrep and record it so the PID-file check above has something to read.
    pgrep -f "ssh -N -f -L ${DB_LOCAL_PORT}:localhost:${DB_LOCAL_PORT}" | head -n 1 > "$TUNNEL_PID_FILE" || true
    log "Tunnel opened"
    sleep 2 # Give the tunnel a moment to establish
}
close_tunnel() {
    log "Closing SSH tunnel…"
    pkill -f "ssh -N -f -L ${DB_LOCAL_PORT}:localhost:${DB_LOCAL_PORT}" 2>/dev/null || true
    rm -f "$TUNNEL_PID_FILE"
}
# ── Main ──────────────────────────────────────────────────────────────────────
mkdir -p "$(dirname "$LOG")"
log "=== FS.com Mac Scraper starting ==="
# Only close tunnel if we opened it (not if one was already running)
OPENED_TUNNEL=0
if ! pgrep -f "ssh -N.*${DB_LOCAL_PORT}:localhost" >/dev/null 2>&1; then
    open_tunnel
    OPENED_TUNNEL=1
    trap close_tunnel EXIT
fi
cd "$REPO"
export POSTGRES_HOST=localhost
export POSTGRES_PORT=$DB_LOCAL_PORT
export POSTGRES_DB=transceiver_db
export POSTGRES_USER=tip
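# Prefer a password supplied via the environment; the committed value is only a fallback.
export POSTGRES_PASSWORD="${POSTGRES_PASSWORD:-tip_prod_2026}"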
export NODE_ENV=production
log "Running fs-com scraper via tsx…"
"$NPX" tsx packages/scraper/src/scrapers/fs-com.ts 2>&1 | tee -a "$LOG"
log "=== FS.com Mac Scraper complete ==="
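
The tunnel handling in the script above relies on a fixed `sleep 2` and on `pkill` pattern matching. One possible alternative, sketched here under assumptions, uses an OpenSSH control socket to open and close the tunnel and polls the forwarded port until it accepts connections. The socket path `/tmp/tip-db-tunnel.sock` and the helper names `tunnel_open`, `tunnel_close`, and `wait_for_port` are hypothetical; the host alias `erik` and port 5433 are taken from the script.

```shell
#!/bin/bash
CTL_SOCKET="/tmp/tip-db-tunnel.sock"   # hypothetical control-socket path
DB_LOCAL_PORT=5433
SSH_HOST="erik"

# Start a backgrounded master connection with the port forward attached.
# ExitOnForwardFailure makes ssh fail if the local port cannot be bound.
tunnel_open() {
    ssh -M -S "$CTL_SOCKET" -f -N \
        -o ExitOnForwardFailure=yes \
        -L "${DB_LOCAL_PORT}:localhost:${DB_LOCAL_PORT}" "$SSH_HOST"
}

# Ask the master process to exit through its control socket,
# so neither a PID file nor pkill is needed.
tunnel_close() {
    ssh -S "$CTL_SOCKET" -O exit "$SSH_HOST" 2>/dev/null || true
}

# Poll host:port until a TCP connect succeeds or the timeout (seconds) expires.
wait_for_port() {
    local host="$1" port="$2" timeout="${3:-10}"
    local deadline=$(( SECONDS + timeout ))
    while (( SECONDS < deadline )); do
        # Bash's /dev/tcp pseudo-device attempts a TCP connect; the subshell
        # closes the descriptor again as soon as it exits.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        sleep 1
    done
    return 1
}
```

A caller might run `tunnel_open && wait_for_port 127.0.0.1 "$DB_LOCAL_PORT" 10` and register `trap tunnel_close EXIT`. A successful connect only shows that the local listener is up; it does not prove that PostgreSQL on the remote side is answering.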