Compare commits

16 Commits

Author SHA1 Message Date
Rene Fichtmueller
92d373d40e data: direct image injection for Nokia, F5, Delta Networks, Siemens, TP-Link
Migration 051: TP-Link TL-SG3452XP + TL-SX3016F via static.tp-link.com CDN
Migration 052: Nokia 6/6 — 7220 IXR-D3L/H4 (docs.nokia.com graphics),
  7250 IXR-10 + 7750 SR-1 (tempestns.com), SR-14s (telecomcauliffe.com),
  SR-1e (docs hardwareBanner — no standalone public image available)
Migration 053: F5 BIG-IP i5800/i10800 (wtit.com), i15800 (blueally CDN)
Migration 054: Delta Networks 4/4 (hardwarenation.com + manualslib),
  Siemens SCALANCE 4/4 — X-200/X-300/X-500 via images.sw.cdn.siemens.com

All 14 URLs verified HTTP 200 with correct image content-type (2026-04-21).
CHANGELOG_PENDING.md updated for all 4 migrations.
2026-04-21 08:42:14 +02:00
Rene Fichtmueller
f275e94a6f data: image coverage improvements — QCT, Allied Telesis + CHANGELOG update
- sql/046: QCT QuantaMesh T3048-LY8 direct image injection
- sql/050: Allied Telesis 3/3 (AT-x530-28GSX, AT-x530L-52GPX, AT-x950-28XSQ)
  via alliedtelesis.com Drupal static files CDN (og:image, all 200 PNG)
- CHANGELOG: document filter pattern fixes + all 6 new vendor migrations
  (Moxa/UfiSpace/Brocade/NVIDIA/Allied Telesis — 19 new images total)
2026-04-21 08:11:57 +02:00
Rene Fichtmueller
51c18212b8 fix: add image filter patterns and direct URL migrations for 6 vendors
- switch-image-playwright.ts + switch-image-fetcher.ts: add filter patterns
  for /webimage-404/ (Netgear 404 hero), /Brand/ + /cybersecurity.png/
  (Moxa brand marketing images not product photos)
- sql/047: Moxa 4/4 models — CDN getattachment paths (hotlink-protected,
  Referer: moxa.com required; R2 proxy needed for production display)
- sql/048: UfiSpace 6/6 models — ufispace.com/image/<hash>/ direct PNGs;
  Brocade G720+G730 — broadcom.com og:image; ICX 7850-48FS — CommScope/Ruckus
  vistancenetworks.com ImageServer (rand param is cache-bust only, not auth)
- sql/049: NVIDIA SN-series 6/6 — docscontent.nvidia.com (SN2201/3700/4700)
  and S3 direct (SN5400/5600); SN3750-SX via uvation reseller CDN
2026-04-21 07:57:55 +02:00
Rene Fichtmueller
403a718119 fix: add error-graphic, icon-library, illustration filters to GENERIC_IMAGE_PATTERNS
- /404[-_]error/i, /error[-_]graphic/i — Broadcom 404-ERROR-GRAPHIC.png
- /\/icon[-_]library\//i — D-Link navigation/icon-library path images
- /[-_]illustration[._]/i — Arista Cloud-Legacy_Illustration and similar diagrams
- Nokia banner, Huawei marketing, banners/ path patterns (Playwright scraper)
- Cookie consent patterns synced to switch-image-fetcher.ts (was only in Playwright)
2026-04-21 07:38:01 +02:00
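The path patterns named in this commit can be exercised in isolation. This is a minimal sketch that assumes the regexes exactly as quoted in the message; the sample URLs and the helper name `matchesNewGenericPattern` are hypothetical and not taken from the repository.

```typescript
// Patterns copied from the commit message; everything else here is illustrative.
const NEW_GENERIC_PATTERNS: RegExp[] = [
  /404[-_]error/i,
  /error[-_]graphic/i,
  /\/icon[-_]library\//i,
  /[-_]illustration[._]/i,
];

// Hypothetical helper: true when a URL matches any of the new filters.
function matchesNewGenericPattern(url: string): boolean {
  return NEW_GENERIC_PATTERNS.some((re) => re.test(url));
}

// Made-up URLs shaped like the cases the message describes.
const sampleUrls: string[] = [
  "https://example.com/img/404-ERROR-GRAPHIC.png",
  "https://example.com/navigation/icon-library/switch.png",
  "https://example.com/images/Cloud-Legacy_Illustration.png",
  "https://example.com/products/7060x5/front.png",
];

for (const url of sampleUrls) {
  console.log(matchesNewGenericPattern(url) ? "filtered" : "kept", url);
}
```

Only the last URL is kept; the first three are rejected before they can be stored as product photos.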
Rene Fichtmueller
88403eb7eb feat: add 12 new vendor URL builders to Playwright image scraper
- Nokia, Huawei, Ciena, Moxa, D-Link, Alcatel-Lucent Enterprise,
  Asterfusion, Brocade: passthrough builders (use stored product_page_url)
- NVIDIA Networking: SN-series URL builder (sn5600 → /ethernet-switching/sn5600/)
- Netgear: lowercase model slug builder for /business/wired/switches/fully-managed/
- UfiSpace: hardcoded sitemap-verified URL map (all 6 S9xxx models)
- QCT: hardcoded URL map for T3048-LY8 and T7032-IX1
- Add Nokia banner / Huawei marketing image patterns to GENERIC_IMAGE_PATTERNS
2026-04-21 07:24:11 +02:00
Rene Fichtmueller
07e1fc9178 data: inject Edgecore product images directly (Playwright blocked by 403)
Edgecore blocks headless browsers (Playwright 403) but serves og:image via
plain HTTP. 5 models resolved via direct curl extraction:
- DCS204, DCS510, DCS810, EPS203 → edge-core.com/product/<slug>/
- Minipack2 → minipack-as8000-open-modular-platform product page
AS7535/AS7726/AS7946/AS9516 not on Edgecore's public WooCommerce site.
2026-04-21 07:07:24 +02:00
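The message does not show the extraction command, so the following is only a sketch of how an og:image URL could be pulled out of HTML fetched over plain HTTP. The function name `extractOgImage` and the sample markup are hypothetical, and a regex scan like this is a simplification; a real page may need a proper HTML parser.

```typescript
// Hypothetical helper: returns the content of the first og:image meta tag, or null.
function extractOgImage(html: string): string | null {
  // Collect every <meta ...> tag, then inspect attributes in any order.
  const metaTags = html.match(/<meta\b[^>]*>/gi) ?? [];
  for (const tag of metaTags) {
    if (!/\bproperty\s*=\s*["']og:image["']/i.test(tag)) continue;
    const content = /\bcontent\s*=\s*["']([^"']+)["']/i.exec(tag);
    if (content && content[1]) return content[1];
  }
  return null;
}

// Made-up markup; the attribute order is deliberately content-first.
const sampleHtml =
  '<head><meta property="og:title" content="DCS204">' +
  '<meta content="https://example.com/img/dcs204.png" property="og:image"></head>';

console.log(extractOgImage(sampleHtml));
console.log(extractOgImage("<head><title>No meta tags</title></head>"));
```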
Rene Fichtmueller
53cfebb6f4 fix(scraper): filter OneTrust/cookie-consent images + skip in img fallback
cdn.cookielaw.org logos appear as the largest DOM image on Dell/Extreme
product pages when the cookie consent overlay is present. Added to both
GENERIC_IMAGE_PATTERNS (isGenericImage filter) and img fallback skipPattern
so the next-largest actual product image can be found.
2026-04-21 06:45:41 +02:00
Rene Fichtmueller
fcb8fb8c90 fix(scraper): fall through to img fallback when og:image is generic/logo
Previously: if og:image existed (even as a Dell logo URL), page.evaluate() returned
early and the img fallback was never tried. Now: meta tags are extracted first, then
isGenericImage() is checked in Node.js, and the img fallback runs if meta image is null
or generic. This allows vendors like Dell (og:image = logo) to still get product images
via the DOM fallback.
2026-04-21 06:36:12 +02:00
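The fall-through described in this commit can be sketched as a pure function: collect the candidates first, then apply the generic-image check to each in order. The `PageImages` shape, `pickProductImage`, and the stand-in `looksLikeLogo` check are assumptions made for illustration; the real scraper uses its own isGenericImage() and page.evaluate() result.

```typescript
// Hypothetical shape of what the browser-side extraction hands back to Node.js.
interface PageImages {
  ogImage: string | null;
  twitterImage: string | null;
  largestImg: string | null; // result of the DOM <img> fallback
}

// Try og:image, then twitter:image, then the DOM fallback; skip generic ones.
function pickProductImage(
  images: PageImages,
  isGeneric: (url: string) => boolean,
): string | null {
  const candidates = [images.ogImage, images.twitterImage, images.largestImg];
  for (const candidate of candidates) {
    if (candidate && !isGeneric(candidate)) return candidate;
  }
  return null;
}

// Stand-in for isGenericImage(); the real pattern list is much longer.
const looksLikeLogo = (url: string): boolean => /logo/i.test(url);

// A Dell-like case: og:image is a logo, so the DOM fallback wins.
const dellLikePage: PageImages = {
  ogImage: "https://example.com/assets/brand-logo.png",
  twitterImage: null,
  largestImg: "https://example.com/products/switch-front.jpg",
};

console.log(pickProductImage(dellLikePage, looksLikeLogo));
```

Before the fix, the equivalent of this function would have returned at the first non-null candidate, which is why a logo could block the fallback.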
Rene Fichtmueller
55d4d6a8f8 fix(scraper): fix Edgecore/Extreme URL builders, broaden img fallback, fix ENOENT
- buildEdgecoreUrl: /product/<slug>/ (WooCommerce, no .html) with EDGECORE_SLUG_MAP
  for AS7712-32X→as7712-32x-ec, Minipack2→minipack-as8000-open-modular-platform
- buildFortinetUrl: returns null (all pages redirect to generic, no usable og:image)
- buildExtremeUrl: direct product URL (extremenetworks.com/product/<slug>)
- img fallback: remove strict 'product/switch/router/hardware' path requirement;
  now takes largest image >=200x150px excluding flags/icons/spinners — isGenericImage()
  filters hero/banner/logo afterward
- ENOENT fix: unique per-run Crawlee storage dir (timestamp suffix) prevents
  stale request-queue file contamination between back-to-back vendor runs
2026-04-21 06:33:32 +02:00
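The broadened img fallback can be sketched as a selection over image candidates. The 200x150 minimum comes from the message above; ranking by pixel area, the `ImgCandidate` shape, and the exact skip regex are assumptions, since the message only names flags, icons, and spinners as exclusions.

```typescript
// Hypothetical candidate record, e.g. collected from <img> elements in the page.
interface ImgCandidate {
  src: string;
  width: number;
  height: number;
}

// Assumed skip pattern covering the exclusions named in the commit message.
const SKIP_PATTERN = /flag|icon|spinner/i;

// Pick the largest image (by area) that is at least 200x150 and not skipped.
function pickLargestImage(imgs: ImgCandidate[]): string | null {
  let best: ImgCandidate | null = null;
  for (const img of imgs) {
    if (img.width < 200 || img.height < 150) continue;
    if (SKIP_PATTERN.test(img.src)) continue;
    if (best === null || img.width * img.height > best.width * best.height) {
      best = img;
    }
  }
  return best === null ? null : best.src;
}

const domImages: ImgCandidate[] = [
  { src: "https://example.com/icons/flag-us.png", width: 400, height: 300 },
  { src: "https://example.com/p/thumb.jpg", width: 120, height: 90 },
  { src: "https://example.com/p/switch-front.jpg", width: 800, height: 600 },
  { src: "https://example.com/p/switch-rear.jpg", width: 640, height: 480 },
];

console.log(pickLargestImage(domImages));
```

The result would still be passed through isGenericImage() afterward, as the message notes, so hero and banner images are rejected in a second step.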
Rene Fichtmueller
8e30b49410 fix(scraper): fix Arista series-level URL builder + bypass Crawlee URL deduplication
- buildAristaUrl() now extracts series prefix (7060X5-32QS → 7060x5-series)
  instead of individual model URLs that lack og:image
- Strip trailing sub-variant 'A' so R3A → R3 series page
- Add uniqueKey: row.id to each request — prevents Crawlee from deduplicating
  models that share the same series URL (e.g. 7060x5-series)
- For Arista: always prefer fresh builder URL over stored product_page_url
  so stale individual-model URLs don't override correct series pages
2026-04-21 06:22:41 +02:00
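The interaction between series-level URLs and request deduplication can be sketched without running a crawler. The slug logic mirrors buildAristaUrl() as it appears later in this diff; the `SwitchRow` and `CrawlRequest` shapes and the sample rows are hypothetical, and converting the row id to a string for `uniqueKey` is an assumption of this sketch.

```typescript
// Hypothetical row and request shapes for illustration only.
interface SwitchRow {
  id: number;
  model: string;
}
interface CrawlRequest {
  url: string;
  uniqueKey: string;
  userData: { model: string };
}

// Same series extraction as buildAristaUrl() in the diff.
function aristaSeriesUrl(model: string): string | null {
  const lead = /^(\d{3,4}[A-Z0-9]*?)(-\d|$)/i.exec(model);
  if (!lead || !lead[1]) return null;
  // Strip a trailing sub-variant 'a' (r3a becomes r3).
  const series = lead[1].toLowerCase().replace(/([a-z]\d+)a$/, "$1");
  return `https://www.arista.com/en/products/${series}-series`;
}

const switchRows: SwitchRow[] = [
  { id: 1, model: "7060X5-32QS" },
  { id: 2, model: "7060X5-64S" },
  { id: 3, model: "7280R3A-48D5" },
];

// One request per row: identical URLs are kept apart by a per-row uniqueKey.
const requests: CrawlRequest[] = [];
for (const row of switchRows) {
  const url = aristaSeriesUrl(row.model);
  if (url === null) continue;
  requests.push({ url, uniqueKey: String(row.id), userData: { model: row.model } });
}

for (const req of requests) console.log(req.uniqueKey, req.url);
```

Rows 1 and 2 resolve to the same 7060x5-series page; without distinct keys, a queue that deduplicates by URL would process only one of them.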
Rene Fichtmueller
2742141c8b feat: Playwright image scraper for bot-blocked vendors (Arista/Dell/Edgecore/Fortinet/Extreme)
2026-04-21 06:16:05 +02:00
Rene Fichtmueller
892da2bcf5 fix: Cisco line card URL mapping (8800/84/86 → 8000 family page, skip ASR9K logo-only)
2026-04-21 00:49:32 +02:00
Rene Fichtmueller
e20bb6cb45 fix: MikroTik hardcoded slug map for + models (crs305/312/317/326)
2026-04-21 00:45:41 +02:00
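A worked example of the MikroTik slug rules may help when reading the builder further down in this diff. The map entries and the lowercase/underscore rule are taken from that diff; the function name `mikrotikUrl` and the use of hasOwnProperty instead of the `in` operator are choices made for this sketch.

```typescript
// Entries copied from MIKROTIK_SLUG_MAP in the diff.
const SLUG_MAP: Record<string, string> = {
  "CRS305-1G-4S+": "crs305_1g_4s_in",
  "CRS312-4C+8XG": "crs312_4c_8xg_rm",
  "CRS317-1G-16S+": "crs317_1g_16s_rm",
  "CRS326-24G-2S+": "crs326_24g_2s_in",
};

// Sketch of the builder: map lookup first, then the derivable pattern.
function mikrotikUrl(model: string): string | null {
  if (Object.prototype.hasOwnProperty.call(SLUG_MAP, model)) {
    return `https://mikrotik.com/product/${SLUG_MAP[model]}`;
  }
  // Unmapped '+' models have no derivable slug.
  if (model.includes("+")) return null;
  const slug = model
    .toLowerCase()
    .replace(/[-\s]+/g, "_")
    .replace(/[^a-z0-9_]/g, "");
  return slug ? `https://mikrotik.com/product/${slug}` : null;
}

console.log(mikrotikUrl("CRS305-1G-4S+"));     // https://mikrotik.com/product/crs305_1g_4s_in
console.log(mikrotikUrl("CRS504-4XQ-IN"));     // https://mikrotik.com/product/crs504_4xq_in
console.log(mikrotikUrl("CRS354-48G-4S+2Q+")); // null
```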
Rene Fichtmueller
c4585caada fix: Cisco 8000 builder URL + MikroTik lowercase + new vendor builders
URL builder fixes:
- Cisco 8000: update to new /site/us/en/ URL scheme (family page, not per-model)
- MikroTik: fix to lowercase+underscore format (was uppercase, caused 404)
- Fortinet: set to null — JS-rendered pages, all redirect to generic page
- Alcatel-Lucent Enterprise slug added to dispatcher (was missing, caused 0 hits)
- Add Quanta, Allied Telesis, Ufispace, Netgear URL builders
- NVIDIA: skip ConnectX/BlueField non-switch models

Migration 044:
- Clear 35 wrong NCS-5500 URLs from Cisco 8000-series models
- Pre-set correct 8000-series family URL for 21 models without images
2026-04-21 00:41:31 +02:00
Rene Fichtmueller
2b72e1089f fix: monitor-erik.sh — correct Erik SSH target + fix awk header skip
2026-04-21 00:34:28 +02:00
Rene Fichtmueller
ea6ef606d3 feat: more switch image coverage + system health metrics + Erik monitor
switch-image-fetcher:
- Add Fortinet URL builder (11 FortiSwitch models)
- Add Quanta Cloud Technology, Allied Telesis, Ufispace, Netgear URL builders
- Fix alcatel-lucent-enterprise slug missing from URL_BUILDERS dispatcher
- Fix NVIDIA builder to skip ConnectX/BlueField adapters (not switches)
- Add aruba slug alias for hpe-aruba

health endpoint:
- Add system metrics: CPU load (1/5/15m), memory usage, disk usage
- Add load_status indicator (ok/busy/overloaded)
- Expose process RSS memory
- Used by external monitors

scripts/monitor-erik.sh:
- Cron-ready health check script for Claudi (.82) and Raspberry Pis
- Checks TIP API health endpoint (load, memory, disk, DB latency)
- Checks PM2 process state via SSH (errored/stopped detection)
- ntfy.sh push notifications (set NTFY_TOPIC env var)
- Includes systemd service + timer unit comments for auto-install
2026-04-21 00:31:43 +02:00
17 changed files with 1230 additions and 29 deletions

@@ -189,3 +189,18 @@ Types: FEAT · FIX · UI · DATA · AI · INFRA
{"d":"2026-04-20","t":"FEAT","m":"community-issues.ts enhanced: added Cisco Field Notices, Juniper KB, SONiC GitHub Issues sources + new scrapeTransceiverCompatIssues() for switch+transceiver combo issues."}
{"d":"2026-04-20","t":"UI","m":"Dashboard switch table: thumbnail column (48px lazy-load image with gear-icon fallback). Switch detail: compatibility panel shows verification_method badge, vendor-tested vs form-factor split, competitor pricing in detail rows."}
{"d":"2026-04-20","t":"FIX","m":"Scrapers: ATGBics new Shopify theme (card__info), NADDOD corrected shop URL, VCELink disabled (site pivoted to audio/video April 2026). Scheduler: 59 schedules, 78 workers."}
{"d":"2026-04-21","t":"FEAT","m":"switch-image-playwright.ts: Playwright image scraper for bot-blocked switch vendors (Arista, Dell, Edgecore, Fortinet, HPE-Aruba, Extreme) — stealth headless Chromium, per-vendor URL builders (series-level for Arista, WooCommerce for Edgecore, direct-product for Extreme), og:image→twitter:image→img fallback chain, uniqueKey=row.id to bypass Crawlee URL deduplication for shared series pages, makeCrawleeConfig(Date.now() suffix) per-run to avoid ENOENT from stale request-queue files."}
{"d":"2026-04-21","t":"FIX","m":"Arista image coverage 33%→71%: buildAristaUrl() extracts series slug from model (7060X5-32QS→7060x5-series, 7280R3A→7280r3-series stripping trailing sub-variant 'a'). uniqueKey=row.id forces Crawlee to process all models even when multiple share the same series-level page. 15/21 Arista models now have images; 6 remaining series pages lack og:image in CMS (older models: 7050cx3, 7060dx5, 7060px4, 7060x4, 7170, 7260cx3)."}
{"d":"2026-04-21","t":"FIX","m":"og:image generic-logo fallback: meta image extraction decoupled from img fallback — og:image checked against isGenericImage() in Node.js; if it matches (logo/brand), falls through to img fallback instead of returning early. Fixes Dell (og:image=logo) and Extreme (og:image=logo) pipelines running img fallback as intended."}
{"d":"2026-04-21","t":"FIX","m":"OneTrust/cookie consent image filter: cdn.cookielaw.org, cookiebot.com, trustarc.com, consent-manager added to GENERIC_IMAGE_PATTERNS; cookielaw|cookiebot|trustarc added to img fallback skipPattern — prevents OneTrust company logo (largest DOM image on Extreme product pages) from being selected as product photo."}
{"d":"2026-04-21","t":"DATA","m":"Cisco 8000-series images 0%→100%: migration 044 cleared 35 stale NCS-5500 product_page_urls incorrectly assigned to 8000-series models, then set correct cisco.com/site/us/en/ URLs. switch-image-fetcher.ts plain HTTP run: 32/32 Cisco 8000-series models now have images."}
{"d":"2026-04-21","t":"DATA","m":"Edgecore images 0%→50%: migration 045 injects 5 direct image URLs (DCS204, DCS510, DCS810, EPS203, Minipack2) via curl-extracted og:image from WooCommerce product pages — Playwright blocked by Cloudflare WAF on edge-core.com but plain curl succeeds. AS7xxx enterprise switches not listed on edge-core.com website."}
{"d":"2026-04-21","t":"FIX","m":"Image filter patterns: /webimage-404/ (Netgear 404 hero), /\\/Brand\\// + /cybersecurity\\.png/ (Moxa brand images) added to GENERIC_IMAGE_PATTERNS in both switch-image-playwright.ts and switch-image-fetcher.ts. Cleared 5 bad DB rows (Moxa Brand/cybersecurity.png x4, Netgear webimage-404 x1)."}
{"d":"2026-04-21","t":"DATA","m":"Moxa images 0%→100% (4/4): direct CDN injection via migration 047 — Moxa Azure CDN getattachment paths. Hotlink-protected (Referer: moxa.com required); R2 proxy needed for production display."}
{"d":"2026-04-21","t":"DATA","m":"UfiSpace images 0%→100% (6/6) + Brocade 0%→100% (3/3): migration 048 — UfiSpace ufispace.com/image/<hash>/ PNGs (publicly accessible); Brocade G720/G730 via broadcom.com og:image, ICX 7850-48FS via CommScope/Ruckus vistancenetworks.com ImageServer (rand param cache-bust only, ID hash stable)."}
{"d":"2026-04-21","t":"DATA","m":"NVIDIA Networking images 0%→100% (6/6): migration 049 — SN2201/SN3700/SN4700 via docscontent.nvidia.com official docs CDN, SN5400/SN5600 via k3-prod-nvidia-docs.s3 direct, SN3750-SX via uvation reseller CDN."}
{"d":"2026-04-21","t":"DATA","m":"Allied Telesis images 0%→100% (3/3): migration 050 — x530/x530L/x950 series og:image from alliedtelesis.com Drupal CMS static files. QCT T3048-LY8 image via migration 046. Overall coverage: 33.4%→36.2%+ across 671 switches."}
{"d":"2026-04-21","t":"DATA","m":"TP-Link images 0%→100% (2/2): migration 051 — TL-SG3452XP + TL-SX3016F via static.tp-link.com upload/image-line CDN (og:image pattern with model/region/HW/timestamp)."}
{"d":"2026-04-21","t":"DATA","m":"Nokia images 0%→100% (6/6): migration 052 — 7220 IXR-D3L/H4 via documentation.nokia.com SR Linux docs graphics; 7250 IXR-10 + 7750 SR-1 via tempestns.com model-specific reseller CDN; 7750 SR-14s via telecomcauliffe.com; 7750 SR-1e via docs hardwareBanner (no standalone public image available)."}
{"d":"2026-04-21","t":"DATA","m":"F5 Networks images 0%→100% (3/3): migration 053 — BIG-IP i5800/i10800 via wtit.com reseller CDN (model-specific PNGs), i15800 via cdn.blueally.com bigip-i15000-series composite."}
{"d":"2026-04-21","t":"DATA","m":"Delta Networks images 0%→100% (4/4) + Siemens SCALANCE images 0%→100% (4/4): migration 054 — Delta AG5648/AG9032v2A/AGC7648A via hardwarenation.com, AG9064v2 via manualslib CDN; Siemens XC216-4C (X-200 og:image), XR324-12M (X-300), XM416-4C+XR528-6M (X-500) via images.sw.cdn.siemens.com official DISW CDN."}

@@ -1,6 +1,8 @@
import { Router, Request, Response } from "express";
import { getDbStats } from "../db/queries";
import { pool } from "../db/client";
import { loadavg, totalmem, freemem, cpus } from "os";
import { execSync } from "child_process";
export const healthRouter = Router();
@@ -36,11 +38,45 @@ healthRouter.get("/", async (_req: Request, res: Response) => {
`).catch(() => ({ rows: [{}] }));
const s = stockStats.rows[0] || {};
// System metrics
const [load1, load5, load15] = loadavg();
const totalMem = totalmem();
const freeMem = freemem();
const usedMem = totalMem - freeMem;
const coreCount = cpus().length;
let diskUsedPct: number | null = null;
let diskFreeGb: number | null = null;
try {
const df = execSync("df -h / 2>/dev/null | tail -1", { timeout: 2000 }).toString().trim();
const parts = df.split(/\s+/);
diskUsedPct = parseInt(parts[4] ?? "0", 10) || null;
diskFreeGb = parseFloat(parts[3] ?? "0") || null;
} catch { /* skip on systems without df */ }
const loadStatus = load1 > coreCount * 0.9 ? "overloaded" : load1 > coreCount * 0.6 ? "busy" : "ok";
res.json({
success: true,
status: "healthy",
version: "0.3.0",
uptime: process.uptime(),
system: {
load: { "1m": +load1.toFixed(2), "5m": +load5.toFixed(2), "15m": +load15.toFixed(2) },
load_status: loadStatus,
cpu_cores: coreCount,
memory: {
total_mb: Math.round(totalMem / 1024 / 1024),
used_mb: Math.round(usedMem / 1024 / 1024),
free_mb: Math.round(freeMem / 1024 / 1024),
used_pct: Math.round(usedMem / totalMem * 100),
},
disk: {
used_pct: diskUsedPct,
free_gb: diskFreeGb,
},
process_rss_mb: Math.round(process.memoryUsage().rss / 1024 / 1024),
},
database: {
connected: true,
latency_ms: latencyMs,

@@ -101,6 +101,8 @@ export async function registerSchedules(boss: PgBoss): Promise<void> {
"scrape:assets:switches",
// ── Switch og:image fetcher (daily, after switch-assets) ──────────
"scrape:images:switches",
// ── Playwright image fetcher for bot-blocked vendors (every 3d) ───
"scrape:images:switches:playwright",
// ── eBay enrichment (every 6h) ────────────────────────────────────
"enrich:ebay-transceivers",
"enrich:ebay-switches",
@@ -241,6 +243,9 @@ export async function registerSchedules(boss: PgBoss): Promise<void> {
await boss.schedule("scrape:assets:switches", "30 7,19 * * *", {}, { retryLimit: 1, expireInSeconds: 3600 });
// og:image fetcher: daily at 08:30, after switch-assets completes at 07:30
await boss.schedule("scrape:images:switches", "30 8 * * *", {}, { retryLimit: 1, expireInSeconds: 7200 });
// Playwright image scraper for bot-blocked vendors (Arista/Dell/Edgecore/Fortinet/Extreme)
// Every 3 days at 09:00 — Playwright is slower and heavier than plain HTTP
await boss.schedule("scrape:images:switches:playwright", "0 9 */3 * *", {}, { retryLimit: 1, expireInSeconds: 10800 });
// ══════════════════════════════════════════════════════════════════════
// EBAY ENRICHMENT — every 6h
@@ -336,7 +341,8 @@ export async function registerWorkers(boss: PgBoss): Promise<void> {
const { scrapeUfiSpace } = await import("./scrapers/ufispace");
const { scrapeEdgecore } = await import("./scrapers/edgecore");
const { scrapeSwitchAssets } = await import("./scrapers/switch-assets");
const { fetchSwitchImages } = await import("./scrapers/switch-image-fetcher");
const { fetchSwitchImagesPlaywright } = await import("./scrapers/switch-image-playwright");
const { scrapeFlexoptixCompatibility } = await import("./scrapers/flexoptix-compat");
// ── Prediction signal scrapers ────────────────────────────────────────
const { scrapeSecEdgar } = await import("./scrapers/sec-edgar");
@@ -537,6 +543,15 @@ export async function registerWorkers(boss: PgBoss): Promise<void> {
await fetchSwitchImages();
});
await boss.work("scrape:images:switches:playwright", async () => {
console.log(`[${new Date().toISOString()}] Running: Switch image fetcher (Playwright — bot-blocked vendors)`);
if (!isLoadAcceptable(2.0)) {
console.warn(`[${new Date().toISOString()}] ⚠ Load too high — skipping Playwright image fetch`);
return;
}
await fetchSwitchImagesPlaywright();
});
// ── eBay enrichment ───────────────────────────────────────────────────
await boss.work("enrich:ebay-transceivers", async () => {

@@ -8,12 +8,14 @@
* 4. Write image_url + product_page_url to switches table
*
* Vendors covered:
- * Cisco (Nexus 9000/9300, NCS 5500/5700, Catalyst 9300/9500)
* Cisco (Nexus 9000/9300, NCS 5500/5700, Catalyst 9300/9500, 8000 SP)
* Arista (7000 series)
* Juniper (QFX, EX series)
- * NVIDIA Networking (Spectrum SN series)
* NVIDIA Networking (Spectrum SN series; ConnectX skipped)
* Edgecore, Celestica, Asterfusion (whitebox)
* Fortinet (FortiSwitch series)
* Dell, HPE/Aruba, Huawei, Nokia, Extreme, MikroTik, Ubiquiti, FS.COM, Supermicro
* Alcatel-Lucent Enterprise, Allied Telesis, Netgear, Quanta Cloud Technology, Ufispace
*
* Rate limit: 1 req/2sec per domain, max 3 concurrent domains.
* Respects robots.txt: User-Agent identifies as research bot.
@@ -50,11 +52,15 @@ function buildCiscoUrl(model: string): string | null {
const slug = m.toLowerCase().replace(/[^a-z0-9]/g, "-");
return `https://www.cisco.com/c/en/us/products/switches/catalyst-${slug}/index.html`;
}
- // Cisco 8000 SP series: 8101-32FH, 8202-32FH, 8608
// Cisco 8000 SP series chassis: 8101-32FH, 8202-32FH, 8608
if (/^8[0-9]{3}/.test(m)) {
- const slug = m.toLowerCase().replace(/[^a-z0-9]/g, "-");
- return `https://www.cisco.com/c/en/us/products/routers/8000-series-routers/${slug}/index.html`;
return `https://www.cisco.com/site/us/en/products/networking/sdwan-routers/8000-series/index.html`;
}
// Cisco 8800 line cards (88-LC0-*, 84-MPA-*, 86-MPA-*) → same 8000 family page
if (/^(88|84|86)-/.test(m)) {
return `https://www.cisco.com/site/us/en/products/networking/sdwan-routers/8000-series/index.html`;
}
// ASR 9000 / A900 line cards only return the Cisco logo as og:image — skip
return null;
}
@@ -89,7 +95,11 @@ function buildJuniperUrl(model: string): string | null {
function buildNvidiaUrl(model: string): string | null {
// SN5600 → https://www.nvidia.com/en-us/networking/ethernet-switching/sn5600/
// SN4700 → https://www.nvidia.com/en-us/networking/ethernet-switching/sn4700/
- const slug = model.toUpperCase().replace(/[^A-Z0-9]/g, "");
// ConnectX-7 / BlueField are adapters, not switches — skip
const m = model.toUpperCase();
if (m.includes("CONNECTX") || m.includes("BLUEFIELD")) return null;
const slug = m.replace(/[^A-Z0-9]/g, "");
if (!slug.startsWith("SN")) return null; // only Spectrum switch series
return `https://www.nvidia.com/en-us/networking/ethernet-switching/${slug.toLowerCase()}/`;
}
@@ -118,10 +128,25 @@ function buildExtremeUrl(model: string): string | null {
return `https://www.extremenetworks.com/product/${slug}/`;
}
// MikroTik product URL slugs for models containing '+' are not derivable from
// the model name — their website uses opaque suffixes (_in, _rm, …).
// The models without '+' follow a simple pattern (lowercase, dashes→underscore).
const MIKROTIK_SLUG_MAP: Record<string, string> = {
"CRS305-1G-4S+": "crs305_1g_4s_in",
"CRS312-4C+8XG": "crs312_4c_8xg_rm",
"CRS317-1G-16S+": "crs317_1g_16s_rm",
"CRS326-24G-2S+": "crs326_24g_2s_in",
// CRS354-48G-4S+2Q+: URL not discoverable — MikroTik's product listing is JS-rendered
};
function buildMikroTikUrl(model: string): string | null {
- // CRS504-4XQ-IN → https://mikrotik.com/product/CRS504_4XQ_IN
- const slug = model.replace(/[-\s]+/g, "_");
- return `https://mikrotik.com/product/${slug}`;
if (model in MIKROTIK_SLUG_MAP) {
return `https://mikrotik.com/product/${MIKROTIK_SLUG_MAP[model]}`;
}
if (model.includes("+")) return null; // other + models — URL unknown
// Simple lowercase + dashes→underscores for models without '+'
const slug = model.toLowerCase().replace(/[-\s]+/g, "_").replace(/[^a-z0-9_]/g, "");
return slug ? `https://mikrotik.com/product/${slug}` : null;
}
function buildUbiquitiUrl(model: string): string | null {
@@ -154,28 +179,64 @@ function buildAsterfusionUrl(model: string): string | null {
return `https://www.asterfusion.com/products/${slug}/`;
}
function buildFortinetUrl(_model: string): string | null {
// Fortinet product pages are JS-rendered — og:image only returns the brand icon.
// All /products/fortiswitch/<model> URLs redirect to the generic /ethernet-switches page.
// Image scraping is not possible via plain HTTP for this vendor.
return null;
}
function buildQuantaUrl(model: string): string | null {
// QuantaMesh T3048-LY8, T7032-IX1 etc.
const slug = model.toLowerCase().replace(/\s+/g, "-").replace(/[^a-z0-9-]/g, "");
return `https://www.qct.io/product/index/Infrastructure-Product/Networking/Switch/${slug}`;
}
function buildAlliedTelesisUrl(model: string): string | null {
// AT-x530-28GSX → https://www.alliedtelesis.com/us/en/products/at-x530-28gsx
const slug = model.toLowerCase().replace(/\s+/g, "-").replace(/[^a-z0-9-]/g, "");
return `https://www.alliedtelesis.com/us/en/products/${slug}`;
}
function buildUfispaceUrl(model: string): string | null {
const slug = model.toLowerCase().replace(/\s+/g, "-").replace(/[^a-z0-9-]/g, "");
return `https://www.ufispace.com/products/${slug}`;
}
function buildNetgearUrl(model: string): string | null {
const slug = model.toLowerCase().replace(/\s+/g, "-").replace(/[^a-z0-9-]/g, "");
return `https://www.netgear.com/business/products/switches/${slug}`;
}
// ── URL dispatcher by vendor slug ───────────────────────────────────────────
const URL_BUILDERS: Record<string, (m: string) => string | null> = {
cisco: buildCiscoUrl,
arista: buildAristaUrl,
juniper: buildJuniperUrl,
"nvidia-networking": buildNvidiaUrl,
edgecore: buildEdgecoreUrl,
celestica: buildCelesticaUrl,
asterfusion: buildAsterfusionUrl,
fortinet: buildFortinetUrl,
dell: buildDellUrl,
"hpe-aruba": buildHpeArubaUrl,
huawei: buildHuaweiUrl,
nokia: buildNobelUrl,
extreme: buildExtremeUrl,
mikrotik: buildMikroTikUrl,
ubiquiti: buildUbiquitiUrl,
"fs-com": buildFsComUrl,
supermicro: buildSupermicroUrl,
"alcatel-lucent": buildAlcatelLucentUrl,
"alcatel-lucent-enterprise": buildAlcatelLucentUrl, // fix: DB uses this slug
- "ale": buildAlcatelLucentUrl,
ale: buildAlcatelLucentUrl,
"quanta-cloud-technology": buildQuantaUrl,
"allied-telesis": buildAlliedTelesisUrl,
ufispace: buildUfispaceUrl,
netgear: buildNetgearUrl,
wistron: (_m) => null, // no public product pages
aruba: buildHpeArubaUrl, // alias
};
// ── Generic marketing image detector ────────────────────────────────────────
@@ -219,6 +280,23 @@ const GENERIC_IMAGE_PATTERNS: RegExp[] = [
// ── Generic about/press/brand pages ──────────────────────────────────────
/\/press[-_]kit/i,
/\/media[-_]kit/i,
// ── Vendor error / 404 graphics ──────────────────────────────────────────
/404[-_]error/i,
/error[-_]graphic/i,
// ── Navigation icon libraries ────────────────────────────────────────────
/\/icon[-_]library\//i,
// ── Diagrams and illustrations ───────────────────────────────────────────
/[-_]illustration[._]/i,
// ── Vendor 404 hero images ───────────────────────────────────────────────
/webimage-404/i,
// ── Moxa brand/marketing images (not product photos) ────────────────────
/\/Brand\//i,
/cybersecurity\.png/i,
// ── Cookie consent / GDPR overlay images ────────────────────────────────
/cdn\.cookielaw\.org/i,
/cookiebot\.com/i,
/trustarc\.com/i,
/consent-manager/i,
];
function isGenericImage(url: string): boolean {

@@ -0,0 +1,432 @@
/**
* Switch Image Fetcher: Playwright edition for bot-blocked vendors
*
* Vendors that reject plain HTTP bots (403/406) or require JS rendering:
* Arista (HTTP 406), Dell (HTTP 403), Edgecore (HTTP 403),
* Fortinet (JS-rendered), HPE/Aruba (HTTP 403), Extreme Networks (no static URLs),
* Nokia, Huawei, NVIDIA, Netgear, Ciena, Moxa, D-Link, Alcatel-Lucent Enterprise,
* Asterfusion, Brocade, UfiSpace, QCT
*
* Strategy:
* 1. Query switches without image_url for JS-blocked vendors
* 2. Open each product page in headless Chromium (stealth mode)
* 3. Extract og:image (or fallback: first large product <img>)
* 4. Apply same isGenericImage() filter as the plain HTTP fetcher
* 5. Write image_url + product_page_url to switches table
*
* Rate limit: maxConcurrency=1, 4s delay between requests.
* Run: npx tsx src/scrapers/switch-image-playwright.ts [--vendor=arista]
*/
import { PlaywrightCrawler } from "crawlee";
import { pool } from "../utils/db";
import { makeCrawleeConfig } from "../utils/crawlee-config";
// ── Stealth headers injected into every page ─────────────────────────────────
const STEALTH_UA = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36";
// ── Generic marketing image detector (mirrors switch-image-fetcher.ts) ────────
const GENERIC_IMAGE_PATTERNS: RegExp[] = [
/[-/_]logo[-_.]|\/logos?\//i,
/cisco[-_]?logo/i,
/juniper[-_]networks[-_]logo/i,
/arista[-_]?logo/i,
/brand[-_]?logo/i,
/company[-_]?logo/i,
/\/svg\//i,
/\.svg(\?|$)/i,
/naas-homepag/i,
/al-enterprise.*\/images\/naas/i,
/og[-_]default/i,
/default[-_](?:og|social|share|image)/i,
/site[-_](?:default|image|og)/i,
/social[-_](?:default|share)/i,
/twitter[-_]default/i,
/default[-_]thumbnail/i,
/\/homepage\//i,
/hero[-_](?:banner|bg|background|image)/i,
/banner[-_](?:bg|background)/i,
/lifestyle/i,
/stock[-_]?photo/i,
/placeholder/i,
/no[-_]?image/i,
/image[-_]?not[-_]?found/i,
/\/fallback[/-]/i,
/missing[-_]image/i,
/\/press[-_]kit/i,
/\/media[-_]kit/i,
// Vendor-specific brand icons
/open-graph\.gif/i,
/social[-_]icon/i,
/favicon/i,
/og[-_]image[-_][0-9]+x[0-9]+\./i, // e.g. og-image-1200x630 → family-level generic
// Cookie consent / GDPR overlay images (OneTrust, Cookiebot, TrustArc, etc.)
/cdn\.cookielaw\.org/i,
/cookiebot\.com/i,
/trustarc\.com/i,
/consent-manager/i,
// Nokia CMS marketing banners (not product photos)
/nok\d+-nokia-com-banner/i,
// Huawei category/why-buy marketing images
/whyhuawei-/i,
/campus-switches/i,
/bg_products/i,
// Generic "banners" path segment used by CMSes
/\/banners?\//i,
// Vendor error / 404 graphics
/404[-_]error/i,
/error[-_]graphic/i,
/webimage-404/i,
// Navigation icon libraries (D-Link, other CMSes)
/\/icon[-_]library\//i,
// Diagrams and illustrations (not product photos)
/[-_]illustration[._]/i,
// Moxa brand/marketing images (not product photos)
/\/Brand\//i,
/cybersecurity\.png/i,
];
function isGenericImage(url: string): boolean {
return GENERIC_IMAGE_PATTERNS.some((re) => re.test(url));
}
// ── Product page URL builders ─────────────────────────────────────────────────
function buildAristaUrl(model: string): string | null {
// Map model to its Arista series page (og:image lives on series pages, not individual model pages).
// Pattern: extract alphanumeric prefix before the first "-<digits>" port-count suffix.
// 7060X5-32QS → 7060x5 → /en/products/7060x5-series
// 7050CX3-32S → 7050cx3 → /en/products/7050cx3-series
// 7280R3A-48D5 → 7280r3a → strip trailing sub-variant 'A' → 7280r3 → /en/products/7280r3-series
// 7020R → 7020r → /en/products/7020r-series
const leadMatch = model.match(/^(\d{3,4}[A-Z0-9]*?)(-\d|$)/i);
if (!leadMatch) return null;
let series = leadMatch[1].toLowerCase();
// Strip trailing sub-variant 'a' (R3A → R3, R2A → R2) — Arista groups these on the base series page
series = series.replace(/([a-z]\d+)a$/, "$1");
return `https://www.arista.com/en/products/${series}-series`;
}
function buildDellUrl(model: string): string | null {
// PowerSwitch Z9332F-ON → Dell networking search page, filtered by the model keyword
const cleanModel = model.replace(/^PowerSwitch\s+/i, "").trim();
if (!cleanModel) return null;
return `https://www.dell.com/en-us/shop/dell-networking-switches/sc/networking-switches?appliedRefinements=DP_SEARCH_RESULTS_KEYWORDS~${encodeURIComponent(cleanModel)}`;
}
// Edgecore uses WooCommerce with /product/<slug>/ URLs (no .html suffix).
// Some models have non-obvious slugs verified via sitemap.
const EDGECORE_SLUG_MAP: Record<string, string> = {
"AS7712-32X": "as7712-32x-ec", // -ec suffix variant in Edgecore WooCommerce
"Minipack2": "minipack-as8000-open-modular-platform", // Facebook OCP Minipack2
};
function buildEdgecoreUrl(model: string): string | null {
if (model in EDGECORE_SLUG_MAP) {
return `https://www.edge-core.com/product/${EDGECORE_SLUG_MAP[model]}/`;
}
// Standard slug: lowercase, replace non-alphanum with dash, collapse multiple dashes
const slug = model.toLowerCase()
.replace(/[^a-z0-9-]/g, "-")
.replace(/-+/g, "-")
.replace(/^-|-$/g, "");
return slug ? `https://www.edge-core.com/product/${slug}/` : null;
}
function buildFortinetUrl(_model: string): string | null {
// Fortinet product pages are fully JS-rendered and all redirect to generic /products/ethernet-switches.
// No reliable og:image can be extracted — skip entirely.
return null;
}
function buildHpeArubaUrl(model: string): string | null {
// HPE Aruba series pages are stored in product_page_url for all known models.
// Builder is a fallback for unknown models.
const slug = model.toLowerCase().replace(/[^a-z0-9-]/g, "-");
return `https://www.arubanetworks.com/products/switches/${slug}/`;
}
function buildExtremeUrl(model: string): string | null {
// Extreme direct product pages: extremenetworks.com/product/<slug>
const slug = model.toLowerCase()
.replace(/\s+/g, "-")
.replace(/[^a-z0-9-]/g, "")
.replace(/-+/g, "-");
return slug ? `https://www.extremenetworks.com/product/${slug}` : null;
}
// ── New vendors (JS-rendered; rely on stored product_page_url or built URL) ────
// Nokia, Huawei, Ciena, Moxa, D-Link, ALE, Asterfusion, Brocade:
// all models have product_page_url in DB → return null so the stored URL is used.
const buildPassthroughUrl = (_model: string): string | null => null;
function buildNvidiaUrl(model: string): string | null {
// NVIDIA Spectrum switches: SN5600, SN4700, SN3700, SN3750-SX, SN2201, etc.
// ConnectX-7 is an HCA, no relevant product page → skip.
const snMatch = model.match(/^(SN[\d]+)/i);
if (snMatch) {
return `https://www.nvidia.com/en-us/networking/ethernet-switching/${snMatch[1].toLowerCase()}/`;
}
return null;
}
function buildNetgearUrl(model: string): string | null {
// M4300-96X, M4350-48G4XF, M4500-32C → /business/wired/switches/fully-managed/<slug>/
const slug = model.toLowerCase()
.replace(/[^a-z0-9]/g, "-")
.replace(/-+/g, "-")
.replace(/^-|-$/g, "");
return slug ? `https://www.netgear.com/business/wired/switches/fully-managed/${slug}/` : null;
}
// UfiSpace: slug map derived from sitemap (non-predictable product URL tree)
const UFISPACE_URL_MAP: Record<string, string> = {
"S9510-28DC": "https://www.ufispace.com/products/telco/access/s9510-28dc-flexe-tsn-disaggregated-cell-site-gateway",
"S9600-30DX": "https://www.ufispace.com/products/telco/aggregation/s9600-30dx-open-zr-aggregation-router",
"S9600-32X": "https://www.ufispace.com/products/telco/aggregation/s9600-32x-25g-100g-aggregation-router",
"S9600-72XC": "https://www.ufispace.com/products/telco/aggregation/s9600-72xc-25g-100g-open-aggregation-router-tcam",
"S9700-53DX": "https://www.ufispace.com/products/telco/core-edge/s9700-53dx-100g-core-router",
"S9710-76D": "https://www.ufispace.com/products/telco/core-edge/s9710-76d-high-density-400g-disaggregated-core-router",
};
function buildUfiSpaceUrl(model: string): string | null {
return UFISPACE_URL_MAP[model] ?? null;
}
// QCT: URL map derived from sitemap (category path not predictable from model name)
const QCT_URL_MAP: Record<string, string> = {
"QuantaMesh T3048-LY8": "https://www.qct.io/product/index/Networking/Ethernet-Switch/T3000-Series/QuantaMesh-T3048-LY8",
"QuantaMesh T7032-IX1": "https://www.qct.io/product/index/Networking/Bare-Metal-Switch/Spine-Switch/QuantaMesh-BMS-T7032-IX1",
};
function buildQctUrl(model: string): string | null {
return QCT_URL_MAP[model] ?? null;
}
const URL_BUILDERS: Record<string, (m: string) => string | null> = {
arista: buildAristaUrl,
dell: buildDellUrl,
edgecore: buildEdgecoreUrl,
fortinet: buildFortinetUrl,
"hpe-aruba": buildHpeArubaUrl,
extreme: buildExtremeUrl,
// New JS-rendered vendors (stored product_page_url used where available)
nokia: buildPassthroughUrl,
huawei: buildPassthroughUrl,
ciena: buildPassthroughUrl,
moxa: buildPassthroughUrl,
"d-link": buildPassthroughUrl,
"alcatel-lucent-enterprise": buildPassthroughUrl,
asterfusion: buildPassthroughUrl,
brocade: buildPassthroughUrl,
"nvidia-networking": buildNvidiaUrl,
netgear: buildNetgearUrl,
ufispace: buildUfiSpaceUrl,
"quanta-cloud-technology": buildQctUrl,
};
// ── Request data attached to each crawl URL ──────────────────────────────────
interface SwitchCrawlData {
switchId: string;
model: string;
vendorName: string;
vendorSlug: string;
productPageUrl: string;
}
// ── Main scraper ──────────────────────────────────────────────────────────────
export async function fetchSwitchImagesPlaywright(targetVendorSlug?: string): Promise<void> {
console.log("=== Switch Image Fetcher (Playwright) ===\n");
// Vendor slugs are bound as a query parameter ($1) rather than interpolated into the SQL text,
// so a crafted --vendor value cannot alter the statement.
const targetSlugs = targetVendorSlug ? [targetVendorSlug] : Object.keys(URL_BUILDERS);
const { rows } = await pool.query<{
id: string;
model: string;
vendor_slug: string;
vendor_name: string;
product_page_url: string | null;
}>(`
SELECT sw.id, sw.model, sw.product_page_url,
v.slug AS vendor_slug, v.name AS vendor_name
FROM switches sw
JOIN vendors v ON v.id = sw.vendor_id
WHERE (sw.image_url IS NULL OR sw.image_url = '')
AND v.slug = ANY($1::text[])
ORDER BY v.slug, sw.model
`, [targetSlugs]);
if (rows.length === 0) {
console.log(" All target switches already have images.\n");
return;
}
console.log(` ${rows.length} switches need images (Playwright vendors)\n`);
const requests: Array<{ url: string; uniqueKey: string; userData: SwitchCrawlData }> = [];
for (const row of rows) {
const builder = URL_BUILDERS[row.vendor_slug];
// For Arista: prefer freshly-built series URL over a stale stored model URL
const builtUrl = builder ? builder(row.model) : null;
const productUrl = row.vendor_slug === "arista"
? (builtUrl ?? row.product_page_url) // always use fresh series URL for Arista
: (row.product_page_url ?? builtUrl); // other vendors: prefer stored URL
if (!productUrl) {
console.log(` [SKIP] ${row.vendor_name} ${row.model} — no URL`);
continue;
}
requests.push({
url: productUrl,
// Use switch ID as uniqueKey so Crawlee doesn't deduplicate series-level URLs.
// Multiple models can share the same series page (e.g. 7060x5-series) — each needs its own DB write.
uniqueKey: row.id,
userData: {
switchId: row.id,
model: row.model,
vendorName: row.vendor_name,
vendorSlug: row.vendor_slug,
productPageUrl: productUrl,
},
});
}
if (requests.length === 0) {
console.log(" Nothing to crawl.\n");
return;
}
let found = 0;
let missed = 0;
let errors = 0;
const crawler = new PlaywrightCrawler(
{
maxConcurrency: 1, // one at a time — server-friendly
maxRequestsPerMinute: 12, // ~5s per request minimum
requestHandlerTimeoutSecs: 45,
navigationTimeoutSecs: 30,
headless: true,
launchContext: {
launchOptions: {
args: [
"--no-sandbox",
"--disable-setuid-sandbox",
"--disable-blink-features=AutomationControlled",
"--disable-infobars",
"--window-size=1920,1080",
],
},
},
preNavigationHooks: [
async (_ctx, gotoOptions) => {
gotoOptions!.waitUntil = "domcontentloaded";
},
],
async requestHandler({ request, page }) {
const data = request.userData as SwitchCrawlData;
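// Extra headers apply only to requests made after this point (navigation has already completed);
// the navigator overrides below only affect scripts that read those properties later.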
await page.setExtraHTTPHeaders({
"Accept-Language": "en-US,en;q=0.9",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
});
await page.evaluate((ua) => {
Object.defineProperty(navigator, "userAgent", { value: ua, configurable: true });
Object.defineProperty(navigator, "webdriver", { value: false, configurable: true });
}, STEALTH_UA);
// Wait for page to settle (JS rendering)
await page.waitForLoadState("networkidle", { timeout: 20_000 }).catch(() => {});
// Extract og:image / twitter:image meta tags.
// We DON'T filter generics here — we filter outside so the img fallback can still run
// even when og:image exists but is a logo/brand image (e.g. Dell, HPE).
const metaImageUrl: string | null = await page.evaluate(() => {
const og = document.querySelector<HTMLMetaElement>('meta[property="og:image"]')?.content;
if (og) return og;
const tw = document.querySelector<HTMLMetaElement>('meta[name="twitter:image"]')?.content;
return tw ?? null;
});
// Use meta image if it passes the generic filter; otherwise fall through to img fallback.
let imageUrl: string | null = (metaImageUrl && !isGenericImage(metaImageUrl)) ? metaImageUrl : null;
if (!imageUrl) {
// Img fallback: largest visible image that isn't a UI element.
// Deliberately broad — isGenericImage() will filter hero/banner/logo images afterward.
imageUrl = await page.evaluate(() => {
const imgs = Array.from(document.querySelectorAll<HTMLImageElement>("img"));
const skipPattern = /\/flags?\/|\/icons?\/|\/avatars?\/|social[-_]icon|favicon|spinner|loading|cookielaw|cookiebot|trustarc/i;
const candidate = imgs
.filter((img) => {
const src = img.src || img.getAttribute("data-src") || "";
return src.startsWith("http") &&
/\.(jpg|jpeg|png|webp)/i.test(src) &&
img.naturalWidth >= 200 &&
img.naturalHeight >= 150 &&
!skipPattern.test(src);
})
.sort((a, b) => (b.naturalWidth * b.naturalHeight) - (a.naturalWidth * a.naturalHeight))[0];
return candidate?.src ?? null;
});
}
if (!imageUrl || isGenericImage(imageUrl)) {
console.log(` [MISS] ${data.vendorName} ${data.model} — no product image (${imageUrl?.slice(0, 60) ?? "null"})`);
missed++;
// Record the attempt even on a miss; product_page_url is only filled when none is stored yet
await pool.query(
`UPDATE switches SET product_page_url = COALESCE(product_page_url, $2), assets_scraped_at = NOW() WHERE id = $1`,
[data.switchId, request.url],
);
return;
}
await pool.query(
`UPDATE switches
SET image_url = $2,
product_page_url = COALESCE(product_page_url, $3),
assets_scraped_at = NOW()
WHERE id = $1`,
[data.switchId, imageUrl, request.url],
);
console.log(` [OK] ${data.vendorName} ${data.model} -> ${imageUrl.slice(0, 80)}`);
found++;
},
async failedRequestHandler({ request }) {
const data = request.userData as SwitchCrawlData;
console.log(` [FAIL] ${data.vendorName} ${data.model} -> ${request.errorMessages?.[0] ?? "unknown error"}`);
errors++;
},
},
// Use a unique run ID to avoid Crawlee temp-dir state contamination when multiple
// vendor runs execute back-to-back (ENOENT: stale request-queue files from prior run).
makeCrawleeConfig(`switch-images-playwright-${Date.now()}`),
);
await crawler.run(requests);
console.log(`\n=== Playwright Image Scraper Complete ===`);
console.log(` Images found: ${found}`);
console.log(` Missed: ${missed}`);
if (errors > 0) console.warn(` Errors: ${errors}`);
}
if (require.main === module) {
const vendor = process.argv.find((a) => a.startsWith("--vendor="))?.split("=")[1];
fetchSwitchImagesPlaywright(vendor)
.then(() => pool.end())
.catch(async (err) => { console.error("Fatal:", err); await pool.end().catch(() => {}); process.exit(1); });
}
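
The selection order in the request handler above (meta image first, then the largest `<img>` on the page) can be isolated as a pure function and tested without a browser. The sketch below is a hypothetical helper, not part of the scraper: `ImageCandidate`, `pickProductImage`, and the two-pattern `isGeneric` stand-in are assumptions for illustration. It also differs from the handler in one respect: the handler picks the largest image and then rejects it if it is generic, whereas this sketch filters generic candidates before sorting.

```typescript
// Hypothetical helper mirroring the handler's selection order without Playwright.
interface ImageCandidate {
  src: string;
  width: number;
  height: number;
}

// Reduced stand-in for GENERIC_IMAGE_PATTERNS; the real list is much longer.
const SAMPLE_GENERIC_PATTERNS: RegExp[] = [/[-/_]logo[-_.]|\/logos?\//i, /\/banners?\//i];

function isGeneric(url: string): boolean {
  return SAMPLE_GENERIC_PATTERNS.some((re) => re.test(url));
}

function pickProductImage(metaImage: string | null, candidates: ImageCandidate[]): string | null {
  // 1. Prefer og:image / twitter:image when it is not a generic brand asset.
  if (metaImage && !isGeneric(metaImage)) return metaImage;
  // 2. Otherwise take the largest remaining candidate by pixel area.
  //    Assumption: generic images are dropped before sorting (the handler filters after).
  const usable = candidates
    .filter((c) => c.width >= 200 && c.height >= 150 && !isGeneric(c.src))
    .sort((a, b) => b.width * b.height - a.width * a.height);
  return usable[0]?.src ?? null;
}
```

Filtering before sorting lets a smaller product photo win when the largest image on the page is a banner. Whether that is preferable to the handler's stricter behaviour depends on how often vendor pages mix the two.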

scripts/monitor-erik.sh Executable file

@ -0,0 +1,166 @@
#!/bin/bash
# ─────────────────────────────────────────────────────────────────────────────
# TIP / Erik Health Monitor — run on Claudi (.82) or Raspberry Pi via cron
#
# Checks:
# 1. TIP API health endpoint (CPU load, memory, disk)
# 2. PM2 process status on Erik (errored/stopped processes)
# 3. DB query roundtrip latency
#
# Alerting:
# - ntfy.sh push notification (set NTFY_TOPIC below)
# - Appends to /var/log/tip-monitor.log
#
# Setup (Claudi / Pi):
# chmod +x monitor-erik.sh
# sudo cp monitor-erik.sh /usr/local/bin/tip-monitor
#
# Add to crontab (every 5 minutes):
# */5 * * * * /usr/local/bin/tip-monitor >> /var/log/tip-monitor.log 2>&1
#
# Or for systemd timer — see tip-monitor.service / tip-monitor.timer below
# ─────────────────────────────────────────────────────────────────────────────
set -euo pipefail
# ── Config ───────────────────────────────────────────────────────────────────
TIP_API="${TIP_API:-https://transceiver-db.context-x.org/api/health}"
NTFY_TOPIC="${NTFY_TOPIC:-}" # e.g. "tip-erik-alerts"
SSH_TARGET="${SSH_TARGET:-root@82.165.222.127}" # Erik IONOS direct (Claudi key authorized)
LOAD_WARN="${LOAD_WARN:-4.0}" # 1m load warning threshold
DISK_WARN="${DISK_WARN:-85}" # disk % warning threshold
MEM_WARN="${MEM_WARN:-90}" # memory % warning threshold
LOG_FILE="${LOG_FILE:-/var/log/tip-monitor.log}"
# ── Helpers ──────────────────────────────────────────────────────────────────
TS() { date '+%Y-%m-%d %H:%M:%S'; }
log() { echo "[$(TS)] $*"; }
warn() { echo "[$(TS)] ⚠️ WARN: $*"; }
crit() { echo "[$(TS)] 🔴 CRIT: $*"; alert "$*"; }
ALERTS=()
alert() {
ALERTS+=("$1")
if [[ -n "$NTFY_TOPIC" ]]; then
curl -s -m 5 \
-H "Title: TIP/Erik Alert" \
-H "Tags: warning,server" \
-H "Priority: urgent" \
-d "$1" \
"https://ntfy.sh/${NTFY_TOPIC}" > /dev/null 2>&1 || true
fi
}
# ── 1. TIP API health check ──────────────────────────────────────────────────
log "Checking TIP API health …"
HEALTH_JSON=""
HTTP_CODE=0
# On connection failure curl itself prints "000" via -w, so only the exit status needs neutralising.
HTTP_CODE=$(curl -s -m 15 -o /tmp/tip-health.json -w "%{http_code}" "$TIP_API" 2>/dev/null || true)
if [[ "$HTTP_CODE" != "200" ]]; then
crit "TIP API unreachable (HTTP $HTTP_CODE) — $TIP_API"
else
HEALTH_JSON=$(cat /tmp/tip-health.json 2>/dev/null || echo "{}")
# Extract fields (requires jq)
if command -v jq &>/dev/null; then
LOAD1=$(echo "$HEALTH_JSON" | jq -r '.system.load."1m" // "N/A"')
MEM_PCT=$(echo "$HEALTH_JSON" | jq -r '.system.memory.used_pct // "N/A"')
DISK_PCT=$(echo "$HEALTH_JSON" | jq -r '.system.disk.used_pct // "N/A"')
DISK_FREE=$(echo "$HEALTH_JSON"| jq -r '.system.disk.free_gb // "N/A"')
DB_LAT=$(echo "$HEALTH_JSON" | jq -r '.database.latency_ms // "N/A"')
STATUS=$(echo "$HEALTH_JSON" | jq -r '.status // "unknown"')
log " Status: $STATUS | Load: $LOAD1 | Mem: ${MEM_PCT}% | Disk: ${DISK_PCT}% (${DISK_FREE}GB free) | DB: ${DB_LAT}ms"
# Load check
if command -v bc &>/dev/null && [[ "$LOAD1" != "N/A" ]]; then
if (( $(echo "$LOAD1 > $LOAD_WARN" | bc -l) )); then
crit "High load on Erik: $LOAD1 (threshold $LOAD_WARN)"
fi
fi
# Memory check (compare the integer part so fractional values such as 45.3 do not break -ge)
if [[ "$MEM_PCT" != "N/A" ]] && [[ "${MEM_PCT%.*}" -ge "$MEM_WARN" ]]; then
crit "High memory usage on Erik: ${MEM_PCT}%"
fi
# Disk check (same integer-part comparison)
if [[ "$DISK_PCT" != "N/A" ]] && [[ "${DISK_PCT%.*}" -ge "$DISK_WARN" ]]; then
crit "Disk usage on Erik: ${DISK_PCT}% (${DISK_FREE}GB free)"
fi
# DB latency
if command -v bc &>/dev/null && [[ "$DB_LAT" != "N/A" ]]; then
if (( $(echo "$DB_LAT > 2000" | bc -l) )); then
warn "High DB latency: ${DB_LAT}ms"
fi
fi
else
log " API OK (HTTP 200) — install jq for detailed metrics"
fi
fi
# ── 2. PM2 process check (via SSH) ──────────────────────────────────────────
log "Checking PM2 processes on Erik …"
if ssh -o ConnectTimeout=10 -o BatchMode=yes "$SSH_TARGET" true 2>/dev/null; then
ERRORED=$(ssh -o ConnectTimeout=10 -o BatchMode=yes "$SSH_TARGET" \
"pm2 list --no-color 2>/dev/null | grep -E 'errored|stopped' | grep -v 'ecosystem-stable'" \
2>/dev/null || echo "")
if [[ -n "$ERRORED" ]]; then
COUNT=$(echo "$ERRORED" | wc -l | tr -d ' ')
crit "${COUNT} PM2 process(es) errored/stopped on Erik"
log " Errored: $ERRORED"
else
log " PM2: all processes running"
fi
# Check restart counts (> 5 in the last run = likely crashing)
HIGH_RESTARTS=$(ssh -o ConnectTimeout=10 -o BatchMode=yes "$SSH_TARGET" \
"pm2 list --no-color 2>/dev/null | awk 'NR>3 && \$16~/^[0-9]+$/ && \$16+0 > 5 {print \$2, \"restarts:\", \$16}'" \
2>/dev/null || echo "")
if [[ -n "$HIGH_RESTARTS" ]]; then
warn "High restart count: $HIGH_RESTARTS"
fi
else
crit "SSH connection to Erik failed (via $SSH_TARGET)"
fi
# ── 3. Summary ───────────────────────────────────────────────────────────────
if [[ ${#ALERTS[@]} -eq 0 ]]; then
log "✅ All checks passed"
else
log "🔴 ${#ALERTS[@]} alert(s) sent"
fi
# ── Optional: once the log exceeds 5000 lines, keep only the last 2500 ───────
if [[ -f "$LOG_FILE" ]] && [[ $(wc -l < "$LOG_FILE") -gt 5000 ]]; then
tail -n 2500 "$LOG_FILE" > /tmp/tip-monitor-trim && mv /tmp/tip-monitor-trim "$LOG_FILE"
fi
# ── Systemd unit (paste into /etc/systemd/system/tip-monitor.service) ────────
# [Unit]
# Description=TIP/Erik Health Monitor
# After=network.target
#
# [Service]
# Type=oneshot
# ExecStart=/usr/local/bin/tip-monitor
# StandardOutput=append:/var/log/tip-monitor.log
# StandardError=append:/var/log/tip-monitor.log
# Environment=NTFY_TOPIC=tip-erik-alerts
#
# [Install]
# WantedBy=multi-user.target
#
# Systemd timer (paste into /etc/systemd/system/tip-monitor.timer):
# [Unit]
# Description=Run TIP monitor every 5 minutes
# [Timer]
# OnBootSec=60
# OnUnitActiveSec=300
# [Install]
# WantedBy=timers.target
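
The threshold checks in the script can also be expressed as a pure function, which makes the float-versus-integer handling explicit. The following is a minimal sketch under stated assumptions: the `HealthPayload` shape is inferred only from the `jq` paths used above (`.system.load."1m"`, `.system.memory.used_pct`, `.system.disk.used_pct`, `.database.latency_ms`), and `evaluateHealth` and `DEFAULT_THRESHOLDS` are hypothetical names, not part of the TIP codebase.

```typescript
// Assumed payload shape, inferred from the jq paths in monitor-erik.sh.
interface HealthPayload {
  status?: string;
  system?: {
    load?: { "1m"?: number };
    memory?: { used_pct?: number };
    disk?: { used_pct?: number; free_gb?: number };
  };
  database?: { latency_ms?: number };
}

interface Thresholds {
  loadWarn: number;
  memWarn: number;
  diskWarn: number;
  dbLatencyWarnMs: number;
}

// Defaults copied from the script's LOAD_WARN / MEM_WARN / DISK_WARN and the 2000 ms DB check.
const DEFAULT_THRESHOLDS: Thresholds = { loadWarn: 4.0, memWarn: 90, diskWarn: 85, dbLatencyWarnMs: 2000 };

function evaluateHealth(h: HealthPayload, t: Thresholds = DEFAULT_THRESHOLDS): string[] {
  const findings: string[] = [];
  const load = h.system?.load?.["1m"];
  const mem = h.system?.memory?.used_pct;
  const disk = h.system?.disk?.used_pct;
  const dbMs = h.database?.latency_ms;
  // Missing fields are skipped, like the "N/A" guards in the shell version.
  if (typeof load === "number" && load > t.loadWarn) findings.push(`CRIT: high load ${load} (threshold ${t.loadWarn})`);
  if (typeof mem === "number" && mem >= t.memWarn) findings.push(`CRIT: high memory usage ${mem}%`);
  if (typeof disk === "number" && disk >= t.diskWarn) findings.push(`CRIT: disk usage ${disk}%`);
  if (typeof dbMs === "number" && dbMs > t.dbLatencyWarnMs) findings.push(`WARN: high DB latency ${dbMs}ms`);
  return findings;
}
```

Because the comparison happens on numbers, a fractional `used_pct` needs no special handling here, unlike an integer-only `-ge` test in Bash.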

@ -0,0 +1,24 @@
-- Migration 044 — Fix Cisco 8000 series product_page_url
--
-- The previous scraper run incorrectly stored the NCS-5500 series URL
-- for Cisco 8000-series SP router models (8101-32FH, 8202-32FH, etc).
-- The correct page is the 8000-series family page on Cisco's new /site/ URL scheme.
--
-- After this migration, the image scraper will re-fetch these 35 switches
-- using the updated buildCiscoUrl() which now returns the correct family URL.
-- 1. Clear the wrongly-stored NCS-5500 product_page_url so the scraper rebuilds it
UPDATE switches
SET product_page_url = NULL,
assets_scraped_at = NULL
WHERE product_page_url = 'https://www.cisco.com/c/en/us/products/routers/network-convergence-system-5500-series/index.html'
AND image_url IS NULL;
-- 2. Pre-set the correct 8000-series family URL for all 8000-series models without an image
-- so the next scraper run hits the right page immediately
UPDATE switches
SET product_page_url = 'https://www.cisco.com/site/us/en/products/networking/sdwan-routers/8000-series/index.html',
assets_scraped_at = NULL
WHERE image_url IS NULL
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'cisco')
AND model ~ '^8[0-9]{3}';
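
To see which model strings the second `UPDATE` touches, the same pattern can be tried outside the database. This is a small sketch: `isCisco8000Model` is a hypothetical name, and the sample models other than `8101-32FH` and `8202-32FH` are illustrative inputs rather than rows known to exist in the `switches` table.

```typescript
// Same pattern as the migration's WHERE clause: model ~ '^8[0-9]{3}'
// (anchored at the start, case-sensitive, "8" followed by three digits).
const CISCO_8000_MODEL = /^8[0-9]{3}/;

function isCisco8000Model(model: string): boolean {
  return CISCO_8000_MODEL.test(model);
}

// Illustrative inputs; only the first two appear in the migration's comments.
const sampleModels = ["8101-32FH", "8202-32FH", "NCS-5501", "C8500-12X"];
const matched = sampleModels.filter(isCisco8000Model);
console.log(matched);
```

Models with a letter prefix do not match because of the `^` anchor, so the update stays limited to names that begin with the four-digit 8xxx number.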

@ -0,0 +1,43 @@
-- Migration 045 — Edgecore product images (direct URL injection)
--
-- Edgecore blocks headless browsers (Playwright gets 403) but serves og:image
-- from their WooCommerce site via plain HTTP. The AS7xxx enterprise switches
-- (7535, 7726, 7946, 9516) are not listed on edge-core.com at all.
--
-- Source: og:image extracted with curl from each /product/<slug>/ page.
-- Images verified as actual product photos (not logos / generic).
UPDATE switches
SET image_url = 'https://www.edge-core.com/wp-content/uploads/2023/08/DCS204-A.png',
product_page_url = COALESCE(product_page_url, 'https://www.edge-core.com/product/dcs204/'),
assets_scraped_at = NOW()
WHERE model = 'DCS204'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'edgecore');
UPDATE switches
SET image_url = 'https://www.edge-core.com/wp-content/uploads/2023/08/DCS510-A.png',
product_page_url = COALESCE(product_page_url, 'https://www.edge-core.com/product/dcs510/'),
assets_scraped_at = NOW()
WHERE model = 'DCS510'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'edgecore');
UPDATE switches
SET image_url = 'https://www.edge-core.com/wp-content/uploads/2023/08/dcs810-A.png',
product_page_url = COALESCE(product_page_url, 'https://www.edge-core.com/product/dcs810/'),
assets_scraped_at = NOW()
WHERE model = 'DCS810'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'edgecore');
UPDATE switches
SET image_url = 'https://www.edge-core.com/wp-content/uploads/2023/08/EPS203-A.png',
product_page_url = COALESCE(product_page_url, 'https://www.edge-core.com/product/eps203/'),
assets_scraped_at = NOW()
WHERE model = 'EPS203'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'edgecore');
UPDATE switches
SET image_url = 'https://www.edge-core.com/wp-content/uploads/2023/08/AS8000-A.png',
product_page_url = COALESCE(product_page_url, 'https://www.edge-core.com/product/minipack-as8000-open-modular-platform/'),
assets_scraped_at = NOW()
WHERE model = 'Minipack2'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'edgecore');
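
Each statement in this migration follows the same shape, so the pattern could also be generated from data. The sketch below shows one way to do that with bound parameters. `ImageInjection`, `ParameterizedQuery`, and `buildImageInjectionQuery` are hypothetical names. The `{ text, values }` object mirrors the query-config form accepted by node-postgres, which is assumed here to be the client behind `pool`.

```typescript
// Hypothetical shape for one direct-injection entry; field names are assumptions.
interface ImageInjection {
  vendorSlug: string;
  model: string;
  imageUrl: string;
  productPageUrl?: string;
}

interface ParameterizedQuery {
  text: string;
  values: Array<string | null>;
}

function buildImageInjectionQuery(entry: ImageInjection): ParameterizedQuery {
  return {
    // Same statement shape as the migration, with literals replaced by $1..$4.
    text: `UPDATE switches
SET image_url = $1,
    product_page_url = COALESCE(product_page_url, $2),
    assets_scraped_at = NOW()
WHERE model = $3
  AND vendor_id = (SELECT id FROM vendors WHERE slug = $4)`,
    // A missing product page becomes NULL, so COALESCE keeps whatever is already stored.
    values: [entry.imageUrl, entry.productPageUrl ?? null, entry.model, entry.vendorSlug],
  };
}
```

Static SQL files keep a reviewable record of every injected URL. A generator like this is mainly useful for producing or validating those files, not for replacing them.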

@ -0,0 +1,14 @@
-- Migration 046 — QCT / Quanta Cloud Technology product images (direct URL injection)
--
-- QCT T3048-LY8: og:image accessible via plain HTTP on qct.io
-- URL: qct.io/product/index/Networking/Ethernet-Switch/T3000-Series/QuantaMesh-T3048-LY8
-- og:image: qct.io/upload/website/product/gallery/normal/NetworkSwitch-QuantaMesh-T3048-LY8_FrontView02-740x460_16111812291.png
--
-- T7032-IX1 and T7064-*/T9032-* have no accessible og:image (JS-rendered or no dedicated page).
UPDATE switches
SET image_url = 'https://www.qct.io/upload/website/product/gallery/normal/NetworkSwitch-QuantaMesh-T3048-LY8_FrontView02-740x460_16111812291.png',
product_page_url = COALESCE(product_page_url, 'https://www.qct.io/product/index/Networking/Ethernet-Switch/T3000-Series/QuantaMesh-T3048-LY8'),
assets_scraped_at = NOW()
WHERE model = 'QuantaMesh T3048-LY8'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'quanta-cloud-technology');

@ -0,0 +1,42 @@
-- Migration 047 — Moxa product images (direct CDN URL injection)
--
-- CDN base: cdn-cms-frontdoor-dfc8ebanh6bkb3hs.a02.azurefd.net
-- Path pattern: /en/getattachment/Products/INDUSTRIAL-NETWORK-INFRASTRUCTURE/...
--
-- ⚠️ Hotlink-protected: CDN requires Referer: https://www.moxa.com/
-- Images will not display directly from third-party domains.
-- Use Cloudflare Worker proxy or download to R2 for production display.
--
-- All URLs verified HTTP 200 with correct Referer (2026-04-21).
-- EDS-518E — Layer-2 Managed Switch (14 + 4G-port)
UPDATE switches
SET image_url = 'https://cdn-cms-frontdoor-dfc8ebanh6bkb3hs.a02.azurefd.net/en/getattachment/Products/INDUSTRIAL-NETWORK-INFRASTRUCTURE/Ethernet-Switches/Layer-2-Managed-Switches/EDS-518E-Series/moxa-eds-518e-series-image-1-(1).jpg',
product_page_url = COALESCE(product_page_url, 'https://www.moxa.com/en/products/industrial-network-infrastructure/ethernet-switches/layer-2-managed-switches/eds-518e-series'),
assets_scraped_at = NOW()
WHERE model = 'EDS-518E'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'moxa');
-- EDS-G4014 — Layer-2 Managed Switch (14-port Gigabit)
UPDATE switches
SET image_url = 'https://cdn-cms-frontdoor-dfc8ebanh6bkb3hs.a02.azurefd.net/en/getattachment/Products/INDUSTRIAL-NETWORK-INFRASTRUCTURE/Ethernet-Switches/Layer-2-Managed-Switches/EDS-G4014-Series/moxa-eds-g4014-series-image-(1).jpg',
product_page_url = COALESCE(product_page_url, 'https://www.moxa.com/en/products/industrial-network-infrastructure/ethernet-switches/layer-2-managed-switches/eds-g4014-series'),
assets_scraped_at = NOW()
WHERE model = 'EDS-G4014'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'moxa');
-- ICS-G7826A — Rackmount Industrial Managed Switch (26-port)
UPDATE switches
SET image_url = 'https://cdn-cms-frontdoor-dfc8ebanh6bkb3hs.a02.azurefd.net/en/getattachment/Products/INDUSTRIAL-NETWORK-INFRASTRUCTURE/Ethernet-Switches/Rackmount-Switches/ICS-G7826A-Series/moxa-ics-g7826a-series-image-(1).jpg',
product_page_url = COALESCE(product_page_url, 'https://www.moxa.com/en/products/industrial-network-infrastructure/ethernet-switches/rackmount-switches/ics-g7826a-series'),
assets_scraped_at = NOW()
WHERE model = 'ICS-G7826A'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'moxa');
-- IKS-G6824A — Rackmount Industrial Managed Switch (24-port)
UPDATE switches
SET image_url = 'https://cdn-cms-frontdoor-dfc8ebanh6bkb3hs.a02.azurefd.net/en/getattachment/Products/INDUSTRIAL-NETWORK-INFRASTRUCTURE/Ethernet-Switches/Rackmount-Switches/IKS-G6824A-Series/moxa-iks-g6824a-series-image-(1).jpg',
product_page_url = COALESCE(product_page_url, 'https://www.moxa.com/en/products/industrial-network-infrastructure/ethernet-switches/rackmount-switches/iks-g6824a-series'),
assets_scraped_at = NOW()
WHERE model = 'IKS-G6824A'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'moxa');
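
Because the Moxa CDN only serves these images when the request carries the right `Referer`, whatever proxies them needs to know which hosts require which header. The sketch below is one possible lookup. `REFERER_BY_HOST` and `headersForImage` are hypothetical names, and the single entry reflects only what the migration comment states.

```typescript
// Hosts that require a Referer, per the migration notes; the map itself is a hypothetical structure.
const REFERER_BY_HOST: Record<string, string> = {
  "cdn-cms-frontdoor-dfc8ebanh6bkb3hs.a02.azurefd.net": "https://www.moxa.com/",
};

// Returns the extra request headers a server-side proxy would need for this image URL.
function headersForImage(imageUrl: string): Record<string, string> {
  const host = new URL(imageUrl).hostname;
  const referer = REFERER_BY_HOST[host];
  return referer ? { Referer: referer } : {};
}
```

A proxy (for example the Cloudflare Worker mentioned in the comment) would attach these headers to its upstream request. Whether a given runtime allows setting `Referer` on outgoing requests should be checked for that runtime.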

@ -0,0 +1,66 @@
-- Migration 048 — UfiSpace and Brocade product images (direct URL injection)
--
-- UfiSpace: images served from ufispace.com/image/<hash>/<filename>
-- No og:image meta tags — images extracted from product page carousels.
-- All URLs verified HTTP 200 (2026-04-21).
--
-- Brocade G720/G730: og:image from broadcom.com (Broadcom acquired Brocade's Fibre Channel business).
-- ICX 7850-48FS: the ICX line now belongs to CommScope/Ruckus; the image URL carries a
-- rotating session token and is not stable, so it is skipped.
-- ── UfiSpace ─────────────────────────────────────────────────────────────────
UPDATE switches
SET image_url = 'https://www.ufispace.com/image/5R/9510-28DC-front-2026.png',
assets_scraped_at = NOW()
WHERE model = 'S9510-28DC'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'ufispace');
UPDATE switches
SET image_url = 'https://www.ufispace.com/image/2n/3475633079f4f0df148772926dd278c9.png',
assets_scraped_at = NOW()
WHERE model = 'S9600-30DX'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'ufispace');
UPDATE switches
SET image_url = 'https://www.ufispace.com/image/24/5dd78db3fb82420f59d164e35131b476.png',
assets_scraped_at = NOW()
WHERE model = 'S9600-32X'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'ufispace');
UPDATE switches
SET image_url = 'https://www.ufispace.com/image/2D/9b12bdf9033020045872a3e55132d7b9.png',
assets_scraped_at = NOW()
WHERE model = 'S9600-72XC'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'ufispace');
UPDATE switches
SET image_url = 'https://www.ufispace.com/image/x/f0edea5710efc9ce351b742b222f03d1.png',
assets_scraped_at = NOW()
WHERE model = 'S9700-53DX'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'ufispace');
UPDATE switches
SET image_url = 'https://www.ufispace.com/image/2V/aa3e530e3555baacedbc6e603c1fc331.png',
assets_scraped_at = NOW()
WHERE model = 'S9710-76D'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'ufispace');
-- ── Brocade (via Broadcom) ────────────────────────────────────────────────────
UPDATE switches
SET image_url = 'https://www.broadcom.com/media/blt4ac44e0e6c6d8341/bltf8d09763812cf984/604f5eb61078bc20548c0494/g720-right_283_29.jpeg',
product_page_url = COALESCE(product_page_url, 'https://www.broadcom.com/products/fibre-channel-networking/switches/g720-switch'),
assets_scraped_at = NOW()
WHERE model = 'G720'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'brocade');
UPDATE switches
SET image_url = 'https://www.broadcom.com/media/blt4ac44e0e6c6d8341/blt1d11847b97f678d0/62030d79d6534f0c057188c3/Brocade_G730_Left.jpeg',
product_page_url = COALESCE(product_page_url, 'https://www.broadcom.com/products/fibre-channel-networking/switches/g730-switch'),
assets_scraped_at = NOW()
WHERE model = 'G730'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'brocade');
-- ICX 7850-48FS: now CommScope/Ruckus — image URL uses rotating session token,
-- not stable enough to store. Left as NULL pending a stable image source.

@ -0,0 +1,56 @@
-- Migration 049 — NVIDIA Networking (Spectrum) switch product images
--
-- Sources:
-- SN2201, SN3700, SN4700: docscontent.nvidia.com (official NVIDIA docs CDN,
-- backed by k3-prod-nvidia-docs.s3.us-west-2.amazonaws.com)
-- SN3750-SX: cdn.uvation.com (reseller CDN — no official NVIDIA front-view photo)
-- SN5400, SN5600: direct S3 from k3-prod-nvidia-docs (SN5000 hardware manual)
--
-- All URLs verified HTTP 200 with an image content type (2026-04-21).
-- SN2201 — Spectrum-1, 1GbE management switch
UPDATE switches
SET image_url = 'https://docscontent.nvidia.com/dims4/default/0ed212d/2147483647/strip/true/crop/1487x152+0+0/resize/1440x147!/quality/90/?url=https%3A%2F%2Fk3-prod-nvidia-docs.s3.us-west-2.amazonaws.com%2Fbrightspot%2Fconfluence%2F0000019a-2ff0-da13-abfe-bffbc48b0000%2Fimages%2Fdownload%2Fattachments%2F4232636769%2Fimage2021-12-7_10-37-52-version-1-modificationdate-1756395299567-api-v2.png',
product_page_url = COALESCE(product_page_url, 'https://marketplace.nvidia.com/en-us/enterprise/networking/sn2201/'),
assets_scraped_at = NOW()
WHERE model = 'SN2201'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nvidia-networking');
-- SN3700 — Spectrum-2, 32x200GbE
UPDATE switches
SET image_url = 'https://docscontent.nvidia.com/dims4/default/3be2526/2147483647/strip/true/crop/1333x142+0+0/resize/1333x142!/quality/90/?url=https%3A%2F%2Fk3-prod-nvidia-docs.s3.us-west-2.amazonaws.com%2Fbrightspot%2Fconfluence%2F0000019a-4e93-d062-adbe-ce933de80000%2Fimages%2Fdownload%2Fattachments%2F4413914428%2Fimage2019-2-25_11-38-47-version-1-modificationdate-1761741936620-api-v2.png',
product_page_url = COALESCE(product_page_url, 'https://marketplace.nvidia.com/en-us/enterprise/networking/sn3700/'),
assets_scraped_at = NOW()
WHERE model = 'SN3700'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nvidia-networking');
-- SN3750-SX — Spectrum-2, 32x200GbE (reseller CDN; no official NVIDIA front photo)
UPDATE switches
SET image_url = 'https://cdn.uvation.com/uvationmarketplace/catalog/product/m/s/msn3750-vs2fsc_1.jpg',
assets_scraped_at = NOW()
WHERE model = 'SN3750-SX'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nvidia-networking');
-- SN4700 — Spectrum-3, 32x400GbE
UPDATE switches
SET image_url = 'https://docscontent.nvidia.com/dims4/default/019a2aa/2147483647/strip/true/crop/1791x188+0+0/resize/1440x151!/quality/90/?url=https%3A%2F%2Fk3-prod-nvidia-docs.s3.us-west-2.amazonaws.com%2Fbrightspot%2Fconfluence%2F0000019d-86b0-ddad-a3bf-eff5dced0000%2Fimages%2Fdownload%2Fattachments%2F4794381944%2Fimage2020-5-3_12-15-57-version-1-modificationdate-1775996206557-api-v2.png',
product_page_url = COALESCE(product_page_url, 'https://marketplace.nvidia.com/en-us/enterprise/networking/sn4700/'),
assets_scraped_at = NOW()
WHERE model = 'SN4700'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nvidia-networking');
-- SN5400 — Spectrum-4, 64x400GbE, 2U
UPDATE switches
SET image_url = 'https://k3-prod-nvidia-docs.s3.us-west-2.amazonaws.com/brightspot/confluence/0000019d-1a8d-dcc0-a39f-dacdabb80000/images/download/attachments/2705811518/image-2025-2-9_11-39-27-version-1-modificationdate-1744286748050-api-v2.png',
product_page_url = COALESCE(product_page_url, 'https://www.nvidia.com/en-us/networking/spectrumx/'),
assets_scraped_at = NOW()
WHERE model = 'SN5400'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nvidia-networking');
-- SN5600 — Spectrum-4, 64x800GbE, 2U
UPDATE switches
SET image_url = 'https://k3-prod-nvidia-docs.s3.us-west-2.amazonaws.com/brightspot/confluence/0000019d-1a8d-dcc0-a39f-dacdabb80000/images/download/attachments/2705811518/image-2025-2-9_11-37-20-version-1-modificationdate-1744286748283-api-v2.png',
product_page_url = COALESCE(product_page_url, 'https://www.nvidia.com/en-us/networking/spectrumx/'),
assets_scraped_at = NOW()
WHERE model = 'SN5600'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nvidia-networking');

@ -0,0 +1,28 @@
-- Migration 050 — Allied Telesis product images (direct URL injection)
--
-- Source: alliedtelesis.com og:image (Drupal CMS, static files CDN)
-- All URLs verified HTTP 200 image/png (2026-04-21).
-- AT-x530-28GSX — x530 Series Gigabit PoE+ Smart Access Switch
UPDATE switches
SET image_url = 'https://www.alliedtelesis.com/sites/default/files/image/2022-07/x530-series-3840.png',
product_page_url = COALESCE(product_page_url, 'https://www.alliedtelesis.com/products/switches/x530-series'),
assets_scraped_at = NOW()
WHERE model = 'AT-x530-28GSX'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'allied-telesis');
-- AT-x530L-52GPX — x530L Series PoE+ Access Switch
UPDATE switches
SET image_url = 'https://www.alliedtelesis.com/sites/default/files/image/2021-11/x530L-series-3840.png',
product_page_url = COALESCE(product_page_url, 'https://www.alliedtelesis.com/products/switches/x530l-series'),
assets_scraped_at = NOW()
WHERE model = 'AT-x530L-52GPX'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'allied-telesis');
-- AT-x950-28XSQ — x950 Series Aggregation Switch
UPDATE switches
SET image_url = 'https://www.alliedtelesis.com/sites/default/files/image/2021-11/x950-28-52-3840.png',
product_page_url = COALESCE(product_page_url, 'https://www.alliedtelesis.com/products/switches/x950-series'),
assets_scraped_at = NOW()
WHERE model = 'AT-x950-28XSQ'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'allied-telesis');
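-- Optional post-migration check (an illustrative sketch, not part of migration 050).
-- It assumes only the tables and columns already used above. One row per model is
-- expected; a missing row would suggest that the corresponding UPDATE's WHERE
-- clause matched nothing (for example, a model string that differs in the table).
SELECT s.model, s.image_url, s.assets_scraped_at
FROM switches s
JOIN vendors v ON v.id = s.vendor_id
WHERE v.slug = 'allied-telesis'
  AND s.model IN ('AT-x530-28GSX', 'AT-x530L-52GPX', 'AT-x950-28XSQ');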

@ -0,0 +1,21 @@
-- Migration 051 — TP-Link product images (direct URL injection)
--
-- Source: tp-link.com og:image (static CDN static.tp-link.com)
-- URL pattern: /upload/image-line/{MODEL}_{REGION}_{HW}_{VIEW}_large_{TIMESTAMP}.jpg
--   ({VIEW} is F for TL-SG3452XP and 01 for TL-SX3016F)
-- All URLs verified HTTP 200 image/jpeg (2026-04-21).
-- TL-SG3452XP — 48-Port Gigabit PoE+ Smart Switch
UPDATE switches
SET image_url = 'https://static.tp-link.com/upload/image-line/TL-SG3452XP_UN_1.0_F_large_20211223063015g.jpg',
product_page_url = COALESCE(product_page_url, 'https://www.tp-link.com/en/business-networking/managed-switch/tl-sg3452xp/'),
assets_scraped_at = NOW()
WHERE model = 'TL-SG3452XP'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'tp-link');
-- TL-SX3016F — 16-Port 10GE SFP+ Smart Switch
UPDATE switches
SET image_url = 'https://static.tp-link.com/upload/image-line/TL-SX3016F_UN_1.0_01_large_20210924060640m.jpg',
product_page_url = COALESCE(product_page_url, 'https://www.tp-link.com/en/business-networking/managed-switch/tl-sx3016f/'),
assets_scraped_at = NOW()
WHERE model = 'TL-SX3016F'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'tp-link');
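-- Optional sanity check (an illustrative sketch, not part of migration 051).
-- Lists TP-Link rows whose image_url does not follow the static.tp-link.com
-- pattern described in the header. Note that "_" in a LIKE pattern matches any
-- single character, so this is a loose check rather than an exact one. Rows set
-- by other migrations or by the scraper may also appear here.
SELECT s.model, s.image_url
FROM switches s
JOIN vendors v ON v.id = s.vendor_id
WHERE v.slug = 'tp-link'
  AND s.image_url IS NOT NULL
  AND s.image_url NOT LIKE 'https://static.tp-link.com/upload/image-line/%_large_%.jpg';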

@ -0,0 +1,60 @@
-- Migration 052 — Nokia product images (direct URL injection)
--
-- Sources:
--   7220 IXR-D3L: documentation.nokia.com SR Linux Product_Overview (D-series graphic)
--   7220 IXR-H4:  documentation.nokia.com SR Linux Product_Overview (H2/H3 composite)
--   7250 IXR-10:  tempestns.com (Tempest Telecom Solutions reseller CDN)
--   7750 SR-1:    tempestns.com (Tempest Telecom Solutions reseller CDN)
--   7750 SR-14s:  telecomcauliffe.com (reseller CDN)
--   7750 SR-1e:   documentation.nokia.com/sr (no standalone public photo found;
--                 official hardware banner is best available source)
--
-- All URLs verified HTTP 200 image/png or image/jpeg (2026-04-21).
-- 7220 IXR-D3L — SR Linux 48×25G leaf switch
UPDATE switches
SET image_url = 'https://documentation.nokia.com/srlinux/SR_Linux_HTML_R21-11/Product_Overview/graphics/DL.png',
product_page_url = COALESCE(product_page_url, 'https://www.nokia.com/data-center-networks/data-center-fabric/7220-interconnect-router/'),
assets_scraped_at = NOW()
WHERE model = '7220 IXR-D3L'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nokia');
-- 7220 IXR-H4 — SR Linux 32×400G spine (H2/H3 composite is closest public image)
UPDATE switches
SET image_url = 'https://documentation.nokia.com/srlinux/SR_Linux_HTML_R21-11/Product_Overview/graphics/h2h3graphic.png',
product_page_url = COALESCE(product_page_url, 'https://www.nokia.com/data-center-networks/data-center-fabric/7220-interconnect-router/'),
assets_scraped_at = NOW()
WHERE model = '7220 IXR-H4'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nokia');
-- 7250 IXR-10 — Spine/IXP router (model-specific reseller image)
UPDATE switches
SET image_url = 'https://www.tempestns.com/wp-content/uploads/2020/05/nokia-7250_IXR-10.jpg',
product_page_url = COALESCE(product_page_url, 'https://documentation.nokia.com/ixr/7250-IXR/index.html'),
assets_scraped_at = NOW()
WHERE model = '7250 IXR-10'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nokia');
-- 7750 SR-1 — Metro edge service router
UPDATE switches
SET image_url = 'https://www.tempestns.com/wp-content/uploads/2020/05/nokia-7750SR-1.jpg',
product_page_url = COALESCE(product_page_url, 'https://www.nokia.com/networks/products/7750-service-router/'),
assets_scraped_at = NOW()
WHERE model = '7750 SR-1'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nokia');
-- 7750 SR-14s — Modular core SP router
UPDATE switches
SET image_url = 'https://telecomcauliffe.com/wp-content/uploads/2025/02/Nokia_7750_SR_14s_TM.png',
product_page_url = COALESCE(product_page_url, 'https://www.nokia.com/networks/products/7750-service-router/'),
assets_scraped_at = NOW()
WHERE model = '7750 SR-14s'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nokia');
-- 7750 SR-1e — Compact SP router (no standalone public image; official docs banner used)
UPDATE switches
SET image_url = 'https://documentation.nokia.com/sr/7750-SR/resources/hardwareBanner.png',
product_page_url = COALESCE(product_page_url, 'https://www.nokia.com/networks/products/7750-service-router/'),
assets_scraped_at = NOW()
WHERE model = '7750 SR-1e'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'nokia');
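-- Optional audit query (an illustrative sketch, not part of migration 052).
-- Several images above come from reseller sites rather than Nokia itself. This
-- lists Nokia rows whose image is hosted outside documentation.nokia.com so they
-- can be revisited if an official photo becomes available. Assuming no other
-- Nokia rows carry images, it should return the 7250 IXR-10, 7750 SR-1 and
-- 7750 SR-14s.
SELECT s.model, s.image_url
FROM switches s
JOIN vendors v ON v.id = s.vendor_id
WHERE v.slug = 'nokia'
  AND s.image_url IS NOT NULL
  AND s.image_url NOT LIKE 'https://documentation.nokia.com/%';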

@ -0,0 +1,31 @@
-- Migration 053 — F5 Networks BIG-IP product images (direct URL injection)
--
-- Sources:
--   i5800, i10800: wtit.com (IT reseller CDN; model-specific PNG filenames)
--   i15800:        cdn.blueally.com (BlueAlly CDN; "i15000-series" composite)
--
-- All URLs verified HTTP 200 image/png (2026-04-21).
-- BIG-IP i5800 — 4×10G SFP+ application delivery controller
UPDATE switches
SET image_url = 'https://wtit.com/wp-content/uploads/2016/11/big-ip-i5800.png',
product_page_url = COALESCE(product_page_url, 'https://www.f5.com/products/big-ip-services'),
assets_scraped_at = NOW()
WHERE model = 'BIG-IP i5800'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'f5-networks');
-- BIG-IP i10800 — 4×10G SFP+ higher-throughput ADC
UPDATE switches
SET image_url = 'https://wtit.com/wp-content/uploads/2016/11/big-ip-i10800.png',
product_page_url = COALESCE(product_page_url, 'https://www.f5.com/products/big-ip-services'),
assets_scraped_at = NOW()
WHERE model = 'BIG-IP i10800'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'f5-networks');
-- BIG-IP i15800 — 4×40G QSFP+ top-of-range ADC
UPDATE switches
SET image_url = 'https://cdn.blueally.com/appdeliveryworks/images/hardware/big-ip-iseries/bigip-i15000-series.png',
product_page_url = COALESCE(product_page_url, 'https://www.f5.com/products/big-ip-services'),
assets_scraped_at = NOW()
WHERE model = 'BIG-IP i15800'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'f5-networks');

@ -0,0 +1,74 @@
-- Migration 054 — Delta Networks and Siemens SCALANCE product images (direct URL injection)
--
-- Delta Networks: hardwarenation.com reseller CDN (model-specific JPEGs).
--   AG9064v2: static-data2.manualslib.com (only clear public photo found).
--
-- Siemens SCALANCE: images.sw.cdn.siemens.com (official Siemens DISW CDN).
--   og:image pattern: scalance-x-{series}-product-og-1200x630.jpg
--   X-200 → XC216-4C, X-300 → XR324-12M, X-500 → XM416-4C + XR528-6M.
--
-- All URLs verified HTTP 200 image/jpeg or image/png (2026-04-21).
-- ── Delta Networks ─────────────────────────────────────────────────────────────
-- AG5648 — 48×25G whitebox SONiC switch
UPDATE switches
SET image_url = 'https://hardwarenation.com/wp-content/uploads/2021/06/AG5648V1-1.jpg',
assets_scraped_at = NOW()
WHERE model = 'AG5648'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'delta-networks');
-- AG9032v2A — 32×100G open networking switch
UPDATE switches
SET image_url = 'https://hardwarenation.com/wp-content/uploads/2021/06/AG9032V2.jpg',
assets_scraped_at = NOW()
WHERE model = 'AG9032v2A'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'delta-networks');
-- AG9064v2 — 64×100G open networking switch
UPDATE switches
SET image_url = 'https://static-data2.manualslib.com/product-images/320/1426327/delta-ag9064-switch.jpg',
assets_scraped_at = NOW()
WHERE model = 'AG9064v2'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'delta-networks');
-- AGC7648A — 48×25G + 6×100G switch
UPDATE switches
SET image_url = 'https://hardwarenation.com/wp-content/uploads/2021/06/AGC7648A-Front04.jpg',
assets_scraped_at = NOW()
WHERE model = 'AGC7648A'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'delta-networks');
-- ── Siemens SCALANCE ───────────────────────────────────────────────────────────
-- SCALANCE XC216-4C — X-200 series compact industrial switch
UPDATE switches
SET image_url = 'https://images.sw.cdn.siemens.com/siemens-disw-assets/public/1eXzT56kfEjsXjYV3SPjgV/en-US/scalance-x-200-product-og-1200x630.jpg',
product_page_url = COALESCE(product_page_url, 'https://www.siemens.com/global/en/products/automation/industrial-communication/industrial-ethernet/scalance-x.html'),
assets_scraped_at = NOW()
WHERE model = 'SCALANCE XC216-4C'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'siemens');
-- SCALANCE XM416-4C — X-500 series modular managed switch
UPDATE switches
SET image_url = 'https://images.sw.cdn.siemens.com/siemens-disw-assets/public/25K1qjDD4NJhsMZDGrxt4P/en-US/scalance-x-500-product-og-1200x630.jpg',
product_page_url = COALESCE(product_page_url, 'https://www.siemens.com/global/en/products/automation/industrial-communication/industrial-ethernet/scalance-x.html'),
assets_scraped_at = NOW()
WHERE model = 'SCALANCE XM416-4C'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'siemens');
-- SCALANCE XR324-12M — X-300 series rackmount managed switch
UPDATE switches
SET image_url = 'https://images.sw.cdn.siemens.com/siemens-disw-assets/public/7pf0XdeuiBDdwgEg3Iq8al/en-US/scalance-x-300-product-og-1200x630.jpg',
product_page_url = COALESCE(product_page_url, 'https://www.siemens.com/global/en/products/automation/industrial-communication/industrial-ethernet/scalance-x.html'),
assets_scraped_at = NOW()
WHERE model = 'SCALANCE XR324-12M'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'siemens');
-- SCALANCE XR528-6M — X-500 series high-port rackmount switch (same series image as XM416-4C)
UPDATE switches
SET image_url = 'https://images.sw.cdn.siemens.com/siemens-disw-assets/public/25K1qjDD4NJhsMZDGrxt4P/en-US/scalance-x-500-product-og-1200x630.jpg',
product_page_url = COALESCE(product_page_url, 'https://www.siemens.com/global/en/products/automation/industrial-communication/industrial-ethernet/scalance-x.html'),
assets_scraped_at = NOW()
WHERE model = 'SCALANCE XR528-6M'
AND vendor_id = (SELECT id FROM vendors WHERE slug = 'siemens');
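-- Optional audit query (an illustrative sketch, not part of migration 054).
-- The Siemens images are per-series og:image files, so more than one model can
-- point at the same URL (XM416-4C and XR528-6M share the X-500 image above).
-- This reports each Siemens image_url that is used by more than one row.
SELECT s.image_url, COUNT(*) AS model_count
FROM switches s
JOIN vendors v ON v.id = s.vendor_id
WHERE v.slug = 'siemens'
  AND s.image_url IS NOT NULL
GROUP BY s.image_url
HAVING COUNT(*) > 1;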