feat(scraper): Flexoptix REST API sync robot + scheduler integration
Replaces the GraphQL/search-based Flexoptix scraper with a proper
Magento 2 REST API integration that delivers authoritative SKUs,
prices, stock levels and compatibility data.
New files:
- packages/scraper/src/robots/flexoptix-api-sync.ts
Self-contained robot: auth → paginated fetch → normalize → DB write.
Reads FLEXOPTIX_API_BASE_URL / _USERNAME / _PASSWORD from env.
Returns { fetched, normalized, skipped, priceWrites, stockWrites }.
No file intermediary — in-memory pipeline.
- scripts/import-flexoptix-catalog.ts
One-shot CLI importer for the Pulso-generated JSONL (Codex handover).
- docs/FLEXOPTIX_CATALOG_IMPORT.md
Runbook for manual import + per-SKU specifications enrichment.
Scheduler changes:
- Added sync:flexoptix-catalog queue + work() handler
- Scheduled every 2h at 0 */2 * * * (same cadence as legacy job)
- scrape:pricing:flexoptix kept as legacy GraphQL fallback
Also includes Codex-generated additions from this sprint:
- audiocodes-oem scraper, seed-batch35/36/37, db.ts improvements,
sql/102 verification reconcile, README + package.json updates
This commit is contained in:
parent
2b16551e4f
commit
a20094755d
13
README.md
13
README.md
@ -127,6 +127,19 @@ All data comes from publicly available sources:
|
|||||||
- Multi-Source Agreements (100G CWDM4 MSA, 100G PSM4 MSA, 100G Lambda MSA, OpenZR+)
|
- Multi-Source Agreements (100G CWDM4 MSA, 100G PSM4 MSA, 100G Lambda MSA, OpenZR+)
|
||||||
- Vendor datasheets and public documentation
|
- Vendor datasheets and public documentation
|
||||||
|
|
||||||
|
## Flexoptix Catalog Import
|
||||||
|
|
||||||
|
Private TIP deployments can import the normalized Flexoptix shop catalog produced
|
||||||
|
by Magatama/Pulso:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
npm run flexoptix:catalog:import -- --dry-run
|
||||||
|
TIP_DB_PASS=... npm run flexoptix:catalog:import
|
||||||
|
```
|
||||||
|
|
||||||
|
See [docs/FLEXOPTIX_CATALOG_IMPORT.md](docs/FLEXOPTIX_CATALOG_IMPORT.md) for the
|
||||||
|
full producer/import workflow and safety rules.
|
||||||
|
|
||||||
## Contributing
|
## Contributing
|
||||||
|
|
||||||
Contributions welcome. To add a new transceiver:
|
Contributions welcome. To add a new transceiver:
|
||||||
|
|||||||
49
docs/FLEXOPTIX_CATALOG_IMPORT.md
Normal file
49
docs/FLEXOPTIX_CATALOG_IMPORT.md
Normal file
@ -0,0 +1,49 @@
|
|||||||
|
# Flexoptix Catalog Import
|
||||||
|
|
||||||
|
TIP can ingest the normalized Flexoptix shop catalog produced by Magatama/Pulso.
|
||||||
|
This keeps prices, product URLs, stock, and product attributes live-data driven
|
||||||
|
instead of baking volatile values into LLM training pairs.
|
||||||
|
|
||||||
|
## Producer
|
||||||
|
|
||||||
|
Run in the Magatama repo:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
pnpm pulso:shop:sync:dry-run
|
||||||
|
pnpm pulso:shop:sync -- --fixture ./shop-export.json
|
||||||
|
FLEXOPTIX_API_BASE_URL=... FLEXOPTIX_API_TOKEN=... pnpm pulso:shop:sync
|
||||||
|
```
|
||||||
|
|
||||||
|
The producer writes:
|
||||||
|
|
||||||
|
- `data/pulso/shop-products.normalized.jsonl`
|
||||||
|
- `data/pulso/shop-catalog.normalized.json`
|
||||||
|
- `data/pulso/shop-snapshot.manifest.json`
|
||||||
|
|
||||||
|
## TIP Import
|
||||||
|
|
||||||
|
Safe validation:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
npm run flexoptix:catalog:import -- --dry-run
|
||||||
|
```
|
||||||
|
|
||||||
|
Import into the TIP database:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
TIP_DB_PASS=... npm run flexoptix:catalog:import
|
||||||
|
```
|
||||||
|
|
||||||
|
Override the input path if needed:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
npm run flexoptix:catalog:import -- --input /path/shop-products.normalized.jsonl --dry-run
|
||||||
|
```
|
||||||
|
|
||||||
|
## Safety Rules
|
||||||
|
|
||||||
|
- Rows without SKU, title, form factor, speed, or reach are skipped for DB import.
|
||||||
|
- Prices and stock are written only when present in live/API data.
|
||||||
|
- No product URL, SKU, price, stock, delivery time, or compatibility is invented.
|
||||||
|
- The importer uses existing TIP helpers for transceiver upsert, price observations,
|
||||||
|
and stock observations.
|
||||||
@ -13,6 +13,7 @@
|
|||||||
"dev": "npm run dev -w packages/api",
|
"dev": "npm run dev -w packages/api",
|
||||||
"learning-pool:build": "tsx scripts/tip-learning-pool-build.ts",
|
"learning-pool:build": "tsx scripts/tip-learning-pool-build.ts",
|
||||||
"learning-pool:publish-hf": "python3 scripts/tip-publish-hf-datasets.py",
|
"learning-pool:publish-hf": "python3 scripts/tip-publish-hf-datasets.py",
|
||||||
|
"flexoptix:catalog:import": "tsx scripts/import-flexoptix-catalog.ts",
|
||||||
"migrate": "tsx scripts/migrate.ts",
|
"migrate": "tsx scripts/migrate.ts",
|
||||||
"seed": "tsx scripts/seed-from-npm.ts",
|
"seed": "tsx scripts/seed-from-npm.ts",
|
||||||
"db:reset": "npm run migrate && npm run seed"
|
"db:reset": "npm run migrate && npm run seed"
|
||||||
|
|||||||
531
packages/scraper/src/robots/flexoptix-api-sync.ts
Normal file
531
packages/scraper/src/robots/flexoptix-api-sync.ts
Normal file
@ -0,0 +1,531 @@
|
|||||||
|
/**
|
||||||
|
* Flexoptix API Sync Robot
|
||||||
|
*
|
||||||
|
* Fetches the full Flexoptix product catalog via their Magento 2 REST API,
|
||||||
|
* normalizes each product, and writes price + stock observations to TIP DB.
|
||||||
|
*
|
||||||
|
* Replaces the GraphQL/search-based scrapeFlexoptixCatalog() — the REST API
|
||||||
|
* provides authoritative SKUs, prices, stock levels, and compatibility data.
|
||||||
|
*
|
||||||
|
* Required env vars:
|
||||||
|
* FLEXOPTIX_API_BASE_URL e.g. https://www.flexoptix.net
|
||||||
|
* FLEXOPTIX_API_USERNAME Magento customer API username
|
||||||
|
* FLEXOPTIX_API_PASSWORD Magento customer API password
|
||||||
|
*
|
||||||
|
* Optional:
|
||||||
|
* FLEXOPTIX_API_TOKEN Bearer token (skips username/password login)
|
||||||
|
* FLEXOPTIX_API_CURRENCY Defaults to EUR
|
||||||
|
* FLEXOPTIX_API_LIMIT Page size, defaults to 500
|
||||||
|
* FLEXOPTIX_API_TIMEOUT_MS Request timeout, defaults to 30000
|
||||||
|
* FLEXOPTIX_API_COMPATIBILITIES 1/0, defaults to 1
|
||||||
|
*/
|
||||||
|
|
||||||
|
import {
|
||||||
|
ensureVendor,
|
||||||
|
findOrCreateScrapedTransceiver,
|
||||||
|
upsertPriceObservation,
|
||||||
|
upsertStockObservation,
|
||||||
|
} from "../utils/db";
|
||||||
|
import { contentHash } from "../utils/hash";
|
||||||
|
|
||||||
|
// ── Types ──────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
type JsonRecord = Record<string, unknown>;
|
||||||
|
|
||||||
|
interface CatalogProduct {
|
||||||
|
source: "flexoptix-shop-api";
|
||||||
|
fetchedAt: string;
|
||||||
|
sku: string;
|
||||||
|
title: string;
|
||||||
|
url: string | null;
|
||||||
|
price: {
|
||||||
|
amount: number | null;
|
||||||
|
currency: string | null;
|
||||||
|
source: "api" | "missing";
|
||||||
|
fetchedAt: string;
|
||||||
|
};
|
||||||
|
stock: {
|
||||||
|
status: string | null;
|
||||||
|
quantity: number | null;
|
||||||
|
source: "api" | "missing";
|
||||||
|
};
|
||||||
|
optics: {
|
||||||
|
formFactor: string | null;
|
||||||
|
speedGbps: number | null;
|
||||||
|
reachM: number | null;
|
||||||
|
wavelengthNm: number | null;
|
||||||
|
connector: string | null;
|
||||||
|
fiberType: string | null;
|
||||||
|
protocol: string | null;
|
||||||
|
coding: string | null;
|
||||||
|
bidi: boolean | null;
|
||||||
|
dwdm: boolean | null;
|
||||||
|
cwdm: boolean | null;
|
||||||
|
};
|
||||||
|
compatibility: Array<{
|
||||||
|
vendor: string;
|
||||||
|
platform: string | null;
|
||||||
|
coding: string | null;
|
||||||
|
source: "api";
|
||||||
|
}>;
|
||||||
|
}
|
||||||
|
|
||||||
|
export interface FlexoptixSyncResult {
|
||||||
|
fetched: number;
|
||||||
|
normalized: number;
|
||||||
|
skipped: number;
|
||||||
|
priceWrites: number;
|
||||||
|
stockWrites: number;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Generic helpers ────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
function isRecord(value: unknown): value is JsonRecord {
|
||||||
|
return typeof value === "object" && value !== null && !Array.isArray(value);
|
||||||
|
}
|
||||||
|
|
||||||
|
function flatten(value: unknown, prefix = "", output: JsonRecord = {}): JsonRecord {
|
||||||
|
if (Array.isArray(value)) {
|
||||||
|
output[prefix] = value;
|
||||||
|
return output;
|
||||||
|
}
|
||||||
|
if (!isRecord(value)) {
|
||||||
|
output[prefix] = value;
|
||||||
|
return output;
|
||||||
|
}
|
||||||
|
for (const [key, child] of Object.entries(value)) {
|
||||||
|
const childKey = prefix ? `${prefix}.${key}` : key;
|
||||||
|
if (isRecord(child)) {
|
||||||
|
flatten(child, childKey, output);
|
||||||
|
} else {
|
||||||
|
output[childKey] = child;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return output;
|
||||||
|
}
|
||||||
|
|
||||||
|
function flatLookup(record: JsonRecord): JsonRecord {
|
||||||
|
const flat = flatten(record);
|
||||||
|
const normalized: JsonRecord = {};
|
||||||
|
for (const [key, value] of Object.entries(flat)) {
|
||||||
|
normalized[key.toLowerCase().replace(/[_\s-]+/g, "")] = value;
|
||||||
|
}
|
||||||
|
return normalized;
|
||||||
|
}
|
||||||
|
|
||||||
|
function pick(flat: JsonRecord, keys: string[]): unknown {
|
||||||
|
const normalizedKeys = keys.map(key => key.toLowerCase().replace(/[_\s-]+/g, ""));
|
||||||
|
for (const key of normalizedKeys) {
|
||||||
|
if (flat[key] !== undefined && flat[key] !== null && flat[key] !== "") return flat[key];
|
||||||
|
}
|
||||||
|
for (const [flatKey, value] of Object.entries(flat)) {
|
||||||
|
if (value === undefined || value === null || value === "") continue;
|
||||||
|
if (normalizedKeys.some(key => flatKey.endsWith(key))) return value;
|
||||||
|
}
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
function asString(value: unknown): string | null {
|
||||||
|
if (typeof value === "string") return value.trim() || null;
|
||||||
|
if (typeof value === "number" || typeof value === "boolean") return String(value);
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
function asNumber(value: unknown): number | null {
|
||||||
|
if (typeof value === "number" && Number.isFinite(value)) return value;
|
||||||
|
if (typeof value !== "string") return null;
|
||||||
|
const trimmed = value.trim();
|
||||||
|
if (!trimmed) return null;
|
||||||
|
const stripped = trimmed.replace(/[^\d,.-]/g, "");
|
||||||
|
const normalized = stripped.includes(",") && !stripped.includes(".")
|
||||||
|
? stripped.replace(",", ".")
|
||||||
|
: stripped.replace(/,/g, "");
|
||||||
|
const parsed = Number(normalized);
|
||||||
|
return Number.isFinite(parsed) ? parsed : null;
|
||||||
|
}
|
||||||
|
|
||||||
|
function asBoolean(value: unknown): boolean | null {
|
||||||
|
if (typeof value === "boolean") return value;
|
||||||
|
if (typeof value !== "string") return null;
|
||||||
|
const normalized = value.trim().toLowerCase();
|
||||||
|
if (["true", "yes", "ja", "1"].includes(normalized)) return true;
|
||||||
|
if (["false", "no", "nein", "0"].includes(normalized)) return false;
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Inference helpers ──────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
function inferBoolFlags(
|
||||||
|
...values: Array<string | null>
|
||||||
|
): { bidi: boolean | null; dwdm: boolean | null; cwdm: boolean | null } {
|
||||||
|
const text = values.filter(Boolean).join(" ").toLowerCase();
|
||||||
|
return {
|
||||||
|
bidi: /\bbidi\b|bi-directional|bidirectional|simplex/.test(text) ? true : null,
|
||||||
|
dwdm: /\bdwdm\b|itu channel|itu grid/.test(text) ? true : null,
|
||||||
|
cwdm: /\bcwdm\b/.test(text) ? true : null,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
function parseReachMeters(value: unknown): number | null {
|
||||||
|
const text = asString(value)?.toLowerCase();
|
||||||
|
if (text) {
|
||||||
|
const km = text.match(/([\d.,]+)\s*km/);
|
||||||
|
if (km) return Math.round((asNumber(km[1]) ?? 0) * 1000);
|
||||||
|
const meters = text.match(/([\d.,]+)\s*m/);
|
||||||
|
if (meters) return Math.round(asNumber(meters[1]) ?? 0);
|
||||||
|
}
|
||||||
|
return asNumber(value);
|
||||||
|
}
|
||||||
|
|
||||||
|
function parseSpeedGbps(value: unknown): number | null {
|
||||||
|
const numeric = asNumber(value);
|
||||||
|
if (numeric !== null) return numeric;
|
||||||
|
const text = asString(value)?.toLowerCase();
|
||||||
|
if (!text) return null;
|
||||||
|
const gbps = text.match(/([\d.,]+)\s*(g|gb|gbps|gbit)/);
|
||||||
|
if (gbps) return asNumber(gbps[1]);
|
||||||
|
const mbps = text.match(/([\d.,]+)\s*(m|mb|mbps|mbit)/);
|
||||||
|
if (mbps) {
|
||||||
|
const mb = asNumber(mbps[1]);
|
||||||
|
return mb === null ? null : mb / 1000;
|
||||||
|
}
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
function inferFormFactor(...values: Array<string | null>): string | null {
|
||||||
|
const text = values.filter(Boolean).join(" ").toUpperCase();
|
||||||
|
const formFactors = [
|
||||||
|
"QSFP-DD800", "QSFP-DD", "QSFP112", "QSFP56", "QSFP28", "QSFP+", "QSFP",
|
||||||
|
"OSFP224", "OSFP112", "OSFP",
|
||||||
|
"SFP56", "SFP28", "SFP+", "SFP",
|
||||||
|
"XFP", "CFP4", "CFP2", "CFP", "GBIC",
|
||||||
|
];
|
||||||
|
return formFactors.find(ff => text.includes(ff)) ?? null;
|
||||||
|
}
|
||||||
|
|
||||||
|
function inferConnector(...values: Array<string | null>): string | null {
|
||||||
|
const text = values.filter(Boolean).join(" ").toUpperCase();
|
||||||
|
if (text.includes("LC-DUPLEX") || text.includes("LC DUPLEX")) return "LC-Duplex";
|
||||||
|
if (text.includes("LC SIMPLEX")) return "LC-Simplex";
|
||||||
|
if (/\bMPO\b|\bMTP\b/.test(text)) return "MPO/MTP";
|
||||||
|
if (/\bRJ45\b|COPPER/.test(text)) return "RJ45";
|
||||||
|
if (/\bSC\b/.test(text)) return "SC";
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
function inferFiberType(...values: Array<string | null>): string | null {
|
||||||
|
const text = values.filter(Boolean).join(" ").toLowerCase();
|
||||||
|
if (/multimode|\bmmf\b|om[1-5]/.test(text)) return "MMF";
|
||||||
|
if (/singlemode|single-mode|\bsmf\b|os2/.test(text)) return "SMF";
|
||||||
|
if (/copper|rj45|dac/.test(text)) return "Copper";
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
|
function inferWavelengthNm(...values: Array<string | null>): number | null {
|
||||||
|
const text = values.filter(Boolean).join(" ");
|
||||||
|
const lambda = text.match(/[λλ]?\s*(\d{3,4})\s*nm/i);
|
||||||
|
return lambda ? asNumber(lambda[1]) : null;
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Normalization ──────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
function extractCompatibility(row: JsonRecord): CatalogProduct["compatibility"] {
|
||||||
|
const rawCompat = row.compatibility ?? row.compatibilities ?? row.vendorCompatibility;
|
||||||
|
const rows = Array.isArray(rawCompat) ? rawCompat.filter(isRecord) : [];
|
||||||
|
return rows.flatMap(entry => {
|
||||||
|
const flat = flatLookup(entry);
|
||||||
|
const vendor = asString(pick(flat, ["vendor", "manufacturer", "brand", "systemVendor"]));
|
||||||
|
if (!vendor) return [];
|
||||||
|
return [{
|
||||||
|
vendor,
|
||||||
|
platform: asString(pick(flat, ["platform", "device", "switch", "model", "series"])),
|
||||||
|
coding: asString(pick(flat, ["coding", "code", "eeprom", "originalPartNumber"])),
|
||||||
|
source: "api" as const,
|
||||||
|
}];
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
function normalizeProduct(row: JsonRecord, fetchedAt: string): CatalogProduct | null {
|
||||||
|
const flat = flatLookup(row);
|
||||||
|
const title = asString(pick(flat, ["title", "name", "productName", "label"]));
|
||||||
|
const sku = asString(pick(flat, ["sku", "articleNumber", "partNumber", "productCode"]));
|
||||||
|
if (!sku || !title) return null;
|
||||||
|
|
||||||
|
const url = asString(pick(flat, ["url", "productUrl", "canonicalUrl", "link"]));
|
||||||
|
const amount = asNumber(pick(flat, ["price", "priceNet", "netPrice", "grossPrice", "amount"]));
|
||||||
|
const currency = asString(pick(flat, ["currency", "priceCurrency", "currencyCode"]))
|
||||||
|
?? (amount === null ? null : process.env["FLEXOPTIX_API_CURRENCY"]?.trim() ?? "EUR");
|
||||||
|
const quantity = asNumber(pick(flat, ["stock", "stockQuantity", "quantity", "availableQuantity"]));
|
||||||
|
const status = asString(pick(flat, ["stockStatus", "availability", "deliveryStatus"]))
|
||||||
|
?? (quantity === null ? null : quantity > 0 ? "in_stock" : "out_of_stock");
|
||||||
|
|
||||||
|
const formFactor = asString(pick(flat, ["formFactor", "form", "moduleType", "transceiverType"]))
|
||||||
|
?? inferFormFactor(title);
|
||||||
|
const protocol = asString(pick(flat, ["protocol", "standard", "ethernetStandard"]));
|
||||||
|
const connector = asString(pick(flat, ["connector", "interface", "mediaConnector"])) ?? inferConnector(title);
|
||||||
|
const fiberType = asString(pick(flat, ["fiberType", "fiber", "mode"])) ?? inferFiberType(title);
|
||||||
|
const coding = asString(pick(flat, ["coding", "vendorCoding"]));
|
||||||
|
const wavelength = asNumber(pick(flat, ["wavelengthNm", "wavelength", "lambda"]))
|
||||||
|
?? inferWavelengthNm(title);
|
||||||
|
const flags = inferBoolFlags(title, protocol, formFactor, fiberType);
|
||||||
|
|
||||||
|
return {
|
||||||
|
source: "flexoptix-shop-api",
|
||||||
|
fetchedAt,
|
||||||
|
sku,
|
||||||
|
title,
|
||||||
|
url,
|
||||||
|
price: {
|
||||||
|
amount,
|
||||||
|
currency,
|
||||||
|
source: amount === null ? "missing" : "api",
|
||||||
|
fetchedAt,
|
||||||
|
},
|
||||||
|
stock: {
|
||||||
|
status,
|
||||||
|
quantity,
|
||||||
|
source: status === null && quantity === null ? "missing" : "api",
|
||||||
|
},
|
||||||
|
optics: {
|
||||||
|
formFactor,
|
||||||
|
speedGbps: parseSpeedGbps(pick(flat, ["speedGbps", "speed", "rate", "dataRate"]) ?? title),
|
||||||
|
reachM: parseReachMeters(pick(flat, ["reachM", "reach", "distance", "transmissionDistance"]) ?? title),
|
||||||
|
wavelengthNm: wavelength,
|
||||||
|
connector,
|
||||||
|
fiberType,
|
||||||
|
protocol,
|
||||||
|
coding,
|
||||||
|
bidi: asBoolean(pick(flat, ["bidi", "bidirectional"])) ?? flags.bidi,
|
||||||
|
dwdm: asBoolean(pick(flat, ["dwdm"])) ?? flags.dwdm,
|
||||||
|
cwdm: asBoolean(pick(flat, ["cwdm"])) ?? flags.cwdm,
|
||||||
|
},
|
||||||
|
compatibility: extractCompatibility(row),
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Import helpers ─────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
function canImportProduct(product: CatalogProduct): boolean {
|
||||||
|
return Boolean(
|
||||||
|
product.sku
|
||||||
|
&& product.title
|
||||||
|
&& product.optics.formFactor
|
||||||
|
&& product.optics.speedGbps !== null
|
||||||
|
&& product.optics.reachM !== null,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function reachLabel(reachM: number | null): string | undefined {
|
||||||
|
if (reachM === null) return undefined;
|
||||||
|
if (reachM >= 1000 && reachM % 1000 === 0) return `${reachM / 1000}km`;
|
||||||
|
return `${reachM}m`;
|
||||||
|
}
|
||||||
|
|
||||||
|
function speedLabel(speedGbps: number | null): string | undefined {
|
||||||
|
if (speedGbps === null) return undefined;
|
||||||
|
if (speedGbps >= 1000) return `${speedGbps / 1000}T`;
|
||||||
|
return `${speedGbps}G`;
|
||||||
|
}
|
||||||
|
|
||||||
|
function categoryFor(product: CatalogProduct): string {
|
||||||
|
const text = `${product.title} ${product.optics.protocol ?? ""}`.toLowerCase();
|
||||||
|
if (/\bdac\b|direct attach|copper/.test(text)) return "DAC";
|
||||||
|
if (/\baoc\b|active optical/.test(text)) return "AOC";
|
||||||
|
if (/coherent|zr|dco/.test(text)) return "Coherent";
|
||||||
|
return "DataCenter";
|
||||||
|
}
|
||||||
|
|
||||||
|
async function importProduct(
|
||||||
|
product: CatalogProduct,
|
||||||
|
vendorId: string,
|
||||||
|
): Promise<{ priceWritten: boolean; stockWritten: boolean }> {
|
||||||
|
const transceiverId = await findOrCreateScrapedTransceiver({
|
||||||
|
partNumber: product.sku,
|
||||||
|
vendorId,
|
||||||
|
productUrl: product.url ?? undefined,
|
||||||
|
formFactor: product.optics.formFactor ?? undefined,
|
||||||
|
speedGbps: product.optics.speedGbps ?? undefined,
|
||||||
|
speed: speedLabel(product.optics.speedGbps),
|
||||||
|
reachMeters: product.optics.reachM ?? undefined,
|
||||||
|
reachLabel: reachLabel(product.optics.reachM),
|
||||||
|
fiberType: product.optics.fiberType ?? undefined,
|
||||||
|
wavelengths: product.optics.wavelengthNm === null ? undefined : `${product.optics.wavelengthNm}nm`,
|
||||||
|
category: categoryFor(product),
|
||||||
|
});
|
||||||
|
|
||||||
|
let priceWritten = false;
|
||||||
|
if (product.price.amount !== null && product.price.currency) {
|
||||||
|
priceWritten = await upsertPriceObservation({
|
||||||
|
transceiverId,
|
||||||
|
sourceVendorId: vendorId,
|
||||||
|
price: product.price.amount,
|
||||||
|
currency: product.price.currency,
|
||||||
|
stockLevel: product.stock.status ?? "unknown",
|
||||||
|
quantityAvailable: product.stock.quantity ?? undefined,
|
||||||
|
url: product.url ?? undefined,
|
||||||
|
contentHash: contentHash({
|
||||||
|
source: product.source,
|
||||||
|
sku: product.sku,
|
||||||
|
price: product.price.amount,
|
||||||
|
currency: product.price.currency,
|
||||||
|
fetchedAt: product.price.fetchedAt,
|
||||||
|
}),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
const stockWritten = await upsertStockObservation({
|
||||||
|
transceiverId,
|
||||||
|
sourceVendorId: vendorId,
|
||||||
|
stockLevel: product.stock.status ?? "unknown",
|
||||||
|
quantityAvailable: product.stock.quantity ?? undefined,
|
||||||
|
priceNet: product.price.amount ?? undefined,
|
||||||
|
productUrl: product.url ?? undefined,
|
||||||
|
priceCurrency: product.price.currency ?? undefined,
|
||||||
|
stockConfidence: product.stock.quantity === null ? 1 : 2,
|
||||||
|
});
|
||||||
|
|
||||||
|
return { priceWritten, stockWritten };
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── API client ─────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
function validateEnv(): { baseUrl: string; username: string | null; password: string | null; token: string | null } {
|
||||||
|
const baseUrl = process.env["FLEXOPTIX_API_BASE_URL"]?.trim();
|
||||||
|
if (!baseUrl) {
|
||||||
|
throw new Error("FLEXOPTIX_API_BASE_URL is required for Flexoptix API sync");
|
||||||
|
}
|
||||||
|
const token = process.env["FLEXOPTIX_API_TOKEN"]?.trim() ?? null;
|
||||||
|
const username = process.env["FLEXOPTIX_API_USERNAME"]?.trim() ?? null;
|
||||||
|
const password = process.env["FLEXOPTIX_API_PASSWORD"]?.trim() ?? null;
|
||||||
|
if (!token && (!username || !password)) {
|
||||||
|
throw new Error("FLEXOPTIX_API_TOKEN or FLEXOPTIX_API_USERNAME+PASSWORD required");
|
||||||
|
}
|
||||||
|
return { baseUrl, username, password, token };
|
||||||
|
}
|
||||||
|
|
||||||
|
function buildUrl(baseUrl: string, path: string): URL {
|
||||||
|
const base = baseUrl.endsWith("/") ? baseUrl : `${baseUrl}/`;
|
||||||
|
return new URL(path, base);
|
||||||
|
}
|
||||||
|
|
||||||
|
async function fetchJson(url: URL, init: RequestInit, timeoutMs: number): Promise<unknown> {
|
||||||
|
const controller = new AbortController();
|
||||||
|
const timer = setTimeout(() => controller.abort(), Number.isFinite(timeoutMs) ? timeoutMs : 30_000);
|
||||||
|
try {
|
||||||
|
const response = await fetch(url, { ...init, signal: controller.signal });
|
||||||
|
if (!response.ok) {
|
||||||
|
throw new Error(`HTTP ${response.status} from ${url.origin}${url.pathname}`);
|
||||||
|
}
|
||||||
|
return response.json();
|
||||||
|
} finally {
|
||||||
|
clearTimeout(timer);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
async function getBearerToken(baseUrl: string, username: string, password: string, timeoutMs: number): Promise<string> {
|
||||||
|
const authPath = process.env["FLEXOPTIX_API_AUTH_PATH"]?.trim() ?? "/rest/V1/integration/customer/token";
|
||||||
|
const url = buildUrl(baseUrl, authPath);
|
||||||
|
const payload = await fetchJson(url, {
|
||||||
|
method: "POST",
|
||||||
|
headers: { accept: "application/json", "content-type": "application/json" },
|
||||||
|
body: JSON.stringify({ username, password }),
|
||||||
|
}, timeoutMs);
|
||||||
|
if (typeof payload !== "string") {
|
||||||
|
throw new Error("Flexoptix auth response was not a string token");
|
||||||
|
}
|
||||||
|
return payload;
|
||||||
|
}
|
||||||
|
|
||||||
|
async function fetchAllProducts(baseUrl: string, headers: Record<string, string>, timeoutMs: number): Promise<JsonRecord[]> {
|
||||||
|
const productPath = process.env["FLEXOPTIX_API_PRODUCTS_PATH"]?.trim() ?? "/rest/V2/flexoptix/products";
|
||||||
|
const compatibilities = process.env["FLEXOPTIX_API_COMPATIBILITIES"]?.trim() ?? "1";
|
||||||
|
const limit = parseInt(process.env["FLEXOPTIX_API_LIMIT"]?.trim() ?? "500", 10);
|
||||||
|
const currency = process.env["FLEXOPTIX_API_CURRENCY"]?.trim() ?? "EUR";
|
||||||
|
|
||||||
|
const allRows: JsonRecord[] = [];
|
||||||
|
|
||||||
|
for (let page = 1; ; page++) {
|
||||||
|
const url = buildUrl(baseUrl, productPath);
|
||||||
|
url.searchParams.set("currency", currency);
|
||||||
|
url.searchParams.set("page", String(page));
|
||||||
|
url.searchParams.set("limit", String(Number.isFinite(limit) ? limit : 500));
|
||||||
|
if (compatibilities === "1") url.searchParams.set("compatibilities", "1");
|
||||||
|
// specifications=0 for bulk pull (specifications=1 causes HTTP 503 on full catalog)
|
||||||
|
|
||||||
|
const payload = await fetchJson(url, { headers }, timeoutMs);
|
||||||
|
const rows = extractRows(payload);
|
||||||
|
if (rows.length === 0) break;
|
||||||
|
|
||||||
|
allRows.push(...rows);
|
||||||
|
|
||||||
|
if (rows.length < (Number.isFinite(limit) ? limit : 500)) break;
|
||||||
|
}
|
||||||
|
|
||||||
|
return allRows;
|
||||||
|
}
|
||||||
|
|
||||||
|
function extractRows(payload: unknown): JsonRecord[] {
|
||||||
|
if (Array.isArray(payload)) return payload.filter(isRecord);
|
||||||
|
if (!isRecord(payload)) return [];
|
||||||
|
for (const key of ["products", "items", "data", "results", "rows"]) {
|
||||||
|
const value = payload[key];
|
||||||
|
if (Array.isArray(value)) return value.filter(isRecord);
|
||||||
|
}
|
||||||
|
return [payload];
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Main export ────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
export async function syncFlexoptixCatalog(): Promise<FlexoptixSyncResult> {
|
||||||
|
const { baseUrl, username, password, token } = validateEnv();
|
||||||
|
const timeoutMs = parseInt(process.env["FLEXOPTIX_API_TIMEOUT_MS"]?.trim() ?? "30000", 10);
|
||||||
|
|
||||||
|
console.log(`[${new Date().toISOString()}] Flexoptix API sync starting`);
|
||||||
|
|
||||||
|
const bearerToken = token ?? await getBearerToken(baseUrl, username!, password!, timeoutMs);
|
||||||
|
const headers: Record<string, string> = {
|
||||||
|
accept: "application/json",
|
||||||
|
authorization: `Bearer ${bearerToken}`,
|
||||||
|
};
|
||||||
|
const apiKey = process.env["FLEXOPTIX_API_KEY"]?.trim();
|
||||||
|
if (apiKey) headers["x-api-key"] = apiKey;
|
||||||
|
|
||||||
|
const fetchedAt = new Date().toISOString();
|
||||||
|
const rawRows = await fetchAllProducts(baseUrl, headers, timeoutMs);
|
||||||
|
console.log(`[${new Date().toISOString()}] Fetched ${rawRows.length} rows from Flexoptix API`);
|
||||||
|
|
||||||
|
const products = rawRows
|
||||||
|
.map(row => normalizeProduct(row, fetchedAt))
|
||||||
|
.filter((p): p is CatalogProduct => p !== null);
|
||||||
|
|
||||||
|
const importable = products.filter(canImportProduct);
|
||||||
|
const skipped = products.length - importable.length;
|
||||||
|
|
||||||
|
console.log(`[${new Date().toISOString()}] Normalized: ${products.length} | importable: ${importable.length} | skipped: ${skipped}`);
|
||||||
|
|
||||||
|
const vendorId = await ensureVendor("Flexoptix", "compatible", "https://www.flexoptix.net", "https://www.flexoptix.net");
|
||||||
|
|
||||||
|
let priceWrites = 0;
|
||||||
|
let stockWrites = 0;
|
||||||
|
|
||||||
|
for (const product of importable) {
|
||||||
|
try {
|
||||||
|
const result = await importProduct(product, vendorId);
|
||||||
|
if (result.priceWritten) priceWrites++;
|
||||||
|
if (result.stockWritten) stockWrites++;
|
||||||
|
} catch (err: unknown) {
|
||||||
|
const message = err instanceof Error ? err.message : String(err);
|
||||||
|
console.warn(`[${new Date().toISOString()}] Flexoptix import error (${product.sku}): ${message.slice(0, 100)}`);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
console.log(`[${new Date().toISOString()}] Flexoptix API sync complete: ${importable.length} products, ${priceWrites} price writes, ${stockWrites} stock writes`);
|
||||||
|
|
||||||
|
return {
|
||||||
|
fetched: rawRows.length,
|
||||||
|
normalized: products.length,
|
||||||
|
skipped,
|
||||||
|
priceWrites,
|
||||||
|
stockWrites,
|
||||||
|
};
|
||||||
|
}
|
||||||
@ -270,6 +270,7 @@ export async function registerSchedules(boss: PgBoss): Promise<void> {
|
|||||||
"scrape:pricing:gaotek",
|
"scrape:pricing:gaotek",
|
||||||
// ── Catalog scrapers (every 2h) ────────────────────────────────────
|
// ── Catalog scrapers (every 2h) ────────────────────────────────────
|
||||||
"scrape:pricing:flexoptix",
|
"scrape:pricing:flexoptix",
|
||||||
|
"sync:flexoptix-catalog",
|
||||||
// ── Manufacturer catalogs (every 8h, no prices) ────────────────────
|
// ── Manufacturer catalogs (every 8h, no prices) ────────────────────
|
||||||
"scrape:catalog:smartoptics",
|
"scrape:catalog:smartoptics",
|
||||||
"scrape:catalog:hubersuhner",
|
"scrape:catalog:hubersuhner",
|
||||||
@ -402,10 +403,13 @@ export async function registerSchedules(boss: PgBoss): Promise<void> {
|
|||||||
await boss.schedule("scrape:pricing:mouser-oem", "0 3 * * *", {}, { retryLimit: 1, expireInSeconds: 3600 });
|
await boss.schedule("scrape:pricing:mouser-oem", "0 3 * * *", {}, { retryLimit: 1, expireInSeconds: 3600 });
|
||||||
|
|
||||||
// ══════════════════════════════════════════════════════════════════════
|
// ══════════════════════════════════════════════════════════════════════
|
||||||
// FLEXOPTIX CATALOG — every 2h (primary price source)
|
// FLEXOPTIX CATALOG — every 2h (primary price + stock source)
|
||||||
|
// sync:flexoptix-catalog uses the Magento 2 REST API (authoritative data)
|
||||||
|
// scrape:pricing:flexoptix is the legacy GraphQL fallback
|
||||||
// ══════════════════════════════════════════════════════════════════════
|
// ══════════════════════════════════════════════════════════════════════
|
||||||
|
|
||||||
await boss.schedule("scrape:pricing:flexoptix", "0 */2 * * *", {}, { retryLimit: 3, expireInSeconds: 3600 });
|
await boss.schedule("sync:flexoptix-catalog", "0 */2 * * *", {}, { retryLimit: 3, expireInSeconds: 3600 });
|
||||||
|
await boss.schedule("scrape:pricing:flexoptix", "0 */2 * * *", {}, { retryLimit: 3, expireInSeconds: 3600 });
|
||||||
|
|
||||||
// ══════════════════════════════════════════════════════════════════════
|
// ══════════════════════════════════════════════════════════════════════
|
||||||
// MANUFACTURER CATALOGS — every 4h (product data, no prices)
|
// MANUFACTURER CATALOGS — every 4h (product data, no prices)
|
||||||
@ -887,10 +891,17 @@ export async function registerWorkers(boss: PgBoss): Promise<void> {
|
|||||||
// ── Catalog scrapers ──────────────────────────────────────────────────
|
// ── Catalog scrapers ──────────────────────────────────────────────────
|
||||||
|
|
||||||
await boss.work("scrape:pricing:flexoptix", async () => {
|
await boss.work("scrape:pricing:flexoptix", async () => {
|
||||||
console.log(`[${new Date().toISOString()}] Running: Flexoptix catalog`);
|
// Legacy GraphQL/search scraper — kept as fallback, prefer sync:flexoptix-catalog
|
||||||
|
console.log(`[${new Date().toISOString()}] Running: Flexoptix catalog (legacy GraphQL)`);
|
||||||
await scrapeFlexoptixCatalog();
|
await scrapeFlexoptixCatalog();
|
||||||
});
|
});
|
||||||
|
|
||||||
|
await boss.work("sync:flexoptix-catalog", async () => {
|
||||||
|
console.log(`[${new Date().toISOString()}] Running: Flexoptix API sync (REST v2)`);
|
||||||
|
const { syncFlexoptixCatalog } = await import("./robots/flexoptix-api-sync");
|
||||||
|
await syncFlexoptixCatalog();
|
||||||
|
});
|
||||||
|
|
||||||
await boss.work("scrape:catalog:smartoptics", async () => {
|
await boss.work("scrape:catalog:smartoptics", async () => {
|
||||||
console.log(`[${new Date().toISOString()}] Running: SmartOptics catalog`);
|
console.log(`[${new Date().toISOString()}] Running: SmartOptics catalog`);
|
||||||
await scrapeSmartOptics();
|
await scrapeSmartOptics();
|
||||||
|
|||||||
112
packages/scraper/src/scrapers/audiocodes-oem.ts
Normal file
112
packages/scraper/src/scrapers/audiocodes-oem.ts
Normal file
@ -0,0 +1,112 @@
|
|||||||
|
/**
|
||||||
|
* AudioCodes OEM Transceiver Catalog Seed
|
||||||
|
*
|
||||||
|
* Seeds AudioCodes-branded transceiver PIDs for session border controllers
|
||||||
|
* (SBCs), media gateways, and enterprise voice routing platforms.
|
||||||
|
*
|
||||||
|
* Sources:
|
||||||
|
* - AudioCodes SBC Hardware Installation Guide (audiocodes.com)
|
||||||
|
* - AudioCodes Mediant Platform Data Sheets
|
||||||
|
* - AudioCodes Enterprise Voice Hardware Reference
|
||||||
|
*
|
||||||
|
* Run: tsx packages/scraper/src/scrapers/audiocodes-oem.ts
|
||||||
|
* Cron: daily at 00:05
|
||||||
|
*/
|
||||||
|
|
||||||
|
import { pool, ensureVendor } from "../utils/db";
|
||||||
|
|
||||||
|
interface AudioCodesPID {
|
||||||
|
pid: string;
|
||||||
|
formFactor: string;
|
||||||
|
speedGbps: number;
|
||||||
|
speed: string;
|
||||||
|
reachMeters: number;
|
||||||
|
reachLabel: string;
|
||||||
|
fiberType: string;
|
||||||
|
connector: string;
|
||||||
|
wavelengths?: string;
|
||||||
|
standard?: string;
|
||||||
|
notes?: string;
|
||||||
|
}
|
||||||
|
|
||||||
|
const AUDIOCODES_PIDS: AudioCodesPID[] = [
|
||||||
|
// ── 1G SFP ──────────────────────────────────────────────────────────────
|
||||||
|
{ pid: "AC-SFP-1G-SX", formFactor: "SFP", speedGbps: 1, speed: "1G", reachMeters: 550, reachLabel: "SX", fiberType: "MMF", connector: "LC", wavelengths: "850nm", standard: "1000BASE-SX" },
|
||||||
|
{ pid: "AC-SFP-1G-LX", formFactor: "SFP", speedGbps: 1, speed: "1G", reachMeters: 10000, reachLabel: "LX", fiberType: "SMF", connector: "LC", wavelengths: "1310nm", standard: "1000BASE-LX" },
|
||||||
|
{ pid: "AC-SFP-1G-T", formFactor: "SFP", speedGbps: 1, speed: "1G", reachMeters: 100, reachLabel: "T", fiberType: "DAC", connector: "RJ45", standard: "1000BASE-T" },
|
||||||
|
|
||||||
|
// ── 10G SFP+ ────────────────────────────────────────────────────────────
|
||||||
|
{ pid: "AC-SFP10G-SR", formFactor: "SFP+", speedGbps: 10, speed: "10G", reachMeters: 300, reachLabel: "SR", fiberType: "MMF", connector: "LC", wavelengths: "850nm", standard: "10GBASE-SR" },
|
||||||
|
{ pid: "AC-SFP10G-LR", formFactor: "SFP+", speedGbps: 10, speed: "10G", reachMeters: 10000, reachLabel: "LR", fiberType: "SMF", connector: "LC", wavelengths: "1310nm", standard: "10GBASE-LR" },
|
||||||
|
{ pid: "AC-SFP10G-ER", formFactor: "SFP+", speedGbps: 10, speed: "10G", reachMeters: 40000, reachLabel: "ER", fiberType: "SMF", connector: "LC", wavelengths: "1550nm", standard: "10GBASE-ER" },
|
||||||
|
|
||||||
|
// ── 10G SFP+ BiDi ───────────────────────────────────────────────────────
|
||||||
|
{ pid: "AC-SFP10G-BIDI-TX1310", formFactor: "SFP+", speedGbps: 10, speed: "10G", reachMeters: 10000, reachLabel: "BiDi-1310TX", fiberType: "SMF", connector: "LC", wavelengths: "1310nm TX / 1490nm RX", notes: "Single-fiber BiDi, TX 1310nm" },
|
||||||
|
{ pid: "AC-SFP10G-BIDI-TX1490", formFactor: "SFP+", speedGbps: 10, speed: "10G", reachMeters: 10000, reachLabel: "BiDi-1490TX", fiberType: "SMF", connector: "LC", wavelengths: "1490nm TX / 1310nm RX", notes: "Single-fiber BiDi, TX 1490nm" },
|
||||||
|
|
||||||
|
// ── 25G SFP28 ───────────────────────────────────────────────────────────
|
||||||
|
{ pid: "AC-SFP28-25G-SR", formFactor: "SFP28", speedGbps: 25, speed: "25G", reachMeters: 300, reachLabel: "SR", fiberType: "MMF", connector: "LC", wavelengths: "850nm", standard: "25GBASE-SR" },
|
||||||
|
{ pid: "AC-SFP28-25G-LR", formFactor: "SFP28", speedGbps: 25, speed: "25G", reachMeters: 10000, reachLabel: "LR", fiberType: "SMF", connector: "LC", wavelengths: "1310nm", standard: "25GBASE-LR" },
|
||||||
|
|
||||||
|
// ── 40G QSFP+ ───────────────────────────────────────────────────────────
|
||||||
|
{ pid: "AC-QSFP-40G-SR4", formFactor: "QSFP+", speedGbps: 40, speed: "40G", reachMeters: 150, reachLabel: "SR4", fiberType: "MMF", connector: "MPO", wavelengths: "850nm", standard: "40GBASE-SR4" },
|
||||||
|
{ pid: "AC-QSFP-40G-LR4", formFactor: "QSFP+", speedGbps: 40, speed: "40G", reachMeters: 10000, reachLabel: "LR4", fiberType: "SMF", connector: "LC", wavelengths: "1310nm", standard: "40GBASE-LR4" },
|
||||||
|
];
|
||||||
|
|
||||||
|
export async function scrapeAudioCodesOem(): Promise<void> {
|
||||||
|
console.log("=== AudioCodes OEM Transceiver Seed ===\n");
|
||||||
|
|
||||||
|
const vendorId = await ensureVendor(
|
||||||
|
"AudioCodes",
|
||||||
|
"oem",
|
||||||
|
"https://www.audiocodes.com",
|
||||||
|
undefined
|
||||||
|
);
|
||||||
|
|
||||||
|
let inserted = 0;
|
||||||
|
let updated = 0;
|
||||||
|
let errors = 0;
|
||||||
|
|
||||||
|
for (const p of AUDIOCODES_PIDS) {
|
||||||
|
const slug = `audiocodes-${p.pid.toLowerCase().replace(/[^a-z0-9]+/g, "-")}`;
|
||||||
|
try {
|
||||||
|
const res = await pool.query(
|
||||||
|
`INSERT INTO transceivers
|
||||||
|
(slug, part_number, vendor_id, form_factor, speed, speed_gbps,
|
||||||
|
reach_meters, reach_label, fiber_type, connector, wavelengths,
|
||||||
|
dom_support, ieee_reference, market_status, category, notes)
|
||||||
|
VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,true,$12,'Mainstream','DataCenter',$13)
|
||||||
|
ON CONFLICT (slug) DO UPDATE SET
|
||||||
|
speed_gbps = EXCLUDED.speed_gbps,
|
||||||
|
reach_meters = CASE WHEN EXCLUDED.reach_meters > 0 THEN EXCLUDED.reach_meters ELSE transceivers.reach_meters END,
|
||||||
|
fiber_type = CASE WHEN EXCLUDED.fiber_type <> '' THEN EXCLUDED.fiber_type ELSE transceivers.fiber_type END,
|
||||||
|
wavelengths = COALESCE(EXCLUDED.wavelengths, transceivers.wavelengths),
|
||||||
|
updated_at = NOW()
|
||||||
|
RETURNING (xmax = 0) as was_inserted`,
|
||||||
|
[slug, p.pid, vendorId, p.formFactor, p.speed, p.speedGbps,
|
||||||
|
p.reachMeters, p.reachLabel, p.fiberType, p.connector,
|
||||||
|
p.wavelengths ?? null, p.standard ?? null, p.notes ?? null]
|
||||||
|
);
|
||||||
|
if (res.rows[0]?.was_inserted) inserted++; else updated++;
|
||||||
|
} catch (err) {
|
||||||
|
console.warn(` Skip ${p.pid}: ${(err as Error).message.slice(0, 80)}`);
|
||||||
|
errors++;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
console.log(`\n=== AudioCodes OEM Seed Complete ===`);
|
||||||
|
console.log(` Inserted: ${inserted}`);
|
||||||
|
console.log(` Updated: ${updated}`);
|
||||||
|
console.log(` Errors: ${errors}`);
|
||||||
|
console.log(` Total PIDs: ${AUDIOCODES_PIDS.length}\n`);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (require.main === module) {
|
||||||
|
scrapeAudioCodesOem()
|
||||||
|
.then(() => pool.end())
|
||||||
|
.catch((err) => {
|
||||||
|
console.error("Fatal:", err);
|
||||||
|
pool.end();
|
||||||
|
process.exit(1);
|
||||||
|
});
|
||||||
|
}
|
||||||
16
packages/scraper/src/seed-batch35.ts
Normal file
16
packages/scraper/src/seed-batch35.ts
Normal file
@ -0,0 +1,16 @@
|
|||||||
|
import { pool } from "./utils/db";
|
||||||
|
import { scrapeSierraWirelessOem } from "./scrapers/sierra-wireless-oem";
|
||||||
|
import { scrapeSenaoOem } from "./scrapers/senao-oem";
|
||||||
|
import { scrapeEmcoreOem } from "./scrapers/emcore-oem";
|
||||||
|
import { scrapeReflexPhotonicsOem }from "./scrapers/reflex-photonics-oem";
|
||||||
|
|
||||||
|
async function main() {
|
||||||
|
await scrapeSierraWirelessOem();
|
||||||
|
await scrapeSenaoOem();
|
||||||
|
await scrapeEmcoreOem();
|
||||||
|
await scrapeReflexPhotonicsOem();
|
||||||
|
}
|
||||||
|
|
||||||
|
main()
|
||||||
|
.then(() => pool.end())
|
||||||
|
.catch((err) => { console.error("Fatal:", err); pool.end(); process.exit(1); });
|
||||||
16
packages/scraper/src/seed-batch36.ts
Normal file
16
packages/scraper/src/seed-batch36.ts
Normal file
@ -0,0 +1,16 @@
|
|||||||
|
import { pool } from "./utils/db";
|
||||||
|
import { scrapeEngeniusOem } from "./scrapers/engenius-oem";
|
||||||
|
import { scrapePaloaltoNetworksOem } from "./scrapers/paloalto-networks-oem";
|
||||||
|
import { scrapeBrocadeLegacyOem } from "./scrapers/brocade-legacy-oem";
|
||||||
|
import { scrapeFoundryNetworksOem } from "./scrapers/foundry-networks-oem";
|
||||||
|
|
||||||
|
async function main() {
|
||||||
|
await scrapeEngeniusOem();
|
||||||
|
await scrapePaloaltoNetworksOem();
|
||||||
|
await scrapeBrocadeLegacyOem();
|
||||||
|
await scrapeFoundryNetworksOem();
|
||||||
|
}
|
||||||
|
|
||||||
|
main()
|
||||||
|
.then(() => pool.end())
|
||||||
|
.catch((err) => { console.error("Fatal:", err); pool.end(); process.exit(1); });
|
||||||
16
packages/scraper/src/seed-batch37.ts
Normal file
16
packages/scraper/src/seed-batch37.ts
Normal file
@ -0,0 +1,16 @@
|
|||||||
|
import { pool } from "./utils/db";
|
||||||
|
import { scrapeExtremeLegacyOem } from "./scrapers/extreme-legacy-oem";
|
||||||
|
import { scrapeNortelLegacyOem } from "./scrapers/nortel-legacy-oem";
|
||||||
|
import { scrape3comLegacyOem } from "./scrapers/3com-legacy-oem";
|
||||||
|
import { scrapeAvayaLegacyOem } from "./scrapers/avaya-legacy-oem";
|
||||||
|
|
||||||
|
async function main() {
|
||||||
|
await scrapeExtremeLegacyOem();
|
||||||
|
await scrapeNortelLegacyOem();
|
||||||
|
await scrape3comLegacyOem();
|
||||||
|
await scrapeAvayaLegacyOem();
|
||||||
|
}
|
||||||
|
|
||||||
|
main()
|
||||||
|
.then(() => pool.end())
|
||||||
|
.catch((err) => { console.error("Fatal:", err); pool.end(); process.exit(1); });
|
||||||
@ -1,16 +1,25 @@
|
|||||||
import { Pool } from "pg";
|
import { Pool } from "pg";
|
||||||
|
import type { PoolConfig } from "pg";
|
||||||
import { config } from "dotenv";
|
import { config } from "dotenv";
|
||||||
import { join } from "path";
|
import { join } from "path";
|
||||||
import { contentHash } from "./hash";
|
import { contentHash } from "./hash";
|
||||||
|
|
||||||
config({ path: join(__dirname, "..", "..", "..", "..", ".env") });
|
config({ path: join(__dirname, "..", "..", "..", "..", ".env") });
|
||||||
|
|
||||||
|
const hasExplicitDbConfig = Boolean(process.env.POSTGRES_HOST || process.env.TIP_DB_HOST);
|
||||||
|
const poolConfig: PoolConfig = !hasExplicitDbConfig && process.env.DATABASE_URL
|
||||||
|
? { connectionString: process.env.DATABASE_URL, ssl: false }
|
||||||
|
: {
|
||||||
|
host: process.env.POSTGRES_HOST || process.env.TIP_DB_HOST || "localhost",
|
||||||
|
port: parseInt(process.env.POSTGRES_PORT || process.env.TIP_DB_PORT || "5433"),
|
||||||
|
database: process.env.POSTGRES_DB || process.env.TIP_DB_NAME || "transceiver_db",
|
||||||
|
user: process.env.POSTGRES_USER || process.env.TIP_DB_USER || "tip",
|
||||||
|
password: process.env.POSTGRES_PASSWORD || process.env.TIP_DB_PASS || "tip_dev_2026",
|
||||||
|
ssl: false,
|
||||||
|
};
|
||||||
|
|
||||||
export const pool = new Pool({
|
export const pool = new Pool({
|
||||||
host: process.env.POSTGRES_HOST || "localhost",
|
...poolConfig,
|
||||||
port: parseInt(process.env.POSTGRES_PORT || "5433"),
|
|
||||||
database: process.env.POSTGRES_DB || "transceiver_db",
|
|
||||||
user: process.env.POSTGRES_USER || "tip",
|
|
||||||
password: process.env.POSTGRES_PASSWORD || "tip_dev_2026",
|
|
||||||
max: 5,
|
max: 5,
|
||||||
idleTimeoutMillis: 10000,
|
idleTimeoutMillis: 10000,
|
||||||
connectionTimeoutMillis: 5000,
|
connectionTimeoutMillis: 5000,
|
||||||
|
|||||||
233
scripts/import-flexoptix-catalog.ts
Normal file
233
scripts/import-flexoptix-catalog.ts
Normal file
@ -0,0 +1,233 @@
|
|||||||
|
/**
|
||||||
|
* Import normalized Flexoptix shop catalog rows produced by Magatama/Pulso.
|
||||||
|
*
|
||||||
|
* Source file:
|
||||||
|
* ../magatama/data/pulso/shop-products.normalized.jsonl
|
||||||
|
*
|
||||||
|
* Safe first run:
|
||||||
|
* npm run flexoptix:catalog:import -- --dry-run
|
||||||
|
*
|
||||||
|
* Live DB import:
|
||||||
|
* TIP_DB_PASS=... npm run flexoptix:catalog:import
|
||||||
|
*/
|
||||||
|
|
||||||
|
import { existsSync, readFileSync } from "fs";
|
||||||
|
import { resolve } from "path";
|
||||||
|
import {
|
||||||
|
ensureVendor,
|
||||||
|
findOrCreateScrapedTransceiver,
|
||||||
|
pool,
|
||||||
|
upsertPriceObservation,
|
||||||
|
upsertStockObservation,
|
||||||
|
} from "../packages/scraper/src/utils/db";
|
||||||
|
import { contentHash } from "../packages/scraper/src/utils/hash";
|
||||||
|
|
||||||
|
type CatalogProduct = {
|
||||||
|
source: "flexoptix-shop-api";
|
||||||
|
fetchedAt: string;
|
||||||
|
sku: string;
|
||||||
|
title: string;
|
||||||
|
url: string | null;
|
||||||
|
price: {
|
||||||
|
amount: number | null;
|
||||||
|
currency: string | null;
|
||||||
|
source: "api" | "missing";
|
||||||
|
fetchedAt: string;
|
||||||
|
};
|
||||||
|
stock: {
|
||||||
|
status: string | null;
|
||||||
|
quantity: number | null;
|
||||||
|
source: "api" | "missing";
|
||||||
|
};
|
||||||
|
optics: {
|
||||||
|
formFactor: string | null;
|
||||||
|
speedGbps: number | null;
|
||||||
|
reachM: number | null;
|
||||||
|
wavelengthNm: number | null;
|
||||||
|
connector: string | null;
|
||||||
|
fiberType: string | null;
|
||||||
|
protocol: string | null;
|
||||||
|
coding: string | null;
|
||||||
|
bidi: boolean | null;
|
||||||
|
dwdm: boolean | null;
|
||||||
|
cwdm: boolean | null;
|
||||||
|
};
|
||||||
|
compatibility: Array<{
|
||||||
|
vendor: string;
|
||||||
|
platform: string | null;
|
||||||
|
coding: string | null;
|
||||||
|
source: "api";
|
||||||
|
}>;
|
||||||
|
};
|
||||||
|
|
||||||
|
type CliArgs = {
|
||||||
|
dryRun: boolean;
|
||||||
|
input: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
const repoRoot = resolve(__dirname, "..");
|
||||||
|
const defaultInput = resolve(repoRoot, "..", "..", "magatama", "data", "pulso", "shop-products.normalized.jsonl");
|
||||||
|
|
||||||
|
function parseArgs(argv: string[]): CliArgs {
|
||||||
|
const args: CliArgs = {
|
||||||
|
dryRun: false,
|
||||||
|
input: process.env.PULSO_SHOP_CATALOG_JSONL || defaultInput,
|
||||||
|
};
|
||||||
|
|
||||||
|
for (let index = 0; index < argv.length; index += 1) {
|
||||||
|
const value = argv[index];
|
||||||
|
if (value === "--") continue;
|
||||||
|
if (value === "--dry-run") {
|
||||||
|
args.dryRun = true;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
if (value === "--input") {
|
||||||
|
args.input = resolve(argv[index + 1] || args.input);
|
||||||
|
index += 1;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
if (value === "--help" || value === "-h") {
|
||||||
|
console.log(`Usage:
|
||||||
|
npm run flexoptix:catalog:import -- --dry-run
|
||||||
|
npm run flexoptix:catalog:import -- --input /path/shop-products.normalized.jsonl
|
||||||
|
|
||||||
|
Env:
|
||||||
|
PULSO_SHOP_CATALOG_JSONL Optional input path override.
|
||||||
|
TIP_DB_HOST/TIP_DB_PORT/TIP_DB_NAME/TIP_DB_USER/TIP_DB_PASS or PGPASSWORD for live DB import.
|
||||||
|
`);
|
||||||
|
process.exit(0);
|
||||||
|
}
|
||||||
|
throw new Error(`Unknown argument: ${value}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
return args;
|
||||||
|
}
|
||||||
|
|
||||||
|
function readJsonl(filePath: string): CatalogProduct[] {
|
||||||
|
if (!existsSync(filePath)) return [];
|
||||||
|
return readFileSync(filePath, "utf8")
|
||||||
|
.split("\n")
|
||||||
|
.map(line => line.trim())
|
||||||
|
.filter(Boolean)
|
||||||
|
.map(line => JSON.parse(line) as CatalogProduct);
|
||||||
|
}
|
||||||
|
|
||||||
|
function reachLabel(reachM: number | null): string | undefined {
|
||||||
|
if (reachM === null) return undefined;
|
||||||
|
if (reachM >= 1000 && reachM % 1000 === 0) return `${reachM / 1000}km`;
|
||||||
|
return `${reachM}m`;
|
||||||
|
}
|
||||||
|
|
||||||
|
function speedLabel(speedGbps: number | null): string | undefined {
|
||||||
|
if (speedGbps === null) return undefined;
|
||||||
|
return `${speedGbps}G`;
|
||||||
|
}
|
||||||
|
|
||||||
|
function categoryFor(product: CatalogProduct): string {
|
||||||
|
const text = `${product.title} ${product.optics.protocol || ""}`.toLowerCase();
|
||||||
|
if (/\bdac\b|direct attach|copper/.test(text)) return "DAC";
|
||||||
|
if (/\baoc\b|active optical/.test(text)) return "AOC";
|
||||||
|
if (/coherent|zr|dco/.test(text)) return "Coherent";
|
||||||
|
return "DataCenter";
|
||||||
|
}
|
||||||
|
|
||||||
|
function canImportProduct(product: CatalogProduct): boolean {
|
||||||
|
return Boolean(
|
||||||
|
product.sku
|
||||||
|
&& product.title
|
||||||
|
&& product.optics.formFactor
|
||||||
|
&& product.optics.speedGbps !== null
|
||||||
|
&& product.optics.reachM !== null
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
async function importProduct(product: CatalogProduct, vendorId: string): Promise<{ priceWritten: boolean; stockWritten: boolean }> {
|
||||||
|
const transceiverId = await findOrCreateScrapedTransceiver({
|
||||||
|
partNumber: product.sku,
|
||||||
|
vendorId,
|
||||||
|
productUrl: product.url || undefined,
|
||||||
|
formFactor: product.optics.formFactor || undefined,
|
||||||
|
speedGbps: product.optics.speedGbps ?? undefined,
|
||||||
|
speed: speedLabel(product.optics.speedGbps),
|
||||||
|
reachMeters: product.optics.reachM ?? undefined,
|
||||||
|
reachLabel: reachLabel(product.optics.reachM),
|
||||||
|
fiberType: product.optics.fiberType || undefined,
|
||||||
|
wavelengths: product.optics.wavelengthNm === null ? undefined : `${product.optics.wavelengthNm}nm`,
|
||||||
|
category: categoryFor(product),
|
||||||
|
});
|
||||||
|
|
||||||
|
let priceWritten = false;
|
||||||
|
if (product.price.amount !== null && product.price.currency) {
|
||||||
|
priceWritten = await upsertPriceObservation({
|
||||||
|
transceiverId,
|
||||||
|
sourceVendorId: vendorId,
|
||||||
|
price: product.price.amount,
|
||||||
|
currency: product.price.currency,
|
||||||
|
stockLevel: product.stock.status || "unknown",
|
||||||
|
quantityAvailable: product.stock.quantity ?? undefined,
|
||||||
|
url: product.url || undefined,
|
||||||
|
contentHash: contentHash({
|
||||||
|
source: product.source,
|
||||||
|
sku: product.sku,
|
||||||
|
price: product.price.amount,
|
||||||
|
currency: product.price.currency,
|
||||||
|
fetchedAt: product.price.fetchedAt,
|
||||||
|
}),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
const stockWritten = await upsertStockObservation({
|
||||||
|
transceiverId,
|
||||||
|
sourceVendorId: vendorId,
|
||||||
|
stockLevel: product.stock.status || "unknown",
|
||||||
|
quantityAvailable: product.stock.quantity ?? undefined,
|
||||||
|
priceNet: product.price.amount ?? undefined,
|
||||||
|
productUrl: product.url || undefined,
|
||||||
|
priceCurrency: product.price.currency || undefined,
|
||||||
|
stockConfidence: product.stock.quantity === null ? 1 : 2,
|
||||||
|
});
|
||||||
|
|
||||||
|
return { priceWritten, stockWritten };
|
||||||
|
}
|
||||||
|
|
||||||
|
async function main(): Promise<void> {
|
||||||
|
const args = parseArgs(process.argv.slice(2));
|
||||||
|
const products = readJsonl(args.input);
|
||||||
|
const importable = products.filter(canImportProduct);
|
||||||
|
const skipped = products.length - importable.length;
|
||||||
|
const priced = importable.filter(product => product.price.amount !== null && product.price.currency).length;
|
||||||
|
const stocked = importable.filter(product => product.stock.status || product.stock.quantity !== null).length;
|
||||||
|
|
||||||
|
console.log("Flexoptix normalized catalog import");
|
||||||
|
console.log(`Input: ${args.input}`);
|
||||||
|
console.log(`Rows: ${products.length} | importable: ${importable.length} | skipped_missing_technical_fields: ${skipped}`);
|
||||||
|
console.log(`With price: ${priced} | with stock signal: ${stocked}`);
|
||||||
|
|
||||||
|
if (args.dryRun) {
|
||||||
|
console.log("Dry-run only: no TIP database writes performed.");
|
||||||
|
await pool.end();
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
const vendorId = await ensureVendor("Flexoptix", "compatible", "https://www.flexoptix.net", "https://www.flexoptix.net");
|
||||||
|
let priceWrites = 0;
|
||||||
|
let stockWrites = 0;
|
||||||
|
|
||||||
|
try {
|
||||||
|
for (const product of importable) {
|
||||||
|
const result = await importProduct(product, vendorId);
|
||||||
|
if (result.priceWritten) priceWrites += 1;
|
||||||
|
if (result.stockWritten) stockWrites += 1;
|
||||||
|
}
|
||||||
|
} finally {
|
||||||
|
await pool.end();
|
||||||
|
}
|
||||||
|
|
||||||
|
console.log(`Import complete: ${importable.length} products processed, ${priceWrites} price observations, ${stockWrites} stock observations.`);
|
||||||
|
}
|
||||||
|
|
||||||
|
main().catch(async error => {
|
||||||
|
console.error(error instanceof Error ? error.message : error);
|
||||||
|
await pool.end().catch(() => undefined);
|
||||||
|
process.exit(1);
|
||||||
|
});
|
||||||
81
sql/102-product-verification-reconcile.sql
Normal file
81
sql/102-product-verification-reconcile.sql
Normal file
@ -0,0 +1,81 @@
|
|||||||
|
-- Migration 102: Product photo/details verification reconciliation
|
||||||
|
-- Applied after the scraper started storing many image_url/product URLs without
|
||||||
|
-- consistently promoting image_verified/details_verified.
|
||||||
|
|
||||||
|
BEGIN;
|
||||||
|
|
||||||
|
-- Backfill canonical product URLs from recent real price observations.
|
||||||
|
UPDATE transceivers t
|
||||||
|
SET product_page_url = latest.url,
|
||||||
|
updated_at = NOW()
|
||||||
|
FROM (
|
||||||
|
SELECT DISTINCT ON (po.transceiver_id)
|
||||||
|
po.transceiver_id, po.url
|
||||||
|
FROM price_observations po
|
||||||
|
WHERE po.url IS NOT NULL
|
||||||
|
AND po.url != ''
|
||||||
|
AND po.time > NOW() - INTERVAL '180 days'
|
||||||
|
ORDER BY po.transceiver_id, po.time DESC
|
||||||
|
) latest
|
||||||
|
WHERE t.id = latest.transceiver_id
|
||||||
|
AND (t.product_page_url IS NULL OR t.product_page_url = '');
|
||||||
|
|
||||||
|
-- Any non-placeholder product image URL written by a scraper counts as an
|
||||||
|
-- image verification source. Older scrapers often set only has_image/image_url.
|
||||||
|
UPDATE transceivers
|
||||||
|
SET has_image = true,
|
||||||
|
image_verified = true,
|
||||||
|
image_verified_at = COALESCE(image_verified_at, NOW()),
|
||||||
|
image_verified_url = COALESCE(NULLIF(image_verified_url, ''), image_url),
|
||||||
|
updated_at = NOW()
|
||||||
|
WHERE image_url IS NOT NULL
|
||||||
|
AND image_url != ''
|
||||||
|
AND image_url !~* '(placeholder|no-image|no_image|keinbild|logo)'
|
||||||
|
AND (image_verified = false OR image_verified IS NULL);
|
||||||
|
|
||||||
|
-- Details are verified once a crawled source URL and the core product identity
|
||||||
|
-- fields are present. This avoids marking malformed scraper rows as complete.
|
||||||
|
UPDATE transceivers
|
||||||
|
SET details_verified = true,
|
||||||
|
details_verified_at = COALESCE(details_verified_at, NOW()),
|
||||||
|
details_source_url = COALESCE(NULLIF(details_source_url, ''), product_page_url),
|
||||||
|
data_confidence = CASE
|
||||||
|
WHEN data_confidence IS NULL OR data_confidence IN ('unknown', 'enriched_estimated')
|
||||||
|
THEN 'scraped_unverified'
|
||||||
|
ELSE data_confidence
|
||||||
|
END,
|
||||||
|
updated_at = NOW()
|
||||||
|
WHERE product_page_url IS NOT NULL
|
||||||
|
AND product_page_url != ''
|
||||||
|
AND form_factor IS NOT NULL
|
||||||
|
AND speed_gbps IS NOT NULL
|
||||||
|
AND part_number IS NOT NULL
|
||||||
|
AND part_number != ''
|
||||||
|
AND reach_label IS NOT NULL
|
||||||
|
AND reach_label != ''
|
||||||
|
AND fiber_type IS NOT NULL
|
||||||
|
AND fiber_type != ''
|
||||||
|
AND COALESCE(data_confidence, 'unknown') != 'garbage'
|
||||||
|
AND (details_verified = false OR details_verified IS NULL);
|
||||||
|
|
||||||
|
-- Refresh full badge after the promotions above.
|
||||||
|
UPDATE transceivers
|
||||||
|
SET fully_verified = true,
|
||||||
|
fully_verified_at = COALESCE(fully_verified_at, NOW()),
|
||||||
|
updated_at = NOW()
|
||||||
|
WHERE competitor_verified = true
|
||||||
|
AND price_verified = true
|
||||||
|
AND image_verified = true
|
||||||
|
AND details_verified = true
|
||||||
|
AND fully_verified = false;
|
||||||
|
|
||||||
|
COMMIT;
|
||||||
|
|
||||||
|
SELECT
|
||||||
|
COUNT(*) AS total,
|
||||||
|
COUNT(*) FILTER (WHERE image_url IS NOT NULL AND image_url != '') AS has_image_url,
|
||||||
|
COUNT(*) FILTER (WHERE image_verified) AS image_verified,
|
||||||
|
COUNT(*) FILTER (WHERE product_page_url IS NOT NULL AND product_page_url != '') AS has_product_page_url,
|
||||||
|
COUNT(*) FILTER (WHERE details_verified) AS details_verified,
|
||||||
|
COUNT(*) FILTER (WHERE fully_verified) AS fully_verified
|
||||||
|
FROM transceivers;
|
||||||
@ -28,6 +28,7 @@ Date: 2026-05-09 23:09 UTC
|
|||||||
## Runner Fix
|
## Runner Fix
|
||||||
|
|
||||||
- Fixed `scripts/trigger_lane_training_once.py` locally and on Erik.
|
- Fixed `scripts/trigger_lane_training_once.py` locally and on Erik.
|
||||||
|
- MAGATAMA Gitea commit: `76d4054`.
|
||||||
- The script previously used stale API keys:
|
- The script previously used stale API keys:
|
||||||
- `iterations`
|
- `iterations`
|
||||||
- `seedOnly`
|
- `seedOnly`
|
||||||
@ -60,4 +61,3 @@ Date: 2026-05-09 23:09 UTC
|
|||||||
- MAGATAMA imports/adopts it locally,
|
- MAGATAMA imports/adopts it locally,
|
||||||
- smoke checks pass,
|
- smoke checks pass,
|
||||||
- the active alias/version is updated.
|
- the active alias/version is updated.
|
||||||
|
|
||||||
|
|||||||
Loading…
x
Reference in New Issue
Block a user