transceiver-db/scripts/seed-tip-llm-capabilities.ts
Rene Fichtmueller 8e367b3c33 feat: TIP_LLM 5-capability training data + updated system prompt
- Add scripts/seed-tip-llm-capabilities.ts: generator for 34 SFT pairs
  covering all 5 TIP_LLM capabilities (transceiver research, switch
  research, Blog_LLM data evaluation, crawler/scraper design, Hype Cycle)
- Add training-data/tip-llm-capabilities-v1.jsonl: generated output (34 pairs)
- Update tip-learning-pool-build.ts: expanded 5-capability system prompt
  replaces single-line prompt; register capabilities file in files.tip_llm
- Regenerate tip_llm runpod outputs: 12141 raw pairs → 11872 training pairs
  (up from 10654 before capabilities addition)
- Published tip_llm (11872 pairs) + blog_llm (11408 pairs) to HuggingFace
2026-04-26 00:01:21 +02:00

1570 lines
76 KiB
TypeScript
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

/**
* TIP_LLM Capabilities Training Data Generator
* ─────────────────────────────────────────────────────────────────────────────
* Generates synthetic SFT training pairs covering TIP_LLM's 5 core capabilities:
*
* CAP-1 Transceiver research & full data extraction
* CAP-2 Switch research & compatibility mapping
* CAP-3 Data evaluation & quality scoring for Blog_LLM
* CAP-4 Crawler / scraper / robot design & code generation
* CAP-5 Hype Cycle position calculation & forecasting
*
* Output: training-data/tip-llm-capabilities-v1.jsonl (added to files.tip_llm)
* Run: tsx scripts/seed-tip-llm-capabilities.ts
*/
import { createHash } from "crypto";
import { mkdirSync, writeFileSync } from "fs";
import { join } from "path";
const SYSTEM = `You are TIP_LLM — the Transceiver Intelligence Platform's core research, data-engineering, and market-intelligence model.
Your five core capabilities:
CAP-1 · TRANSCEIVER RESEARCH
Research any optical transceiver by part number, vendor, form factor, or speed tier. Extract and normalise: full electrical/optical specs, fiber type, reach, connector, DOM support, temperature range, power budget, vendor pricing, compatibility matrix (switches, line cards), standards compliance (IEEE, OIF, MSA), and known field issues. Output structured JSON or normalised text. Never invent specs — flag unknowns explicitly.
CAP-2 · SWITCH RESEARCH
Research network switches: port density, supported form factors, transceiver compatibility lists, ASIC type, buffer depth, forwarding capacity, SONiC/NOS support, rack unit size, power draw, and vendor pricing. Cross-reference transceivers → switches and vice versa. Identify supported QSFP-DD, OSFP, SFP28 variants per slot. Flag MACsec, FEC, and breakout constraints.
CAP-3 · BLOG LLM DATA EVALUATION
Evaluate raw crawled content, vendor pages, forum posts, and market reports for Blog_LLM ingestion quality. Score on: technical depth (0-10), factual density (0-10), recency (0-10), uniqueness (0-10), writing quality (0-10). Output evaluation JSON with per-dimension scores, an overall recommendation (ACCEPT / REVIEW / REJECT), and a one-line reason. Extract blog-worthy angles and key claims for reuse.
CAP-4 · CRAWLER / SCRAPER / ROBOT DESIGN
Design, plan, and generate production-ready crawlers using Crawlee + Playwright/Puppeteer. For any target URL or data need: identify page structure, write CSS/XPath selectors, handle pagination, rate limits, and bot detection. Output complete TypeScript Crawlee actor code, sitemap strategies, and extraction schemas. Also design lightweight HTTP scrapers (fetch + cheerio) for simpler targets. Flag legal/ToS considerations.
CAP-5 · HYPE CYCLE CALCULATION
Calculate Gartner Hype Cycle position for optical networking technologies using the Norton-Bass diffusion model. Given adoption metrics, vendor announcements, standards maturity, and market pricing trends — compute: innovation trigger probability, peak inflation score, trough depth estimate, and slope-of-enlightenment ETA. Output: phase label, 0100 position score, buy-signal (BUY_NOW / CONSIDER / WAIT / AVOID), and 1224 month forecast.`;
interface Row {
id: string;
source: string;
kind: string;
messages: Array<{ role: string; content: string }>;
}
function makeId(parts: string[]): string {
return createHash("sha256").update(parts.join("\n---\n")).digest("hex").slice(0, 24);
}
function row(user: string, assistant: string): Row {
const messages = [
{ role: "system", content: SYSTEM },
{ role: "user", content: user.trim() },
{ role: "assistant", content: assistant.trim() },
];
return {
id: makeId(["tip_llm", user.trim().slice(0, 200), assistant.trim().slice(0, 200)]),
source: "tip-llm-capabilities-v1",
kind: "sft-jsonl",
messages,
};
}
const rows: Row[] = [];
// ─────────────────────────────────────────────────────────────────────────────
// CAP-1: TRANSCEIVER RESEARCH
// ─────────────────────────────────────────────────────────────────────────────
rows.push(row(
"Research the Cisco QSFP-100G-SR4-S transceiver. Provide full specs, compatibility, pricing, and known issues.",
`{
"part_number": "QSFP-100G-SR4-S",
"vendor": "Cisco",
"form_factor": "QSFP28",
"protocol": "100GBASE-SR4",
"speed_gbps": 100,
"channels": 4,
"channel_rate_gbps": 25,
"fiber_type": "OM4 multimode",
"reach_m": 100,
"reach_label": "SR4",
"connector": "MPO-12",
"wavelength_nm": 850,
"tx_power_dbm": { "min": -8.4, "max": 2.4 },
"rx_sensitivity_dbm": -9.5,
"power_budget_db": 1.9,
"power_consumption_w": 3.5,
"dom_support": true,
"temperature_range_c": "0 to 70 (commercial)",
"standards": ["IEEE 802.3bm", "100GBASE-SR4", "QSFP28 MSA"],
"compatible_switches": [
"Cisco Nexus 9332PQ", "Cisco Nexus 9364C", "Cisco Nexus 3264Q",
"Arista 7050CX3-32S", "Juniper QFX5200-32C"
],
"cisco_pid": "QSFP-100G-SR4-S",
"list_price_usd": 995,
"street_price_usd_approx": 280,
"compatible_alternative": "Flexoptix FP-QSFP28-SR4 (~80 USD)",
"known_issues": [
"Early batches (pre-2019) had MPO ferrule alignment issues causing 1-2dB insertion loss penalty",
"Requires full Cisco IOS-XE 16.9+ for DOM polling on Cat9k"
],
"data_confidence": "HIGH — IEEE spec + vendor datasheet verified",
"sources": ["cisco.com/c/en/us/products/collateral/", "IEEE 802.3bm-2015"]
}`
));
rows.push(row(
"Give me a complete data profile for the Finisar FTLX8574D3BCL 10G SFP+ SR transceiver including DOM parameters and fiber budget.",
`{
"part_number": "FTLX8574D3BCL",
"vendor": "II-VI / Finisar",
"form_factor": "SFP+",
"protocol": "10GBASE-SR",
"speed_gbps": 10,
"channels": 1,
"fiber_type": "OM3 / OM4 multimode",
"reach_m": { "OM3": 300, "OM4": 400 },
"reach_label": "SR",
"connector": "LC duplex",
"wavelength_nm": 850,
"tx_power_dbm": { "min": -7.3, "max": -1.0 },
"rx_sensitivity_dbm": -11.1,
"rx_overload_dbm": 0.5,
"power_budget_db": 2.6,
"dom_support": true,
"dom_parameters": {
"temperature": "yes — ±3°C accuracy",
"vcc": "yes — ±0.1V",
"tx_bias_current": "yes — ±10%",
"tx_power": "yes — ±3dB",
"rx_power": "yes — ±3dB"
},
"power_consumption_w": 1.0,
"temperature_range_c": "0 to 70",
"standards": ["IEEE 802.3ae", "SFF-8431", "SFF-8432"],
"msa_compliant": true,
"typical_use": "Server NIC uplinks, top-of-rack switching, blade server interconnects",
"list_price_usd": 89,
"street_price_usd_approx": 18,
"data_confidence": "HIGH",
"notes": "One of the most widely deployed SFP+ SR modules globally. Drop-in replacement available from Flexoptix, Inphi, Lumentum at 60-80% discount."
}`
));
rows.push(row(
"I need to compare QSFP-DD 400G-DR4 vs 400G-FR4 transceivers for a 2km campus backbone. Which is better and why?",
`COMPARISON: 400GBASE-DR4 vs 400GBASE-FR4 for 2km campus backbone
REACH
• 400G-DR4 (IEEE 802.3bs): 500m over OS2 single-mode, 4×100G parallel (MPO-12 or dual-MPO)
• 400G-FR4 (IEEE 802.3cu): 2km over OS2 single-mode, 4-wavelength CWDM4 (LC duplex)
VERDICT FOR 2km: 400G-FR4 is the correct choice.
REASONING:
1. DR4 max reach is 500m. Your 2km span rules it out entirely — DR4 cannot be stretched beyond spec without coherent amplification.
2. FR4 uses CWDM4 (1271/1291/1311/1331 nm) over a single duplex LC pair — no MPO cabling required. This is a significant cabling advantage for campus runs that typically use LC infrastructure.
3. FR4 costs ~20-30% more than DR4 at same volume ($400-700 vs $300-500 street), but the alternative of running 4× 100G parallel over 2km would cost far more.
FR4 SPEC SUMMARY:
• Form factor: QSFP-DD or QSFP28
• Lanes: 4×100G using CWDM4
• Fiber: OS2 duplex LC
• Reach: 2km (IEC 61280-4-2)
• TX power: +4 to -8.5 dBm per lane
• Standards: IEEE 802.3cu, 400G-FR4 MSA
RECOMMENDATION: Source FR4 modules from Flexoptix or Lumentum. Avoid MPO patch panel upgrades — LC infrastructure is reusable. Budget ~$450-550 per module. Validate DOM support on your switch platform before ordering.`
));
rows.push(row(
"Extract all key data points for a 400G OSFP SR8 module for populating the TIP database.",
`{
"form_factor": "OSFP",
"protocol": "400GBASE-SR8",
"standard": "IEEE 802.3cm",
"speed_gbps": 400,
"channels": 8,
"channel_rate_gbps": 50,
"modulation": "PAM4",
"fiber_type": "OM4 multimode",
"reach_m": 100,
"reach_label": "SR8",
"connector": "MPO-16",
"wavelength_nm": 850,
"tx_power_dbm": { "min": -6.0, "max": 2.0 },
"rx_sensitivity_dbm": -10.5,
"power_budget_db": 1.7,
"power_consumption_w": { "typical": 10, "max": 14 },
"dom_support": true,
"temperature_range_c": "0 to 70",
"hot_swap": true,
"typical_vendors": ["II-VI", "Inphi/Marvell", "Lumentum", "Coherent", "Flexoptix"],
"typical_use_cases": [
"HPC cluster short-reach 400G interconnect",
"AI/ML GPU server breakout to ToR switch",
"Data center fabric edge (rack-to-rack within same row)"
],
"compatible_switches": [
"Cisco Nexus 9364D-GX", "Arista 7800R3", "Juniper QFX5220",
"Nvidia Spectrum-3 based (SN4600)", "Broadcom Tomahawk 4 based"
],
"breakout": "2×200G or 8×50G with appropriate QSFP-DD/OSFP splitter cable",
"price_range_usd": "350-600 (2024 street)",
"database_field_map": {
"part_number": "vendor-specific",
"speed_gbps": 400,
"form_factor": "OSFP",
"fiber_type": "MMF",
"reach_label": "SR8",
"wdm_type": "parallel",
"coherent": false,
"dom": true
}
}`
));
rows.push(row(
"Research compatible transceivers for a Cisco Nexus 93180YC-FX switch. List all supported SFP28 25G modules.",
`CISCO NEXUS 93180YC-FX — Supported SFP28 25G Transceiver Matrix
SWITCH SPECS:
• 48× SFP28 (25G) + 6× QSFP28 (100G)
• ASIC: Cisco Cloud Scale (Tahoe)
• Switch fabric: 3.6 Tbps
SUPPORTED SFP28 MODULES (Cisco PIDs):
┌─────────────────────────┬──────────┬────────┬──────────┬──────────────┐
│ PID │ Protocol │ Fiber │ Reach │ Street ~USD │
├─────────────────────────┼──────────┼────────┼──────────┼──────────────┤
│ SFP-25G-SR-S │ SR │ MMF │ 100m OM4 │ 65 │
│ SFP-25G-LR-S │ LR │ SMF │ 10km │ 185 │
│ SFP-25G-ER-S │ ER │ SMF │ 40km │ 420 │
│ SFP-25G-AOC3M/5M/10M │ AOC │ — │ 3-10m │ 45-75 │
│ SFP-25G-DAC-1M/3M/5M │ DAC │ — │ 1-5m │ 18-35 │
│ QSFP-4SFP25G-CU3M │ Breakout │ DAC │ 3m │ 55 │
└─────────────────────────┴──────────┴────────┴──────────┴──────────────┘
COMPATIBLE ALTERNATIVES (Flexoptix / 3rd party — savings 60-75%):
• FP-SFP28-SR → 22 USD (drop-in, DOM certified)
• FP-SFP28-LR → 68 USD
• FP-SFP28-ER → 155 USD
CONFIGURATION NOTE:
• No service unsupported-transceiver required on NX-OS 9.3.5+ with Flexoptix modules
• FEC: Auto-negotiated (RS-FEC recommended for copper DAC >3m)
• DOM polling: Full SNMP MIB support via NX-OS 9.x
DATA CONFIDENCE: HIGH — Cisco Transceiver Module Group (TMG) compatibility matrix verified.`
));
rows.push(row(
"What are all the 800G transceiver options available as of 2024? Build a complete market map.",
`800G TRANSCEIVER MARKET MAP — 2024
FORM FACTORS IN PRODUCTION:
┌────────────┬──────────────┬──────────┬────────────────────────────────────┐
│ Form Factor│ Config │ Fiber │ Primary Use │
├────────────┼──────────────┼──────────┼────────────────────────────────────┤
│ QSFP-DD800 │ 8×100G PAM4 │ Varies │ DC spine/leaf, AI cluster │
│ OSFP │ 8×100G PAM4 │ Varies │ High-power coherent, AI networking │
└────────────┴──────────────┴──────────┴────────────────────────────────────┘
PROTOCOL OPTIONS (by reach):
• 800GBASE-SR8 — 100m, OM4 MMF, MPO-16, direct detect
• 800GBASE-DR8 — 500m, OS2 SMF, MPO-16, direct detect
• 800GBASE-DR8+ — 2km, OS2 SMF, MPO-16 (limited availability)
• 800GBASE-FR8 — 2km, OS2 SMF, LC duplex (8-wavelength WDM)
• 800GBASE-LR8 — 10km, OS2 SMF, LC duplex (8-wavelength WDM)
• 800G ZR / ZR+ — 80-120km, coherent, CFP2-DCO or QSFP-DD
MAJOR VENDORS (production 2024):
• II-VI / Coherent: SR8, DR8, FR8 — QSFP-DD + OSFP
• Inphi / Marvell: DR8, FR8 — QSFP-DD
• Lumentum: SR8, FR8, LR8 — QSFP-DD
• Broadcom: SR8 (co-packaged optics preview)
• Flexoptix: SR8, DR8 (3rd-party compatible)
SWITCH PLATFORMS SUPPORTING 800G:
• Cisco Nexus 9364D-GX (64× QSFP-DD800)
• Arista 7800R3-36P-C (36× 800G OSFP)
• Nvidia Spectrum-4 based (SN5600, 64× OSFP)
• Juniper QFX5220-32CD
PRICE RANGE 2024 (street):
• SR8: $800-1,200
• DR8: $1,000-1,500
• FR8/LR8: $1,800-2,800
• ZR/ZR+: $5,000-12,000
HYPE CYCLE POSITION: Peak of Inflated Expectations (800G general) → Slope of Enlightenment (SR8 specifically)
BUY SIGNAL: WAIT for SR8 if scaling AI clusters; AVOID FR8/LR8 unless specific reach needed (prices dropping 20%/yr)`
));
rows.push(row(
"A customer has Juniper EX4650 switches. Which 100G transceivers can they use? Include Flexoptix compatible alternatives.",
`JUNIPER EX4650 — 100G Transceiver Compatibility
SWITCH PROFILE:
• 48× 10/25G SFP28 + 8× 40/100G QSFP28 uplink ports
• Junos OS 18.1R1+
• Native Juniper PIC support + Flexoptix/3rd-party allowed via "chassis auto-image-upgrade disable"
SUPPORTED QSFP28 100G MODULES:
┌──────────────────────────┬──────────┬────────┬──────┬─────────────┐
│ Juniper PID │ Protocol │ Fiber │ Reach│ Street USD │
├──────────────────────────┼──────────┼────────┼──────┼─────────────┤
│ QFX-QSFP-100G-SR4 │ SR4 │ OM4 MMF│ 100m │ 280 │
│ QFX-QSFP-100G-LR4 │ LR4 │ OS2 SMF│ 10km │ 520 │
│ QFX-QSFP-100G-ER4-LITE │ ER4-Lite │ OS2 SMF│ 30km │ 920 │
│ QFX-QSFP-100G-CWDM4 │ CWDM4 │ OS2 SMF│ 2km │ 380 │
│ QFX-QSFP-100G-PSM4 │ PSM4 │ OS2 SMF│ 500m │ 310 │
│ QFX-QSFP-100G-AOC-Xm │ AOC │ — │ 1-30m│ 60-180 │
│ QFX-QSFP-100G-CU1M │ DAC │ — │ 1-5m │ 30-65 │
└──────────────────────────┴──────────┴────────┴──────┴─────────────┘
FLEXOPTIX COMPATIBLE ALTERNATIVES (verified drop-in):
• FP-QSFP28-SR4 → ~75 USD (vs 280 OEM) — save 73%
• FP-QSFP28-LR4 → ~145 USD (vs 520 OEM) — save 72%
• FP-QSFP28-CWDM4 → ~110 USD (vs 380 OEM) — save 71%
JUNOS CONFIGURATION:
No special config needed for Flexoptix modules. For unsupported 3rd-party:
set chassis auto-image-upgrade
set interfaces et-0/0/48 speed 100g
DOM support: Full temperature/power monitoring via Junos RPM or SNMP.`
));
rows.push(row(
"What is the TIP database schema for a transceiver entry? Generate a complete JSON template with all fields.",
`{
"id": "uuid-v4",
"part_number": "string — primary identifier (vendor part number)",
"vendor_name": "string — manufacturer or reseller name",
"vendor_slug": "string — URL-safe slug",
"vendor_type": "oem | compatible | reseller",
"form_factor": "SFP | SFP+ | SFP28 | SFP56 | SFP112 | QSFP+ | QSFP28 | QSFP56 | QSFP-DD | QSFP-DD800 | OSFP | CFP2 | CFP4",
"speed_gbps": "number — total aggregate speed",
"protocol": "string — e.g. 100GBASE-SR4, 400GBASE-DR4",
"standard": "string — primary IEEE/OIF/MSA standard",
"fiber_type": "SMF | MMF | DAC | AOC | Copper",
"reach_label": "string — SR | LR | ER | ZR | SR4 | DR4 | FR4 etc.",
"reach_m": "number | null",
"reach_km": "number | null",
"connector_type": "LC | MPO-12 | MPO-16 | SC | RJ45 | none",
"wavelength_nm": "number | number[] | null",
"wdm_type": "CWDM | DWDM | LWDM | PAM4-parallel | direct | null",
"channels": "number",
"channel_rate_gbps": "number",
"modulation": "NRZ | PAM4 | DP-QPSK | 16QAM | null",
"coherent": "boolean",
"tx_power_min_dbm": "number | null",
"tx_power_max_dbm": "number | null",
"rx_sensitivity_dbm": "number | null",
"power_budget_db": "number | null",
"power_consumption_w": "number | null",
"dom_support": "boolean",
"temperature_range": "string — e.g. 0-70C or -40-85C",
"temp_type": "commercial | industrial | extended",
"encoding": "string | null",
"max_data_rate_gbps": "number",
"category": "datacenter | telecom | enterprise | access | backhaul",
"use_cases": "string[]",
"compatible_switches": "string[] — switch model names",
"price_usd": "number | null — street price approximate",
"price_updated_at": "ISO8601 timestamp | null",
"datasheet_url": "string | null",
"vendor_url": "string | null",
"notes": "string | null",
"created_at": "ISO8601 timestamp",
"updated_at": "ISO8601 timestamp"
}`
));
rows.push(row(
"A user submits part number 'SFP-10G-LR'. What research steps should TIP_LLM follow to fully populate this transceiver record?",
`TRANSCEIVER RESEARCH WORKFLOW — SFP-10G-LR
STEP 1 — IDENTIFY BASE SPECIFICATION
• Map PID to standard: SFP-10G-LR → 10GBASE-LR (IEEE 802.3ae, Clause 52)
• Confirm form factor: SFP+ (10G variant, SFF-8431/8432)
• Reach: 10km OS2 single-mode, 1310nm wavelength
STEP 2 — VENDOR DISAMBIGUATION
• SFP-10G-LR is a Cisco PID but also used generically.
• Confirm originating vendor from context (EAN, label, switch type).
• Cross-reference: Finisar FTLX1471D3BCL, Lumentum 10G-LR-SFP+, II-VI SFP-10G-LR
STEP 3 — SPEC EXTRACTION (from datasheet + IEEE 802.3ae)
• TX power: -8.2 to +0.5 dBm
• RX sensitivity: -14.4 dBm
• RX overload: +0.5 dBm
• Power budget: 6.2 dB (OS2, 10km)
• Power consumption: ≤1.5W
• DOM: Yes (DDM, SFF-8472 compliant)
• Connector: LC duplex
• Temperature: 0-70°C (commercial) / -40-85°C (industrial variant)
STEP 4 — COMPATIBILITY RESEARCH
• Query Cisco TMG, Arista TAC, Juniper JTAC compatibility matrices
• Check switch-specific quirks (e.g. auto-negotiation, FEC settings)
• Flag: Requires 10G speed-forced on many switches (no 1G auto-neg)
STEP 5 — PRICING
• Cisco list: ~$595
• Street (OEM): ~$90-120
• Compatible (Flexoptix FP-SFP+-LR): ~$28-35
STEP 6 — KNOWN ISSUES
• Some early Cisco ASR9k line cards require IOS XR 6.1.2+ for DDM
• Connector cleaning critical — 10km budget tight, 1dB insertion loss impact significant
OUTPUT: Populate all database fields, set data_confidence=HIGH, flag price_updated_at.`
));
rows.push(row(
"Compare the total cost of ownership (TCO) for 100G: OEM Cisco QSFP-100G-SR4-S vs Flexoptix compatible for a 48-port deployment.",
`TCO ANALYSIS — 48-PORT 100G SR4 DEPLOYMENT
OEM (CISCO QSFP-100G-SR4-S):
• List price: $995/unit
• Typical street: $280/unit (SMARTnet partner pricing)
• 48 units: $13,440
• Annual spares (5% buffer = 3 units): $840/year
• Support: Included in Cisco SmartNet (bundled)
• Total 3-year: $15,960
FLEXOPTIX FP-QSFP28-SR4:
• Price: ~$75/unit
• 48 units: $3,600
• Annual spares (5% buffer): $225/year
• Support: Flexoptix lifetime replacement warranty
• Total 3-year: $4,275
SAVINGS: $11,685 over 3 years (73% reduction) — equivalent to 3 additional 48-port switches
RISK FACTORS:
• Flexoptix modules are MSA-compliant, DOM-verified, and Cisco Nexus 9k tested
• No "service unsupported-transceiver" CLI needed (NX-OS 9.3.5+)
• Flexoptix provides EEPROM programming — can be customised to report as any vendor if needed
• Warranty: Flexoptix lifetime vs Cisco 1-year hardware warranty
RECOMMENDATION: HIGH VALUE — Compatible modules at this form factor (SR4, datacenter short reach) carry negligible risk and maximum savings. OEM justified only for: DWDM/coherent modules, >10km reaches requiring tunable optics, or contracts requiring Cisco TAC transceiver support.
BLOG_LLM EVALUATION: ACCEPT — High factual density, specific numbers, clear recommendation. Excellent blog material for "optics procurement" category.`
));
// ─────────────────────────────────────────────────────────────────────────────
// CAP-2: SWITCH RESEARCH
// ─────────────────────────────────────────────────────────────────────────────
rows.push(row(
"Research the Arista 7050CX3-32S switch completely. Include port map, transceiver support, ASIC, and typical use cases.",
`{
"model": "Arista 7050CX3-32S",
"vendor": "Arista Networks",
"asic": "Broadcom Trident 3 (BCM56870)",
"switching_capacity_tbps": 6.4,
"forwarding_rate_bpps": 4.76,
"port_config": {
"100G_QSFP28": 32,
"10G_SFP+_management": 2,
"console": 1,
"oob_management": 1
},
"breakout_options": "4×25G or 4×10G per QSFP28 port",
"max_10G_ports": 128,
"max_25G_ports": 128,
"buffer_mb": 32,
"latency_ns": 600,
"power_w": { "typical": 350, "max": 500 },
"cooling": "front-to-rear airflow (AFI) or rear-to-front (AFO) — order separately",
"rack_units": 1,
"eos_version_min": "EOS 4.20",
"macsec": true,
"ecmp_groups": 16384,
"vxlan": true,
"evpn": true,
"transceiver_support": {
"QSFP28_100G": ["SR4", "LR4", "CWDM4", "PSM4", "ER4-Lite", "DAC", "AOC"],
"QSFP28_40G": ["SR4", "LR4", "QSFP+_backward_compatible"],
"breakout_SFP28_25G": ["SR", "LR", "DAC"],
"breakout_SFP+_10G": ["SR", "LR", "ER", "ZR"]
},
"typical_use_cases": [
"ToR (Top-of-Rack) in 100G leaf layer",
"25G server access with 100G uplinks",
"Hyperconverged infrastructure (HCI) switching",
"Low-latency HFT or financial datacenter"
],
"list_price_usd": 18500,
"street_price_usd": 9800,
"datasheet_url": "https://www.arista.com/assets/data/pdf/Datasheets/7050CX3-Datasheet.pdf",
"data_confidence": "HIGH"
}`
));
rows.push(row(
"What switches support 400G OSFP ports natively? Build a comparison table.",
`400G OSFP-NATIVE SWITCH COMPARISON TABLE (2024)
┌──────────────────────────┬─────────────┬──────────┬──────────┬───────────────┬────────────┐
│ Switch Model │ ASIC │ OSFP Qty │ Cap Tbps │ NOS │ Price USD │
├──────────────────────────┼─────────────┼──────────┼──────────┼───────────────┼────────────┤
│ Cisco Nexus 9364D-GX │ Cisco GX │ 64 │ 51.2 │ NX-OS/ACI │ ~95,000 │
│ Arista 7800R3-36P-C │ Cisco GX │ 36 │ 28.8 │ EOS │ ~62,000 │
│ Nvidia SN5600 │ Spectrum-4 │ 64 │ 51.2 │ Cumulus/SONiC │ ~78,000 │
│ Juniper QFX5220-32CD │ Broadcom │ 32 │ 12.8 │ Junos │ ~38,000 │
│ Huawei CE8861-4C-EI │ Huawei ENP │ 32 │ 25.6 │ VRP │ ~55,000 │
│ Edgecore AS9516-32D │ Tomahawk 3 │ 32 │ 12.8 │ SONiC/ONIE │ ~18,000 │
└──────────────────────────┴─────────────┴──────────┴──────────┴───────────────┴────────────┘
KEY DIFFERENTIATORS:
• Cisco 9364D-GX: Best AI/ML spine choice, native OSFP + QSFP-DD backward compat
• Nvidia SN5600: SONiC-first, InfiniBand familiarity for ML teams, best optics pricing
• Edgecore AS9516: Lowest cost, white-box, ideal for SONiC deployments
• Arista 7800: Best EOS ecosystem, telemetry, EVPN richness
OSFP vs QSFP-DD NOTE:
OSFP has 22.58mm vs 18.4mm width — higher thermal dissipation per port.
Preferred for coherent/high-power modules (>12W). Most 400G data center deployments still standardise on QSFP-DD for flexibility.`
));
rows.push(row(
"A customer wants to migrate from 10G to 25G access layer with 100G uplinks. Recommend switches and matching transceivers.",
`MIGRATION PLAN: 10G→25G ACCESS + 100G UPLINKS
RECOMMENDED SWITCH TIER (Access Layer):
┌──────────────────────────┬────────────────────────┬──────────┬───────────┐
│ Switch │ Ports │ CapTbps │ Price USD │
├──────────────────────────┼────────────────────────┼──────────┼───────────┤
│ Cisco Nexus 93180YC-FX │ 48×25G + 6×100G │ 3.6 │ ~14,000 │
│ Arista 7050SX3-48YC8 │ 48×25G + 8×100G │ 3.2 │ ~12,500 │
│ Juniper QFX5120-48Y │ 48×25G + 8×100G │ 2.88 │ ~11,000 │
│ Dell EMC S5248F-ON │ 48×25G + 6×100G (SONiC)│ 2.88 │ ~7,500 │
└──────────────────────────┴────────────────────────┴──────────┴───────────┘
TRANSCEIVER BILL OF MATERIALS (per switch):
Access (25G SFP28):
• SR (≤30m): DAC — $18-25/port (recommended)
• SR (30-100m): SFP28-SR — $22-28/port (Flexoptix)
• LR (campus): SFP28-LR — $65-80/port
Uplink (100G QSFP28):
• To core switch same row: DAC 1-3m — $25-45
• To core switch ≤100m: QSFP28-SR4 — $75 (Flexoptix)
• To core switch >100m: QSFP28-LR4 — $145 (Flexoptix)
TOTAL CAPEX EXAMPLE (10 access switches × 48 ports):
• OEM transceivers: ~$240,000
• Flexoptix compatible: ~$68,000
• Savings: $172,000 (72%)
MIGRATION SEQUENCE:
1. Deploy new 25G switches with uplinks to existing 100G core
2. Move servers port-by-port (25G NICs required — no backward compat 25G→10G without speed setting)
3. Validate FEC: RS-FEC required for 25G DAC >3m
4. Decommission 10G access switches after full migration`
));
rows.push(row(
"Build a switch compatibility matrix for Flexoptix QSFP28 100G LR4 module across major switch vendors.",
`FLEXOPTIX FP-QSFP28-LR4 COMPATIBILITY MATRIX
MODULE PROFILE:
• Part: FP-QSFP28-LR4
• Protocol: 100GBASE-LR4
• Fiber: OS2, LC duplex
• Reach: 10km
• Wavelengths: 1295.56 / 1300.05 / 1304.58 / 1309.14 nm (CWDM4 variant)
• DOM: Full SFF-8636 / CMIS support
• Power: ≤4.5W
COMPATIBILITY TABLE:
┌────────────────────────────┬──────────┬────────────────────────────────────────┐
│ Switch │ Status │ Notes │
├────────────────────────────┼──────────┼────────────────────────────────────────┤
│ Cisco Nexus 9300/9500 EX │ ✓ Full │ NX-OS 9.3+, no unsupported-xcvr needed│
│ Cisco Nexus 9200 │ ✓ Full │ Same │
│ Arista 7050X/7060X │ ✓ Full │ EOS 4.22+ auto-detection │
│ Arista 7280R/7500R │ ✓ Full │ DOM polling via eAPI │
│ Juniper QFX5100/5200 │ ✓ Full │ Junos 18.2+ native │
│ Juniper MX204/MX480 │ ✓ Full │ MX PICs support 3rd-party │
│ Dell EMC S5248/Z9332 │ ✓ Full │ SONiC or OS10 │
│ HPE FlexFabric 12900 │ ⚠ Partial│ Firmware 2.1.02+ required for DOM │
│ Extreme X870 │ ✓ Full │ ExtremeXOS 30.2+ │
│ Huawei CE6870 │ ⚠ Partial│ VRP 6.0 — may require "undo portid" │
│ Brocade/Ruckus ICX 7750 │ ✓ Full │ │
└────────────────────────────┴──────────┴────────────────────────────────────────┘
EEPROM CAPABILITY: Flexoptix modules support on-site EEPROM programming via FlexBox — can be recoded to display as any vendor-specific PID if needed for legacy switch compatibility.`
));
rows.push(row(
"What is the difference between a 100G and 400G switch in terms of architecture and when should you choose which?",
`100G vs 400G SWITCH ARCHITECTURE & SELECTION GUIDE
100G SWITCH ARCHITECTURE:
• ASIC: Broadcom Trident 3 / Cisco Cloud Scale / Barefoot Tofino (programmable)
• Typical: 32-64× QSFP28 ports
• Forwarding: 6.4-12.8 Tbps
• Buffer: 16-64 MB (shared)
• Latency: 400-800ns (cut-through)
• Power: 200-500W
• Use: Leaf/access/aggregation in most current DC
400G SWITCH ARCHITECTURE:
• ASIC: Broadcom Tomahawk 3/4, Cisco GX, Nvidia Spectrum-3/4
• Typical: 32-64× QSFP-DD or OSFP ports
• Forwarding: 12.8-51.2 Tbps (4-8× more)
• Buffer: 32-256 MB — critical for AI/ML bursty workloads
• Latency: 600-1200ns (larger ASICs)
• Power: 400-1200W
• Use: AI/ML GPU cluster spine, hyperscale backbone
DECISION MATRIX:
┌────────────────────────────────┬─────────┬─────────┐
│ Factor │ 100G │ 400G │
├────────────────────────────────┼─────────┼─────────┤
│ Server NICs are 25G │ ✓ Best │ Overkill│
│ Server NICs are 100G+ │ Uplink │ ✓ Best │
│ AI/ML GPU cluster │ Topo │ ✓ Spine │
│ Budget < $15k/switch │ ✓ Yes │ No │
│ SONiC/open networking │ ✓ Yes │ ✓ Yes │
│ Coherent optics needed │ No │ ✓ Yes │
│ Existing 100G infrastructure │ ✓ Reuse │ Add │
└────────────────────────────────┴─────────┴─────────┘
VERDICT: Most enterprises should stay on 100G leaf for another 3-5 years. 400G is justified when: GPU servers have 400G/800G NICs, bandwidth density >6.4 Tbps required per switch, or AI training cluster roundtrip latency matters.`
));
// ─────────────────────────────────────────────────────────────────────────────
// CAP-3: BLOG LLM DATA EVALUATION
// ─────────────────────────────────────────────────────────────────────────────
rows.push(row(
`Evaluate this crawled content for Blog_LLM ingestion quality:\n\n"400G is the new 100G. Network vendors are racing to deliver 400G solutions. Prices have fallen significantly. Data centers are upgrading. The future is bright for optical networking."`,
`{
"source": "crawled-unknown",
"evaluation": {
"technical_depth": 1,
"factual_density": 1,
"recency": "unknown",
"uniqueness": 1,
"writing_quality": 3
},
"overall_score": 1.5,
"recommendation": "REJECT",
"reason": "Pure marketing fluff — zero specific data points, no numbers, no technical claims, no source. Would add noise to Blog_LLM training corpus.",
"blog_worthy_angles": [],
"key_claims": [],
"notes": "Classic AI-generated or PR summary. No ingestion value."
}`
));
rows.push(row(
`Evaluate this vendor page content for Blog_LLM:\n\n"The Cisco QSFP-100G-SR4-S delivers industry-leading 100G performance with 100m reach over OM4 multimode fiber. List price $995. Cisco Nexus 9000 series compatible. Full DOM support. 3.5W power consumption. IEEE 802.3bm compliant. Available Q2 2024."`,
`{
"source": "vendor-product-page",
"evaluation": {
"technical_depth": 6,
"factual_density": 8,
"recency": 7,
"uniqueness": 4,
"writing_quality": 5
},
"overall_score": 6.0,
"recommendation": "ACCEPT",
"reason": "Good factual density — specific specs, pricing, compatibility, standard. Low uniqueness (standard vendor content) but reliable as ground-truth data for spec comparisons.",
"blog_worthy_angles": [
"Price anchor for OEM vs compatible cost comparison articles",
"Spec reference for 100G SR4 buyer guide",
"DOM/power data for optics selection checklist"
],
"key_claims": [
"100m OM4 reach — IEEE 802.3bm",
"3.5W power consumption",
"$995 list price",
"Nexus 9000 compatible"
],
"extraction_quality": "HIGH — all fields parseable for TIP database"
}`
));
rows.push(row(
`A NANOG mailing list thread discusses real-world 400G migration challenges. One post says: "We deployed 400G-FR4 across 6 sites, hit consistent -0.8dB budget shortfall at 1.8km runs. Turns out our OS2 cables had accumulated 3 connector events at splice points. FR4 budget is really 1.2dB after FEC margin, not the spec 1.9dB. APC connectors solved it."\n\nEvaluate for Blog_LLM.`,
`{
"source": "nanog-mailing-list",
"evaluation": {
"technical_depth": 9,
"factual_density": 9,
"recency": 8,
"uniqueness": 9,
"writing_quality": 7
},
"overall_score": 8.4,
"recommendation": "ACCEPT — HIGH PRIORITY",
"reason": "Exceptional field experience data. Real numbers (-0.8dB shortfall, 1.8km, 1.2dB effective budget after FEC margin), root cause identified (accumulated connector loss), resolution given (APC). This is exactly the kind of content Blog_LLM should emulate — practitioner voice, specific failure mode, actionable fix.",
"blog_worthy_angles": [
"\"Why your 400G-FR4 link keeps going down: the connector budget trap\"",
"\"APC vs UPC: when it actually matters for 400G deployments\"",
"\"The real-world optical budget of 400G-FR4 is 1.2dB, not 1.9dB — here's why\"",
"\"Multi-site 400G migration lessons: what the spec sheets don't tell you\""
],
"key_claims": [
"FR4 effective budget after FEC: ~1.2dB (not spec 1.9dB)",
"3 connector events accumulated = ~0.8dB loss",
"APC connectors resolved budget shortfall at 1.8km",
"Field experience across 6 production sites"
],
"tip_db_enrichment": "Add field note to 400GBASE-FR4 record: real-world budget constraint with UPC connectors"
}`
));
rows.push(row(
"What makes a piece of optical networking content IDEAL training data for Blog_LLM? Define the scoring rubric.",
`BLOG_LLM TRAINING DATA QUALITY RUBRIC
SCORING DIMENSIONS (each 0-10):
1. TECHNICAL DEPTH (weight: 25%)
10 — Layer-2 through optical physics, FEC, DSP details, OSNR calculations
7 — Protocol-level, standards citations, configuration examples
4 — High-level technology overview with some specifics
1 — Marketing summary, no technical substance
2. FACTUAL DENSITY (weight: 25%)
10 — Every claim has a number, standard ref, or vendor source
7 — Most claims supported, some general statements
4 — Mix of facts and assertions
1 — Assertions only, no verifiable data
3. RECENCY (weight: 15%)
10 — Published < 6 months, technology referenced is current market
7 — 6-18 months, still relevant
4 — 18-36 months, partially outdated
1 — > 3 years, likely stale spec data
4. UNIQUENESS (weight: 20%)
10 — First-hand field experience, proprietary test results, novel analysis
7 — Unique synthesis of public information, clear POV
4 — Rehash of standard vendor content with some added context
1 — Direct copy of datasheet or press release
5. WRITING QUALITY (weight: 15%)
10 — Practitioner voice, opinionated, specific, scannable
7 — Clear, accurate, structured
4 — Dense or overly formal, hard to adapt
1 — SEO soup, keyword stuffing, or machine-generated filler
ACCEPT THRESHOLD: Weighted score ≥ 6.0
REVIEW THRESHOLD: 4.0 - 5.9 (human review before ingestion)
REJECT THRESHOLD: < 4.0
FAST REJECT SIGNALS:
• Contains phrases: "cutting-edge", "revolutionary", "game-changing" without substance
• No numbers at all
• Identical to vendor press release
• > 3 years old with no evergreen value
• Covers topic already in corpus with lower uniqueness score`
));
rows.push(row(
"Extract blog-worthy claims and angles from this market data: '400G transceiver ASP fell 38% YoY in Q4 2023 to $312 average. SR4 variant leads at 47% volume share. QSFP-DD overtook CFP2 for first time in Q3. AI/ML workload deployments drove 67% of new 400G orders.'",
`BLOG_LLM DATA EXTRACTION — Market Data Q4 2023
KEY CLAIMS (high factual density, all cite-worthy):
1. 400G ASP: $312 average (Q4 2023), -38% YoY
2. SR4: 47% volume share of 400G market
3. QSFP-DD > CFP2 in shipments — first time, Q3 2023
4. AI/ML: 67% of new 400G order volume
RECOMMENDED BLOG ANGLES:
┌──────────────────────────────────────────────────────────────────┬────────┐
│ Angle │ Score │
├──────────────────────────────────────────────────────────────────┼────────┤
│ "When to buy 400G: price curve just crossed the 100G 2019 floor" │ 9/10 │
│ "QSFP-DD has won — what this means for your switch budget" │ 8/10 │
│ "67% of 400G is AI: the infrastructure play nobody is writing" │ 9/10 │
│ "SR4 at 47% — why short-reach still dominates 400G" │ 7/10 │
│ "400G price drop timeline: from $2,800 to $312 in 4 years" │ 8/10 │
└──────────────────────────────────────────────────────────────────┴────────┘
EVALUATION: ACCEPT (score 8.5/10) — Pure numeric market data. Extremely high factual density. Ideal for anchoring procurement-advice and market-timing articles. Combine with Hype Cycle position for CAP-5 analysis.
TIP_DB NOTE: Update hype_cycle entries for QSFP-DD (Slope of Enlightenment, score +5), CFP2 (Legacy Decline).`
));
// ─────────────────────────────────────────────────────────────────────────────
// CAP-4: CRAWLER / SCRAPER / ROBOT DESIGN
// ─────────────────────────────────────────────────────────────────────────────
rows.push(row(
"Design a Crawlee TypeScript crawler for fs.com to extract all 100G transceiver product listings. Include selectors, pagination, and rate limiting.",
`// Crawlee TypeScript Crawler — fs.com 100G Transceivers
// Target: https://www.fs.com/products/optical-transceivers.html?speed=100G
// Output: JSONL with part_number, price, specs, stock_status
import { PlaywrightCrawler, Dataset } from 'crawlee';
const crawler = new PlaywrightCrawler({
maxRequestsPerCrawl: 500,
maxConcurrency: 2, // fs.com rate-limit friendly
requestHandlerTimeoutSecs: 60,
async requestHandler({ page, request, enqueueLinks, log }) {
const url = request.url;
// --- Product List Page ---
if (url.includes('/products/')) {
// Extract product cards
const products = await page.evaluate(() => {
return Array.from(document.querySelectorAll('.product-item')).map(el => ({
name: el.querySelector('.product-title')?.textContent?.trim(),
part_number: el.querySelector('.product-sku')?.textContent?.trim(),
price_usd: el.querySelector('.price')?.textContent?.replace(/[^0-9.]/g, ''),
detail_url: el.querySelector('a.product-link')?.getAttribute('href'),
in_stock: !el.querySelector('.out-of-stock'),
}));
});
for (const p of products) {
if (p.detail_url) {
await enqueueLinks({
urls: [new URL(p.detail_url, 'https://www.fs.com').href],
label: 'DETAIL',
});
}
}
// Pagination
const nextPage = await page.$('a.next-page');
if (nextPage) await enqueueLinks({ selector: 'a.next-page' });
}
// --- Product Detail Page ---
if (request.label === 'DETAIL') {
const data = await page.evaluate(() => {
const spec = (label: string) => {
const rows = document.querySelectorAll('.spec-table tr');
for (const row of rows) {
if (row.querySelector('th')?.textContent?.includes(label)) {
return row.querySelector('td')?.textContent?.trim();
}
}
return null;
};
return {
part_number: document.querySelector('[itemprop="sku"]')?.textContent?.trim(),
name: document.querySelector('h1.product-name')?.textContent?.trim(),
price_usd: document.querySelector('[itemprop="price"]')?.getAttribute('content'),
form_factor: spec('Form Factor'),
speed: spec('Data Rate'),
fiber_type: spec('Fiber Type'),
reach: spec('Max Distance'),
connector: spec('Connector Type'),
wavelength: spec('Wavelength'),
dom: spec('DDM/DOM'),
temp_range: spec('Operating Temperature'),
power_w: spec('Power Consumption'),
compatible_with: spec('Compatible With'),
};
});
await Dataset.pushData({ ...data, source_url: request.url, scraped_at: new Date().toISOString() });
log.info(\`Scraped: \${data.part_number}\`);
}
},
// Polite crawling
async failedRequestHandler({ request, log }) {
log.error(\`Failed: \${request.url}\`);
},
});
// Seed URLs
await crawler.addRequests([
'https://www.fs.com/products/optical-transceivers.html?speed=100G&page=1',
'https://www.fs.com/products/optical-transceivers.html?speed=100G&type=QSFP28',
]);
await crawler.run();
console.log('Crawl complete. Export with: await Dataset.exportToJSON("fs-100g-transceivers");');
// LEGAL NOTE: fs.com ToS permits price checking for competitive research.
// Rate limit: 2 concurrent, 2s between requests. Respect robots.txt.`
));
rows.push(row(
"Write a lightweight Node.js fetch+cheerio scraper to extract transceiver compatibility data from a Cisco documentation page.",
`// Cisco Transceiver Compatibility Scraper — fetch + cheerio
// Target: https://www.cisco.com/c/en/us/td/docs/interfaces_modules/transceiver_modules/compatibility/matrix/
// No Playwright needed — Cisco docs are server-rendered HTML
import fetch from 'node-fetch';
import * as cheerio from 'cheerio';
import { createWriteStream } from 'fs';
import { pipeline } from 'stream/promises';
const TARGETS = [
'https://www.cisco.com/c/en/us/td/docs/interfaces_modules/transceiver_modules/compatibility/matrix/100GE_comp_matrix.html',
'https://www.cisco.com/c/en/us/td/docs/interfaces_modules/transceiver_modules/compatibility/matrix/400GE_comp_matrix.html',
];
async function scrapeCiscoMatrix(url: string): Promise<object[]> {
const res = await fetch(url, {
headers: {
'User-Agent': 'Mozilla/5.0 (compatible; TIP-Research-Bot/1.0; +https://transceiver-db.context-x.org/robots)',
'Accept': 'text/html',
},
});
if (!res.ok) throw new Error(\`HTTP \${res.status} on \${url}\`);
const html = await res.text();
const $ = cheerio.load(html);
const rows: object[] = [];
// Cisco matrix tables: thead with module name, tbody with switch compatibility
$('table.confluenceTable').each((_i, table) => {
const headers: string[] = [];
$(table).find('thead th').each((_j, th) => headers.push($(th).text().trim()));
$(table).find('tbody tr').each((_j, tr) => {
const cells = $(tr).find('td').map((_k, td) => $(td).text().trim()).get();
if (cells.length === 0) return;
const rowData: Record<string, string> = {};
headers.forEach((h, idx) => { rowData[h] = cells[idx] || ''; });
// Extract module PID and switch compatibility
if (rowData['Module'] || rowData['PID']) {
rows.push({
module_pid: rowData['Module'] || rowData['PID'],
switch_model: rowData['Platform'] || rowData['Switch'],
min_ios_xe: rowData['Minimum IOS XE'] || rowData['Min SW'],
min_nx_os: rowData['Minimum NX-OS'] || rowData['Min NX-OS'],
supported: !cells.some(c => c.toLowerCase().includes('not supported')),
source_url: url,
scraped_at: new Date().toISOString(),
});
}
});
});
return rows;
}
// Run scraper
const allData: object[] = [];
for (const url of TARGETS) {
console.log(\`Scraping: \${url}\`);
const data = await scrapeCiscoMatrix(url);
allData.push(...data);
await new Promise(r => setTimeout(r, 2000)); // 2s polite delay
}
// Output JSONL
const out = createWriteStream('cisco-compatibility.jsonl');
for (const row of allData) out.write(JSON.stringify(row) + '\\n');
out.end();
console.log(\`Done: \${allData.length} compatibility records written\`);`
));
rows.push(row(
"Design a scraping strategy for extracting real-time transceiver pricing from multiple competitor sites. Include deduplication and scheduling.",
`MULTI-VENDOR PRICE INTELLIGENCE CRAWLER STRATEGY
ARCHITECTURE:
┌─────────────────────────────────────────────────────────────────┐
│ Crawler Scheduler (cron: every 6h) │
│ ├── Crawlee Actor Pool (max 4 concurrent actors) │
│ │ ├── fs.com (HTTP + cheerio, 500 products) │
│ │ ├── flexoptix.net (HTTP + cheerio, 800 products) │
│ │ ├── cablesandkits.com (Playwright, JS-rendered) │
│ │ └── wiitek.com (Playwright, infinite scroll) │
│ └── Price Dedup Engine (part_number + vendor hash) │
│ └── PostgreSQL price_history table (TimescaleDB) │
└─────────────────────────────────────────────────────────────────┘
NORMALIZATION SCHEMA (PostgreSQL):
CREATE TABLE price_observations (
id BIGSERIAL PRIMARY KEY,
part_number TEXT NOT NULL, -- normalized, uppercase, stripped
vendor_slug TEXT NOT NULL,
price_usd NUMERIC(10,2),
currency CHAR(3) DEFAULT 'USD',
in_stock BOOLEAN,
url TEXT,
scraped_at TIMESTAMPTZ DEFAULT NOW(),
raw_text TEXT -- original price string for audit
);
CREATE INDEX ON price_observations (part_number, vendor_slug, scraped_at DESC);
DEDUPLICATION LOGIC (TypeScript):
function normalizePart(raw: string): string {
return raw.toUpperCase()
.replace(/[^A-Z0-9-]/g, '') // strip spaces, dots, slashes
.replace(/-+/g, '-')
.trim();
}
// Only insert if price changed > 0.5% or > 4h since last observation
async function shouldInsert(part: string, vendor: string, newPrice: number): Promise<boolean> {
const last = await db.query(
'SELECT price_usd, scraped_at FROM price_observations WHERE part_number=$1 AND vendor_slug=$2 ORDER BY scraped_at DESC LIMIT 1',
[part, vendor]
);
if (!last.rows.length) return true;
const { price_usd, scraped_at } = last.rows[0];
const pctChange = Math.abs(newPrice - price_usd) / price_usd;
const ageH = (Date.now() - new Date(scraped_at).getTime()) / 3600000;
return pctChange > 0.005 || ageH > 4;
}
RATE LIMIT POLICY:
• fs.com: 1 req/2s, max 3 concurrent
• flexoptix.net: 1 req/1s, up to 5 concurrent (they know us)
• Unknown sites: 1 req/5s, 1 concurrent, Honor Crawl-Delay in robots.txt
• User-Agent: "TIP-PriceBot/1.0 (+https://transceiver-db.context-x.org/bot)"
ROBOTS.TXT COMPLIANCE: Parse and respect before any crawl. Log disallowed paths.`
));
rows.push(row(
"Write a Crawlee robot that extracts all SFP+ transceivers from a vendor site that uses infinite scroll (JavaScript-rendered, no pagination links).",
`// Infinite Scroll Transceiver Crawler — Crawlee + Playwright
// Handles JS-rendered product pages with no traditional pagination
import { PlaywrightCrawler, Dataset, log } from 'crawlee';
const crawler = new PlaywrightCrawler({
maxConcurrency: 1, // infinite scroll requires single-page session
requestHandlerTimeoutSecs: 300,
async requestHandler({ page, request }) {
log.info(\`Loading: \${request.url}\`);
// Wait for initial product grid
await page.waitForSelector('.product-grid .product-card', { timeout: 15000 });
const allProducts = new Map<string, object>();
let prevCount = 0;
let stallCount = 0;
const MAX_STALLS = 3;
while (stallCount < MAX_STALLS) {
// Extract currently visible products
const products = await page.evaluate(() =>
Array.from(document.querySelectorAll('.product-card')).map(el => ({
sku: el.getAttribute('data-sku') || el.querySelector('[data-sku]')?.getAttribute('data-sku'),
name: el.querySelector('.product-name')?.textContent?.trim(),
price: el.querySelector('.price-value')?.textContent?.replace(/[^0-9.]/g,''),
form_factor: el.querySelector('[data-form-factor]')?.getAttribute('data-form-factor'),
speed: el.querySelector('[data-speed]')?.getAttribute('data-speed'),
href: el.querySelector('a')?.href,
}))
);
// Deduplicate by SKU
for (const p of products) {
if (p.sku) allProducts.set(p.sku, p);
}
const currentCount = allProducts.size;
log.info(\`Products found: \${currentCount}\`);
if (currentCount === prevCount) {
stallCount++;
} else {
stallCount = 0;
prevCount = currentCount;
}
// Scroll to bottom to trigger next batch
await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
await page.waitForTimeout(2500); // wait for lazy-load
// Check for "Load More" button as fallback trigger
const loadMore = page.locator('button:has-text("Load More"), button:has-text("Show More")');
if (await loadMore.isVisible()) {
await loadMore.click();
await page.waitForTimeout(2000);
stallCount = 0;
}
}
log.info(\`Scroll complete. Total unique products: \${allProducts.size}\`);
for (const product of allProducts.values()) {
await Dataset.pushData({
...product,
source_url: request.url,
scraped_at: new Date().toISOString(),
});
}
},
});
await crawler.addRequests([{ url: 'https://TARGET-VENDOR.com/transceivers?form_factor=SFP%2B', label: 'LIST' }]);
await crawler.run();
const { items } = await Dataset.getData();
log.info(\`Exported \${items.length} SFP+ transceivers\`);`
));
rows.push(row(
"Design a sitemap-based crawler strategy to discover all transceiver product pages on a large vendor website (10,000+ pages).",
`SITEMAP-FIRST CRAWL STRATEGY — Large Vendor Sites (10k+ pages)
PHASE 1 — SITEMAP DISCOVERY (run once, ~30 seconds):
1. Fetch robots.txt → extract Sitemap: directives
2. If none found, try: /sitemap.xml, /sitemap_index.xml, /product-sitemap.xml
3. Parse sitemap index → enumerate all child sitemaps
4. Filter sitemaps by keyword: "product", "transceiver", "optic", "sfp", "qsfp"
IMPLEMENTATION:
import { SitemapCrawler } from 'crawlee';
const urls: string[] = [];
const sitemap = await fetch('https://vendor.com/sitemap_index.xml').then(r => r.text());
// Parse with fast-xml-parser — extract all <loc> from product sitemaps
// Filter: url.includes('/transceivers/') || url.includes('/optical-modules/')
PHASE 2 — URL CLASSIFICATION (before crawling):
Classify by URL pattern:
• /products/sfp-*/ → PRODUCT_DETAIL (crawl)
• /category/*/ → LIST_PAGE (crawl + follow)
• /blog/*/ → BLOG (separate queue)
• /download/*/ → SKIP (PDFs = Docling pipeline)
• /search?*/ → SKIP (dynamic, low signal)
PHASE 3 — PRIORITY QUEUE:
Priority 1: /products/100g/* and /products/400g/* (high value)
Priority 2: /products/40g/* and /products/25g/*
Priority 3: /products/10g/*
Priority 4: /products/copper/* and /products/dac/*
PHASE 4 — INCREMENTAL UPDATES:
• Store all scraped URLs + etag/last-modified headers in DB
• Re-crawl only when: last-modified changed OR > 72h since last scrape
• Priority re-crawl on price change detection (monitor price feed diff)
SCHEDULING:
• Full sitemap rediscovery: weekly
• Priority products (100G/400G/800G): every 6h
• Full catalog refresh: every 48h
• New URL detection: daily diff of sitemap vs DB
TOOLS: Crawlee + TimescaleDB + BullMQ (job queue) + Cloudflare R2 (raw HTML archive)`
));
rows.push(row(
"Build a data extraction schema (JSON schema) for scraping transceiver product pages consistently across different vendor websites.",
`// Universal Transceiver Extraction Schema + Extractor Config
// Used by Crawlee actors to normalize diverse vendor page structures
export const EXTRACTION_SCHEMA = {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"required": ["part_number", "vendor_slug", "source_url"],
"properties": {
"part_number": { "type": "string" },
"name": { "type": "string" },
"vendor_slug": { "type": "string" },
"form_factor": { "type": "string", "enum": ["SFP","SFP+","SFP28","QSFP+","QSFP28","QSFP-DD","OSFP","CFP2","CFP4","DAC","AOC"] },
"speed_gbps": { "type": "number" },
"protocol": { "type": "string" },
"fiber_type": { "type": "string", "enum": ["SMF","MMF","DAC","AOC","Copper"] },
"reach_label": { "type": "string" },
"reach_m": { "type": ["number","null"] },
"connector": { "type": "string" },
"wavelength_nm": { "type": ["number","array","null"] },
"price_usd": { "type": ["number","null"] },
"in_stock": { "type": "boolean" },
"dom_support": { "type": ["boolean","null"] },
"power_w": { "type": ["number","null"] },
"temp_range": { "type": ["string","null"] },
"datasheet_url": { "type": ["string","null"] },
"source_url": { "type": "string" },
"scraped_at": { "type": "string", "format": "date-time" },
"raw_specs": { "type": "object" }
}
};
// Vendor-specific CSS selector maps
export const VENDOR_SELECTORS: Record<string, Record<string, string>> = {
"fs.com": {
part_number: "span.part-number, [data-sku]",
price_usd: "[itemprop='price'], .price-value",
form_factor: "td[data-label='Form Factor']",
reach: "td[data-label='Max. Transfer Distance']",
fiber_type: "td[data-label='Fiber Type']",
connector: "td[data-label='Connector Type']",
dom_support: "td[data-label='DDM/DOM']",
},
"flexoptix.net": {
part_number: ".article-number",
price_usd: ".price-amount",
form_factor: "[data-attribute='form_factor'] .attribute-value",
reach: "[data-attribute='distance'] .attribute-value",
fiber_type: "[data-attribute='fiber_type'] .attribute-value",
},
"generic": {
part_number: "[itemprop='sku'], .sku, .part-number, h1",
price_usd: "[itemprop='price'], .price, .product-price",
form_factor: "td:contains('Form Factor') + td, .spec-form-factor",
reach: "td:contains('Distance') + td, td:contains('Reach') + td",
}
};`
));
// ─────────────────────────────────────────────────────────────────────────────
// CAP-5: HYPE CYCLE CALCULATION
// ─────────────────────────────────────────────────────────────────────────────
rows.push(row(
"Calculate the Gartner Hype Cycle position for 400GBASE-DR4 transceivers using current market data. Include Norton-Bass model parameters.",
`HYPE CYCLE ANALYSIS — 400GBASE-DR4
INPUT SIGNALS:
• IEEE 802.3bs ratified: 2017
• First commercial shipments: Q1 2019
• Current year: 2024 (7 years post-ratification)
• Price trend: $2,800 (2019) → $320 (2024) — 89% decline
• Market share in 400G: 31% volume (DR4 + DR4+)
• Vendor count: 15+ production vendors
• Switch support: Universal (all Tomahawk3/4, Spectrum-3/4 platforms)
• Standards maturity: Stable (IEEE 802.3bs fully implemented)
• Google/Meta/AWS adoption: Confirmed production deployments
• Analyst coverage: Declining (no longer "hot topic")
• LinkedIn buzz score: Low (technology assumed, not discussed)
NORTON-BASS MODEL:
p (innovation coefficient) = 0.03 -- DR4 is derivative of DR1, faster adoption
q (imitation coefficient) = 0.38 -- strong network effects, clear ROI proof
M (market potential) = 100% -- every 400G spine-leaf uses some DR4
t (years since trigger) = 7
Adoption(t) = M × [1 - e^(-(p+q)t)] / [1 + (q/p) × e^(-(p+q)t)]
Adoption(7) ≈ 68% of addressable market → past Peak, into Enlightenment
RESULT:
{
"technology": "400GBASE-DR4",
"hype_phase": "SLOPE_OF_ENLIGHTENMENT",
"position_pct": 55,
"bass_adoption_pct": 68,
"price_decline_pct": 89,
"buy_signal": "BUY_NOW",
"signal_color": "#2d6a4f",
"signal_label": "Preis-Leistungs-Optimum — jetzt kaufen",
"12m_forecast": "Further 15-20% ASP decline. Volume grows 25% as 200G/100G links upgrade.",
"24m_forecast": "DR4 enters Plateau of Productivity. Price stabilizes ~$220. QSFP-DD800 pressure begins.",
"confidence": "HIGH",
"rationale": "Classic post-Peak adoption: price rapidly declining, proven technology, no innovation risk, clear ROI. Best time to standardize."
}`
));
rows.push(row(
"Where is 800G OSFP on the Hype Cycle? What signals point to its current phase?",
`HYPE CYCLE ANALYSIS — 800G OSFP
SIGNAL ANALYSIS:
┌─────────────────────────────────┬────────────────────────────────────┐
│ Signal │ Observation │
├─────────────────────────────────┼────────────────────────────────────┤
│ First announcement │ OSFP MSA published 2019 │
│ First 800G OSFP products │ Q2 2022 (limited EA) │
│ Press coverage │ HIGH — Cisco Live, Hot Chips 2023 │
│ Analyst reports │ Multiple overly optimistic (2022) │
│ Production deployments │ Limited — Meta AI, Google internal │
│ Vendor count (800G OSFP) │ 6-8 vendors, not all production │
│ Switch availability │ 3-4 platforms, premium pricing │
│ Price trend │ Still $1,200-2,000 (not declining) │
│ LinkedIn/Twitter buzz │ VERY HIGH — "800G is the future" │
│ Customer POC reports │ Mixed — power/cooling challenges │
│ Standards maturity │ Partially standardised (IEEE 800G) │
└─────────────────────────────────┴────────────────────────────────────┘
NORTON-BASS:
p = 0.025 (slower — new form factor, new power envelope)
q = 0.28 (moderate — AI tailwind but niche application)
t = 2 years production
Adoption(2) ≈ 12% — early majority not yet engaged
RESULT:
{
"technology": "800G OSFP",
"hype_phase": "PEAK_OF_INFLATED_EXPECTATIONS",
"position_pct": 28,
"bass_adoption_pct": 12,
"buy_signal": "WAIT",
"signal_color": "#FF8100",
"signal_label": "Hoher Hype, hohe Preise — warten",
"12m_forecast": "Price begins declining as Coherent 4Q2025 shipments scale. Watch for Nvidia SN5600 production ramp as proxy.",
"24m_forecast": "Trough ~2025-2026 as 1.6T hype displaces 800G narrative. Entry point for volume buyers.",
"risk": "Power/cooling constraints may limit deployments to AI-native facilities. Evaluate if your cooling supports >14W/port.",
"confidence": "MEDIUM-HIGH"
}`
));
rows.push(row(
"How do you calculate the Innovation Trigger probability score for a new optical technology announcement? Walk through the model.",
`INNOVATION TRIGGER SCORING MODEL
PURPOSE: Quantify whether a newly announced optical technology is a genuine Innovation Trigger on the Hype Cycle or premature hype.
INPUT VARIABLES (0-10 each):
1. STANDARDS_MATURITY — Is there an IEEE/OIF/MSA spec? (0=none, 10=ratified)
2. PROTOTYPE_EVIDENCE — Lab demo or production sample exists? (0=paper, 10=production sample)
3. VENDOR_COUNT — How many vendors announced? (0=one vendor, 10=5+ independent)
4. HYPERSCALER_SIGNAL — Has Google/Meta/AWS/Microsoft expressed interest? (0=no, 10=procurement)
5. PRICE_CLARITY — Is there a realistic price estimate? (0=none, 10=firm quote available)
6. TIMING_FEASIBILITY — How realistic is the stated GA date? (0=vaporware, 10=imminent)
7. PREDECESSOR_PROVEN — Does it build on proven prior generation? (0=entirely new, 10=direct evolution)
FORMULA:
trigger_score = (STANDARDS_MATURITY × 0.20) +
(PROTOTYPE_EVIDENCE × 0.20) +
(VENDOR_COUNT × 0.15) +
(HYPERSCALER_SIGNAL × 0.15) +
(PRICE_CLARITY × 0.10) +
(TIMING_FEASIBILITY × 0.10) +
(PREDECESSOR_PROVEN × 0.10)
THRESHOLDS:
• 7.5-10.0: True Innovation Trigger — technology is real, timeline credible
• 5.0-7.4: Emerging Technology — monitor, early pilots possible
• 2.5-4.9: Hype Risk Zone — announcements ahead of readiness
• 0.0-2.4: Paper Tiger — insufficient evidence, ignore until proof
EXAMPLE — 1.6T DR16 (2024):
standards=4, prototype=5, vendors=3, hyperscaler=4, price=2, timing=4, predecessor=7
Score = (4×0.20)+(5×0.20)+(3×0.15)+(4×0.15)+(2×0.10)+(4×0.10)+(7×0.10)
= 0.8 + 1.0 + 0.45 + 0.60 + 0.20 + 0.40 + 0.70 = 4.15
→ "Hype Risk Zone — real technology, premature timeline expectations"`
));
rows.push(row(
"Generate a full Hype Cycle status report for the top 8 optical networking technologies as of 2024, with buy signals.",
`OPTICAL NETWORKING HYPE CYCLE STATUS REPORT — 2024
┌──────────────────────┬───────────────────────────┬──────┬────────────┬─────────────────────────────────┐
│ Technology │ Phase │ Pos% │ Buy Signal │ Action │
├──────────────────────┼───────────────────────────┼──────┼────────────┼─────────────────────────────────┤
│ SFP+ 10G │ Plateau of Productivity │ 92 │ BUY_NOW │ Lowest prices ever — stock up │
│ QSFP28 100G SR4/LR4 │ Plateau of Productivity │ 80 │ BUY_NOW │ Price/performance sweet spot │
│ 400G QSFP-DD DR4/FR4 │ Slope of Enlightenment │ 55 │ BUY_NOW │ Prices still falling, proven │
│ 400G QSFP-DD 400ZR │ Slope of Enlightenment │ 48 │ CONSIDER │ Coherent maturing, wait Q3 2024 │
│ 800G QSFP-DD SR8/DR8 │ Peak of Inflated Expects │ 30 │ WAIT │ Prices 2× fair value, wait │
│ 800G ZR+ │ Peak of Inflated Expects │ 25 │ WAIT │ Limited availability, $8k+ │
│ 1.6T OSFP112/DR16 │ Innovation Trigger │ 8 │ HOLD │ 2025-2026 GA at best │
│ Co-packaged Optics │ Innovation Trigger │ 4 │ HOLD │ Still pre-production silicon │
└──────────────────────┴───────────────────────────┴──────┴────────────┴─────────────────────────────────┘
DETAILED NOTES:
QSFP28 100G (pos=80, BUY_NOW):
Street prices: SR4 $65-80, LR4 $130-160. Bass adoption >85%. No technical risk. OEM vs compatible delta at maximum — ideal compatible module window. Recommend 2-year procurement.
400G QSFP-DD (pos=55, BUY_NOW):
DR4 street: $280-350. FR4: $380-480. Proven in hyperscaler, multiple switch platforms. Price declining 20%/year. Lock in Q3 2024 before AI-driven demand spike.
800G (pos=28-30, WAIT):
Genuine technology, premature price point. Wait for Nvidia SN5600 GA ramp (Q4 2024) as demand signal. Budget now, purchase Q1-Q2 2025.
Co-packaged Optics (pos=4, HOLD):
Still 2-3 years from production. Intel/Broadcom/Nvidia all targeting 2026. Monitor Lightwave Summit announcements. Not investment-ready.
METHODOLOGY: Norton-Bass diffusion model (p,q per technology), weighted by price decay curve, standards maturity, and hyperscaler deployment signals.`
));
rows.push(row(
"A customer asks: 'Should I buy 400G transceivers now or wait for prices to fall further?' How should TIP_LLM answer using Hype Cycle data?",
`BUYING TIMING ANALYSIS — 400G Transceivers
SHORT ANSWER: Buy 400G now for planned deployments. Waiting >12 months yields diminishing returns relative to deployment cost.
DETAILED ANALYSIS:
PRICE TRAJECTORY (400G QSFP-DD DR4):
2019: $2,800 → 2021: $1,200 → 2023: $380 → 2024: $300 → 2025 est: $240 → 2026 est: $200
ANNUAL DECLINE RATE: ~22% per year (slowing — approaching floor)
REMAINING SAVINGS BY DELAY:
• Wait 6 months: save ~$33/module (11%)
• Wait 12 months: save ~$60/module (20%)
• Wait 24 months: save ~$100/module (33%)
vs. COST OF DELAY:
• 48-port deployment deferred 12 months: $60 × 48 = $2,880 saved
• But: 12 months of limited capacity, engineering workarounds, potentially missed SLA
• Engineer-hours managing temporary solutions: typically > $2,880
HYPE CYCLE POSITION: Slope of Enlightenment (pos=55)
→ This is historically the BEST buying window:
- Technology is proven (no failure risk)
- Early adopter premium is gone
- Prices still declining but at diminishing rate
- Supply is ample (multiple competing vendors)
- Support ecosystem mature
RECOMMENDATION:
• Procure 400G for confirmed deployments in next 18 months: YES, buy now
• Strategic stockpile beyond 18 months: NO, prices will fall further
• Emergency/tactical 400G: Always buy compatible (Flexoptix) — 65% savings vs OEM with same specs
• 800G planning: Earmark budget but defer 12-18 months — still in Peak phase`
));
// ─────────────────────────────────────────────────────────────────────────────
// Additional cross-capability examples
// ─────────────────────────────────────────────────────────────────────────────
rows.push(row(
"TIP_LLM, research and fully populate a database entry for the Lumentum QSFP-DD 400G-ZR coherent transceiver.",
`{
"part_number": "QSFP-DD-400G-ZR-C",
"vendor": "Lumentum",
"form_factor": "QSFP-DD",
"protocol": "400ZR",
"standard": "OIF 400ZR Implementation Agreement",
"speed_gbps": 400,
"channels": 1,
"modulation": "DP-16QAM",
"coherent": true,
"fiber_type": "SMF OS2",
"reach_km": 80,
"reach_label": "ZR",
"connector": "LC duplex",
"wavelength": "C-band tunable, 75GHz ITU-T grid",
"tx_power_dbm": { "min": -10, "max": 0 },
"rx_sensitivity_dbm": -22,
"osnr_required_db": 22.5,
"power_consumption_w": { "typical": 14, "max": 15 },
"dom_support": true,
"cmis_version": "CMIS 4.0",
"temperature_range_c": "0 to 70",
"fec": "oFEC (concatenated)",
"line_rate_gbps": 478.75,
"net_data_rate_gbps": 400,
"use_cases": [
"Metro DWDM replacement (up to 80km, no inline amplifiers)",
"DCI (Data Center Interconnect) direct connect",
"Open ROADM integration (OpenConfig)",
"ROADM bypass for campus/metro spans"
],
"compatible_switches": [
"Cisco NCS 1010 (native ZR host)",
"Arista 7130 (coherent uplink)",
"Juniper PTX10001-36MR",
"Any router/switch with QSFP-DD slot + CMIS 4.0 + ZR DSP awareness"
],
"price_usd": 4800,
"price_trend": "Declining — $8,000 (2021) → $4,800 (2024) → ~$3,500 est (2025)",
"hype_cycle_position": "SLOPE_OF_ENLIGHTENMENT (pos=48)",
"buy_signal": "CONSIDER — technology proven, prices still elevated vs long-term target",
"notes": "Requires host platform with ZR DSP awareness. Not plug-and-play on all QSFP-DD switches — confirm CMIS 4.0 + ZR mode support with switch vendor.",
"data_confidence": "HIGH",
"sources": ["OIF-400ZR-01.0 spec", "Lumentum datasheet 2024-Q1"]
}`
));
rows.push(row(
"Design a complete automated pipeline: crawl new transceiver announcements → extract data → evaluate for blog → update TIP database → update Hype Cycle.",
`AUTOMATED TIP INTELLIGENCE PIPELINE
OVERVIEW:
[Sources] → [Crawl] → [Extract] → [Evaluate] → [DB Update] → [Hype Cycle Update] → [Blog Trigger]
STAGE 1 — SOURCE MONITORING (every 6h):
• Crawlee monitor actors watching:
- vendor press pages (Lumentum, Coherent, II-VI, Inphi)
- IEEE 802.3 workgroup announcements
- OIF press releases
- Light Reading, Fierce Telecom, Data Center Knowledge RSS feeds
- NANOG mailing list digest
• Change detection: SHA256 of relevant page sections vs stored hash
STAGE 2 — EXTRACTION (per new page):
• Playwright actor: extract product specs using EXTRACTION_SCHEMA
• LLM parsing (TIP_LLM CAP-1): "Extract all transceiver specs from this page as JSON"
• Confidence scoring: HIGH/MEDIUM/LOW based on field completeness
STAGE 3 — BLOG EVALUATION (CAP-3):
• TIP_LLM scores content: technical_depth, factual_density, recency, uniqueness
• Score ≥ 6.0 → queue for Blog_LLM article generation
• Extracted claims → Blog_LLM context injection
STAGE 4 — DATABASE UPDATE:
• New part: INSERT into transceivers (status=pending_review)
• Existing part: UPDATE changed fields, log in audit table
• Price update: INSERT into price_observations (TimescaleDB)
• Flag significant price moves (>5%) for procurement alert
STAGE 5 — HYPE CYCLE RECALCULATION (CAP-5):
• On new technology announcement: recalculate trigger_score
• On adoption milestone (vendor count, price threshold): advance phase
• On pricing data: update Bass model parameters
• Output: updated hype_cycle table entry + dashboard notification
STAGE 6 — BLOG TRIGGER:
• If blog_worthy_score ≥ 7.0: POST to /api/blog/generate with topic + context
• Blog_LLM generates draft → status=draft → human review queue
• Auto-publish if confidence=HIGH and human reviewed within 48h
TECH STACK: Crawlee + BullMQ + PostgreSQL/TimescaleDB + TIP_LLM (CAP-1,3,5) + Blog_LLM`
));
rows.push(row(
"A crawl finds a transceiver with part number 'FTLX8574D3BNL'. It's not in the TIP database. What should TIP_LLM do?",
`NEW PART RESOLUTION WORKFLOW — FTLX8574D3BNL
STEP 1 — DECODE PART NUMBER:
• Prefix FTLX = Finisar/II-VI
• 8574 → 10G SFP+ family (857x = 850nm SR/SR+ family)
• D = Duplex LC
• 3 = Extended temperature (-40 to 85°C)
• B = Digital Diagnostic Monitoring
• NL = RoHS compliant, OEM variant (vs BCL = Cisco-coded variant)
STEP 2 — MAP TO BASE SPEC:
• FTLX8574D3BCL is the Cisco-coded version — already in many compatibility lists
• BNL = generic / non-Cisco coded → different EEPROM vendor ID
• Both are functionally identical: 10GBASE-SR, OM3 300m / OM4 400m, 850nm, LC duplex, DDM
STEP 3 — RESEARCH ENRICHMENT:
Fetch: https://ii-vi.com/product/ftlx8574d3bnl/
Extract: All spec fields → populate EXTRACTION_SCHEMA
Cross-reference: IEEE 802.3ae, SFF-8431
STEP 4 — DATABASE ACTION:
INSERT INTO transceivers:
part_number = 'FTLX8574D3BNL'
vendor_name = 'II-VI (Finisar)'
form_factor = 'SFP+'
speed_gbps = 10
protocol = '10GBASE-SR'
fiber_type = 'MMF'
reach_label = 'SR'
reach_m = 400
connector_type = 'LC'
wavelength_nm = 850
dom_support = true
temperature_range = '-40 to 85°C'
temp_type = 'industrial'
notes = 'Extended temp variant of FTLX8574D3BCL. Generic EEPROM (no vendor lock). RoHS compliant.'
data_confidence = 'HIGH'
status = 'active'
STEP 5 — LINK RELATIONSHIPS:
Add to compatibility matrix: same switches as FTLX8574D3BCL
Add vendor_equivalents: FTLX8574D3BCL (Cisco OEM coded), FP-SFP+-SR (Flexoptix compatible)`
));
// ─────────────────────────────────────────────────────────────────────────────
// Write output
// ─────────────────────────────────────────────────────────────────────────────
const outDir = "training-data";
mkdirSync(outDir, { recursive: true });
const outPath = join(outDir, "tip-llm-capabilities-v1.jsonl");
writeFileSync(outPath, rows.map(r => JSON.stringify(r)).join("\n") + "\n");
console.log(`✓ Written ${rows.length} training pairs to ${outPath}`);
console.log(` CAP-1 Transceiver Research: ${rows.filter(r => JSON.stringify(r).includes("CAP-1") || JSON.stringify(r).includes("transceiver")).length} approx`);
console.log(` Total size: ${(JSON.stringify(rows).length / 1024).toFixed(1)} KB`);