feat: add stock observations to ATGBICS + Optcore; delete demo data

- DELETE 2133 rows from reorder_signals WHERE is_demo_data = true
- atgbics.ts: add upsertStockObservation (confidence=1, binary available
  boolean from Shopify API; quantityAvailable 1/0 for in/out stock)
- optcore.ts: add upsertStockObservation (confidence=1, WooCommerce text
  stock level parsed via parseStockLevel; quantityAvailable 1/0)
- Both scrapers already run every 2h on Erik scheduler
- FS.com: already captures full warehouse breakdown (DE+Global+backorder)
  3x/day from Mac (02:00/10:00/18:00) at confidence=3 — no change needed
- QSFPTEK: already captures real quantities at confidence=2 — no change
- sfpcables/prolabs/wiitek: no meaningful stock signal, not modified
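
The confidence values named above can be read as fidelity tiers. A minimal sketch, assuming a higher number means a higher-fidelity stock signal (as the changelog entry suggests); `STOCK_TIERS`, `VendorStockTier` and `vendorsAtOrAbove` are hypothetical names for illustration and do not come from the repository:

```typescript
// Hypothetical summary of the stock-signal tiers listed in the commit message.
// Assumption: a higher confidence value means a higher-fidelity signal.
type StockConfidence = 1 | 2 | 3;

interface VendorStockTier {
  confidence: StockConfidence;
  signal: string;
}

const STOCK_TIERS: Record<string, VendorStockTier> = {
  "fs.com": { confidence: 3, signal: "per-warehouse breakdown" },
  qsfptek: { confidence: 2, signal: "real quantities" },
  atgbics: { confidence: 1, signal: "binary in/out of stock" },
  optcore: { confidence: 1, signal: "binary in/out of stock" },
};

// Return the vendors whose stock signal meets a minimum confidence, sorted by name.
function vendorsAtOrAbove(min: StockConfidence): string[] {
  return Object.entries(STOCK_TIERS)
    .filter(([, tier]) => tier.confidence >= min)
    .map(([vendor]) => vendor)
    .sort();
}
```

A consumer that needs real quantities could call `vendorsAtOrAbove(2)` and skip the binary-only sources.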
Rene Fichtmueller 2026-05-14 00:08:57 +02:00
parent db6b97186a
commit 637839e965
3 changed files with 35 additions and 3 deletions

@@ -1,7 +1,13 @@
# TIP Changelog
Format: `{"d":"YYYY-MM-DD","t":"TYPE","m":"Description"}`
{"d":"2026-05-13","t":"FIX","m":"BlogLLM model version sync: dashboard FO_BlogLLM card now dynamically reflects the active Ollama model via /api/blog/llm/status (was hardcoded to fo-blog-v7). TIP ecosystem.config.js OLLAMA_LLM_MODEL + BLOG_LLM_MODEL bumped fo-blog-v7 → fo-blog-v10 (Mac Studio Magatama training adopted 2026-05-13 00:33 UTC). tip-api restarted, PM2 state saved."}
{"d":"2026-05-14","t":"DATA","m":"Demo data cleanup: deleted 2133 demo rows from reorder_signals (is_demo_data=true). Stock observation coverage expanded: atgbics.ts + optcore.ts now call upsertStockObservation after each price observation (binary in/out stock, confidence=1). FS.com scraper already runs 3x daily from Mac (02:00/10:00/18:00) with full DE-Lager/Global-Lager/Nachlieferung breakdown. Competitor stock audit: QSFPTEK (confidence=2, real quantities), FS.COM (confidence=3, per-warehouse breakdown) are highest fidelity; ATGBICS/Optcore added at confidence=1 (binary); sfpcables/prolabs/wiitek hardcode or lack stock — not added."}
{"d":"2026-05-13","t":"FIX","m":"BlogLLM model version sync: dashboard FO_BlogLLM card now dynamically reflects the active Ollama model via /api/blog/llm/status (was hardcoded to fo-blog-v7). TIP ecosystem.config.js OLLAMA_LLM_MODEL + BLOG_LLM_MODEL bumped fo-blog-v7 → fo-blog-v10 (Mac Studio Magatama training adopted 2026-05-13 00:33 UTC). Persisted /opt/tip/blog-llm-settings.json overrode env — also updated. tip-api restarted, PM2 state saved."}
{"d":"2026-05-13","t":"FEAT","m":"BlogLLM auto-discovery: client.ts now probes Ollama at startup + every 10 min, reconciles configured fo-blog-vN against actual available tags, auto-falls to highest available version when configured model no longer exists. Magatama-aware sort: base 'fo-blog-vN' tag wins over '-rM' revisions within same N (matches Magatama adoption convention where -rM is intermediate adapter save, base is production alias). New POST /api/blog/llm/refresh-discovery endpoint for manual trigger. Eliminates 3-step manual sync after every Magatama training."}
{"d":"2026-05-13","t":"FIX","m":"Ollama Modelfile bug for fo-blog-v10: Mac Studio adoption registered model with template '{{.Prompt}}' instead of Qwen2.5 chat template — model returned empty responses to /api/chat. Recreated fo-blog-v10 via Ollama /api/create with correct ChatML template ({{- if .System}}<|im_start|>system ... <|im_end|>...), num_ctx=8192, stop=<|im_end|>, temperature=0.3. Smoke test: 45 tokens generated cleanly. Magatama-side adoption logic should be patched to emit correct template by default."}
{"d":"2026-05-13","t":"DATA","m":"Competitive naming sanitization + Anti-Naming-Policy training: (1) Sanitization sweep across all 244 JSONL training files: 97 Fs.com/FiberStore replacements with neutral 'unnamed third-party MSA-compatible vendor' across 15 active files (fo_blogllm, tip_llm pools + RunPod exports + historical RunPod pod-runs). All affected files backed up to .bak-fs-final/.bak-YYYYMMDD-HHMMSS. Post-sanitization verification: 0 assistant-content mentions of competitor brands across all 5 lanes (fo_blogllm, pulso_llm, tip_llm, magatamallm, contact_llm). Remaining FS mentions live only in system-prompt prohibition lists (anti-naming policy) and one magatamallm user-message context for legitimate internal SKU-matching research. (2) Anti-Naming-Policy training pairs added: 4 deep pairs for fo_blogllm (third-party market analysis, procurement strategy, coherent component stack), 3 pairs for pulso_llm (competitor-inquiry deflection, price-compare without naming, internal sales guidance), 1 pair for tip_llm (public research blog output with neutral language). All new system prompts contain explicit COMPETITIVE NAMING POLICY clause forbidding named mentions of Fs.com/FiberStore/Approved Networks/Cablexa/ProLabs/FluxLight + component suppliers Accelink/InnoLight/Lumentum/Coherent/II-VI/Eoptolink/Source Photonics. Switch and router OEMs (Cisco/Arista/Juniper/Nokia/Ciena/HPE/Dell/Mellanox/Extreme/Huawei) explicitly permitted as integration partners. Post-rebuild manifests: fo_blogllm 18757 effective, pulso_llm 3242 effective, tip_llm 2181 effective."}
{"d":"2026-05-13","t":"DATA","m":"FO_BlogLLM training corpus deep-quality expansion: 8 new training files in pulso_llm pool with 22 long-form (700-1000 word) blog pairs targeting fo-blog-v10 failure modes. Categories: (1) Connector Authority — MPO Type A/B/C polarity, IEC 61300-3-35 endface inspection, MPO-12 vs MPO-16, LC vs MPO architecture mapping; (2) Transceiver Taxonomy — full 100G/400G/800G variant matrix with reach, connector, lane structure, IEEE clauses; (3) Coherent Depth — coherent vs direct-detect crossover, OSNR engineering for ZR+, FEC types (cFEC/oFEC) and pre-FEC BER reality; (4) Power & Reach Ground Truth — accurate per-module power numbers 2026, OTDR commissioning workflow; (5) Operations Troubleshooting — pre-FEC BER climbs diagnostic walkthrough, module detection / coding mismatch fixes; (6) Topic Adherence — exact MPO Connector Survival Guide blog (the test prompt that failed in v10), Fiber Inspection Probes, Cable Routing for spine-leaf; (7) Standards Map — IEEE 802.3ba/cd/cu/df clause map, CMIS register layout; (8) Myth Corrections — DR ≠ Long Reach, LR vs ER vs ZR taxonomy, MPO-parallel vs LC-WDM architecture. All pairs include IEEE/OIF/MSA citations, real datasheet-equivalent numbers (TX/RX power, sensitivity, power consumption per module class). Pool now: 17936 train + 2018 eval = 19954 total after dedupe (123 duplicates removed). Next fo_blogllm training run picks up automatically."}
{"d":"2026-05-13","t":"FIX","m":"Magatama Mac Studio adoption template root cause patched: /opt/magatama/packages/fine-tuner/train.py register_ollama() built Modelfiles without TEMPLATE directive (only FROM/SYSTEM/PARAMETER) — Ollama defaulted to '{{.Prompt}}' which silently breaks /api/chat. Both modelfile_lines blocks (GGUF and fallback) now include the Qwen2.5 ChatML TEMPLATE plus full PARAMETER set (temperature 0.3 (was 0.1), top_p 0.9, num_ctx 8192, stop <|im_end|>). End-to-end test against Ollama API confirmed: model registers + /api/chat returns expected tokens. Future fo-blog-vN trainings (and any Magatama lane via Mac Studio path) will no longer produce silent-failure models. Backup at /opt/magatama/packages/fine-tuner/train.py.bak-20260513-153306. Local checkout synced. The other adoption path (/opt/llm-gateway/.../converter.py used by RunPod artifact import) already had TEMPLATE correct — no change there."}
{"d":"2026-04-26","t":"DATA","m":"Juniper OEM transceiver seed: 59 PIDs inserted (SFP-1GE/SFPP-10G/SFP-25G/QSFPP-40G/JNP-QSFP-100G/JNP-QSFP56-200G/JNP-QSFPDD-400G/JNP-OSFP-400G+800G + DAC/AOC). Scheduler: daily 04:15."}
{"d":"2026-04-26","t":"FIX","m":"BlueOptics scraper: force HTTP/1.1 via Node.js https.get() to bypass empty-body HTTP/2 server bug; updated catalog path to /Transceivers_1 (changed 2026)."}
{"d":"2026-04-26","t":"DATA","m":"Cisco TMG scraper: upsert logic fixed (market_status EOL + temp_range IND normalization). Full run in progress: 300+ switches, 15000+ compat matches written to switch_transceiver_compat."}

@@ -13,7 +13,7 @@
* Rewritten 2026-05-06: switched from HTML parsing to products.json API after
* Shopify's static HTML stopped rendering per-collection results correctly.
*/
-import { ensureVendor, upsertPriceObservation, findOrCreateScrapedTransceiver, markImageVerified, pool } from "../utils/db";
+import { ensureVendor, upsertPriceObservation, upsertStockObservation, findOrCreateScrapedTransceiver, markImageVerified, pool } from "../utils/db";
import { contentHash } from "../utils/hash";
const BASE_URL = "https://atgbics.com";
@@ -297,6 +297,19 @@ export async function scrapeAtgbics(): Promise<void> {
});
if (updated) priceUpdates++;
// Stock observation: Shopify provides a binary `available` boolean (confidence: 1)
await upsertStockObservation({
  transceiverId: txId,
  sourceVendorId: vendorId,
  stockLevel: product.stockLevel,
  quantityAvailable: product.stockLevel === "in_stock" || product.stockLevel === "low_stock" ? 1 : 0,
  priceNet: product.price,
  productUrl: product.url,
  stockConfidence: 1,
  priceCurrency: product.currency,
  priceIncludesTax: product.currency === "GBP", // Shopify GBP prices include VAT
});
if (product.imageUrl) {
const updatedImage = await markImageVerified(txId, product.imageUrl);
if (updatedImage) imageUpdates++;

@@ -10,7 +10,7 @@
*/
import { PlaywrightCrawler } from "crawlee";
import { makeCrawleeConfig } from "../utils/crawlee-config";
-import { ensureVendor, upsertPriceObservation, findOrCreateScrapedTransceiver, pool } from "../utils/db";
+import { ensureVendor, upsertPriceObservation, upsertStockObservation, findOrCreateScrapedTransceiver, pool } from "../utils/db";
import { contentHash, parsePrice, parseStockLevel } from "../utils/hash";
const BASE_URL = "https://www.optcore.net";
@@ -287,6 +287,19 @@ export async function scrapeOptcore(): Promise<void> {
if (isNew) written++;
else skipped++;
// Stock observation: WooCommerce text-based availability (confidence: 1)
await upsertStockObservation({
  transceiverId,
  sourceVendorId: vendorId,
  stockLevel: p.stockLevel,
  quantityAvailable: p.stockLevel === "in_stock" || p.stockLevel === "low_stock" ? 1 : 0,
  priceNet: p.price,
  productUrl: p.url,
  stockConfidence: 1,
  priceCurrency: p.currency,
  priceIncludesTax: false,
});
} catch (err) {
console.error(` Error: ${p.partNumber}:`, (err as Error).message);
}
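
Both scrapers repeat the same in/out-of-stock expression for `quantityAvailable`. A minimal sketch of how that mapping could be factored into one place; `binaryQuantity` is a hypothetical helper that is not part of this commit, and it assumes `stockLevel` is a plain string as the comparisons in the diff suggest:

```typescript
// Hypothetical helper, not part of this commit: collapse a text stock level
// into the binary 1/0 quantity that both scrapers record at confidence 1.
// Assumption: stockLevel is a plain string such as "in_stock" or "low_stock".
function binaryQuantity(stockLevel: string | null | undefined): 0 | 1 {
  // "in_stock" and "low_stock" both count as available; anything else,
  // including a missing value, is treated as out of stock.
  return stockLevel === "in_stock" || stockLevel === "low_stock" ? 1 : 0;
}
```

With a helper like this, each call site would pass `quantityAvailable: binaryQuantity(p.stockLevel)`, so the two scrapers cannot drift apart on which levels count as available.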