transceiver-db/sync/history/2026-05-09-magatama-all-lane-runpod-training-start.md
Rene Fichtmueller a20094755d feat(scraper): Flexoptix REST API sync robot + scheduler integration
Replaces the GraphQL/search-based Flexoptix scraper with a proper
Magento 2 REST API integration that delivers authoritative SKUs,
prices, stock levels and compatibility data.

New files:
- packages/scraper/src/robots/flexoptix-api-sync.ts
  Self-contained robot: auth → paginated fetch → normalize → DB write.
  Reads FLEXOPTIX_API_BASE_URL / _USERNAME / _PASSWORD from env.
  Returns { fetched, normalized, skipped, priceWrites, stockWrites }.
  No file intermediary — in-memory pipeline.

- scripts/import-flexoptix-catalog.ts
  One-shot CLI importer for the Pulso-generated JSONL (Codex handover).

- docs/FLEXOPTIX_CATALOG_IMPORT.md
  Runbook for manual import + per-SKU specifications enrichment.

Scheduler changes:
- Added sync:flexoptix-catalog queue + work() handler
- Scheduled every 2h at 0 */2 * * * (same cadence as legacy job)
- scrape:pricing:flexoptix kept as legacy GraphQL fallback

Also includes Codex-generated additions from this sprint:
- audiocodes-oem scraper, seed-batch35/36/37, db.ts improvements,
  sql/102 verification reconcile, README + package.json updates
2026-05-13 16:36:33 +02:00

MAGATAMA All-Lane RunPod Training Start

Date: 2026-05-09 23:09 UTC

Scope

  • Train all current MAGATAMA LLM lanes via RunPod:
    • magatamallm
    • fo_blogllm
    • tip_llm
    • pulso_llm
    • contact_llm

Preflight

  • MAGATAMA services were online on Erik.
  • Active RunPod endpoint reported by MAGATAMA: 0rmkf28w2g5gip.
  • RunPod worker kind: custom-magatama.
  • Dataset source: URL-based lane export.
  • Previous successful/adopted runs existed for:
    • magatamallm
    • fo_blogllm
    • tip_llm
  • No previous run existed yet for:
    • pulso_llm
    • contact_llm

Runner Fix

  • Fixed scripts/trigger_lane_training_once.py locally and on Erik.
  • MAGATAMA Gitea commit: 76d4054.
  • The script previously sent stale request field names:
    • iterations
    • seedOnly
  • The MAGATAMA training API expects:
    • iters
    • seed_only
  • Added an "all" mode that runs every lane sequentially.
  • Added streamed SSE logging so progress is visible during long RunPod runs.
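
The three changes above can be sketched roughly as follows. This is an illustration, not the contents of scripts/trigger_lane_training_once.py: only the field names (iterations/seedOnly versus iters/seed_only) and the lane names come from this note. The "lane" payload field, the trigger callable, and all function names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator

# Stale field name -> name the training API expects (both pairs are from this note).
STALE_TO_CURRENT = {"iterations": "iters", "seedOnly": "seed_only"}

# Lane names as listed under Scope.
LANES = ["magatamallm", "fo_blogllm", "tip_llm", "pulso_llm", "contact_llm"]


def normalize_payload(payload: dict) -> dict:
    """Return a copy of the payload with stale field names renamed."""
    return {STALE_TO_CURRENT.get(key, key): value for key, value in payload.items()}


def run_all(trigger: Callable[[dict], object], iters: int, seed_only: bool) -> Dict[str, object]:
    """Call trigger once per lane, strictly one after another.

    trigger is a hypothetical callable that submits one training request;
    the "lane" field name is an assumption, not a documented API field.
    """
    results = {}
    for lane in LANES:
        payload = normalize_payload({"lane": lane, "iters": iters, "seed_only": seed_only})
        results[lane] = trigger(payload)
    return results


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each server-sent event from decoded text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line terminates the current event.
            yield "\n".join(buffer)
            buffer = []
        # Comment lines (starting with ":") and other fields are ignored here.
    if buffer:
        # For logging purposes, flush an event that was cut off without a blank line.
        yield "\n".join(buffer)
```

Printing each yielded item with flush=True as it arrives (or running Python with -u, as in the live run below) is what keeps progress visible while a long job is still running.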

Live Run

  • Started on Erik:
    • python3 -u scripts/trigger_lane_training_once.py all 500 false
  • Log:
    • /opt/magatama/logs/runpod-all-lanes-20260509T230549Z.log
  • First active lane:
    • magatamallm
  • First RunPod job:
    • 89627e7e-8533-45db-9fe8-eca994018aa6-e2
  • Initial magatamallm dataset:
    • 1375 train
    • 153 eval
    • 1528 total

Success Rule

  • Do not treat RunPod COMPLETED as success by itself.
  • A lane is only successful when:
    • the model artifact exists,
    • MAGATAMA imports/adopts it locally,
    • smoke checks pass,
    • the active alias/version is updated.
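
A minimal sketch of how this rule could be encoded as a check. The class, field, and function names are hypothetical, and how each flag is actually determined (artifact lookup, import/adoption, smoke checks, alias update) is not covered here. Requiring RunPod COMPLETED as a necessary condition is one reading of the rule, not something the note states outright.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LaneRunStatus:
    """Hypothetical snapshot of one lane's run; field names are illustrative."""
    runpod_status: str
    artifact_exists: bool
    adopted_locally: bool
    smoke_checks_passed: bool
    alias_updated: bool


def unmet_conditions(status: LaneRunStatus) -> List[str]:
    """List which of the four success conditions are still unmet."""
    checks = {
        "artifact_exists": status.artifact_exists,
        "adopted_locally": status.adopted_locally,
        "smoke_checks_passed": status.smoke_checks_passed,
        "alias_updated": status.alias_updated,
    }
    return [name for name, ok in checks.items() if not ok]


def lane_succeeded(status: LaneRunStatus) -> bool:
    """COMPLETED is treated as necessary but never sufficient on its own."""
    return status.runpod_status == "COMPLETED" and not unmet_conditions(status)
```

Reporting the unmet conditions, rather than a bare true/false, makes it easier to see at which step a lane stalled.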