transceiver-db/sync/history/2026-05-06-magatama-zero-open-findings-and-resolver-truth.md
2026-05-06 08:38:07 +02:00

MAGATAMA Handoff — 2026-05-06

Scope

This handoff captures the MAGATAMA remediation block that started from the question "does MAGATAMA still really protect us?" and ended with a clean live state:

  • all open MAGATAMA findings resolved to 0
  • host-audit false positives corrected
  • resolver truth fixed so disabled Codex/Copilot are shown as disabled, not broken

Binding Outcome

Verified live on Erik and on the public MAGATAMA endpoint:

  • open findings: 0
  • /api/health -> status: ok
  • /api/active-resolvers ->
    • MAGATAMA Core: working
    • MagatamaLLM: working
    • Claude (secondary): working
    • Codex (secondary/manual): idle
    • Copilot (secondary/manual): idle
  • /api/protection-proof summary:
    • knownAssets: 79
    • hostsWithTelemetry: 27
    • assetsWithoutTelemetry: 52
    • queueExecuting: 0
    • queueBlocked: 0
    • queueFailed: 0
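
The asset counters in this summary are internally consistent: 27 hosts with telemetry plus 52 assets without telemetry equals the 79 known assets. A minimal sketch of that sanity check follows, assuming the summary arrives as a flat JSON object with the field names listed above. The real response shape of /api/protection-proof is not shown in this handoff, and `check_protection_summary` is a hypothetical helper name.

```python
def check_protection_summary(summary: dict) -> list[str]:
    """Return human-readable problems; an empty list means the summary is clean."""
    problems = []

    # Every known asset should be either telemetry-backed or inventory-only.
    covered = summary["hostsWithTelemetry"]
    uncovered = summary["assetsWithoutTelemetry"]
    if covered + uncovered != summary["knownAssets"]:
        problems.append(
            f"asset counts do not add up: {covered} + {uncovered} "
            f"!= {summary['knownAssets']}"
        )

    # In the clean state described here, all queue counters are zero.
    for key in ("queueExecuting", "queueBlocked", "queueFailed"):
        if summary[key] != 0:
            problems.append(f"{key} is {summary[key]}, expected 0")

    return problems


# Values copied from the summary above.
summary = {
    "knownAssets": 79,
    "hostsWithTelemetry": 27,
    "assetsWithoutTelemetry": 52,
    "queueExecuting": 0,
    "queueBlocked": 0,
    "queueFailed": 0,
}
print(check_protection_summary(summary))  # prints []
```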

What Was Fixed

1. Threat/queue/code false positives

Earlier in the same remediation chain, MAGATAMA was repaired so that:

  • threat-news-seed items are triaged instead of piling up as operational noise
  • generated artifacts/reports are excluded from code scanning
  • queue counters only count unresolved findings instead of stale historical queue rows
  • training metrics prefer verified, deduped, Gitea-backed examples

These fixes had already reduced the system from hundreds of open findings to a single remaining guard host-audit finding.

2. Guard host-audit truth

The last remaining open finding was:

  • guard | high | atlas-host-audit | Baseline: NUC Proxmox protection gaps | 192.168.178.10

The root cause was an overly naive Proxmox baseline rule. It treated all of the following as hostile by default:

  • SSH on 22
  • rpcbind on 111
  • Proxmox UI on 8006
  • SPICE proxy on 3128
  • pve-firewall reporting a status of disabled/running

For this internal management host, that logic was wrong. The audit was updated so that:

  • internal management exposure on 22, 8006, and 3128 is acceptable
  • rpcbind is acceptable when only default portmapper / status services are present
  • pve-firewall is accepted when the service is actually running
  • only additional RPC services or genuinely missing firewall runtime produce a warning

After redeploying the audit script and rerunning the audit, the Proxmox finding cleared and MAGATAMA reached 0 open findings.
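
The corrected rule can be sketched roughly as follows. This is an illustration of the logic described above, not the contents of scripts/security_atlas_host_audit.py. The `HostFacts` type, the function names, the RPC service names in `DEFAULT_RPC_SERVICES`, and the way the firewall status string is parsed are all assumptions. Ports other than those named in this handoff are outside the scope of the sketch.

```python
from dataclasses import dataclass, field

RPCBIND_PORT = 111
# Assumed names for the "default portmapper / status" RPC services.
DEFAULT_RPC_SERVICES = {"portmapper", "status"}


@dataclass
class HostFacts:
    """Hypothetical container for what the audit observed on one host."""
    open_ports: set[int]
    rpc_services: set[str] = field(default_factory=set)
    firewall_status: str = ""  # e.g. "disabled/running"


def firewall_runtime_ok(status: str) -> bool:
    # Assumes a "<config state>/<runtime state>" string; only the runtime part matters.
    runtime = status.strip().lower().split("/")[-1]
    return runtime == "running"


def proxmox_baseline_warnings(facts: HostFacts) -> list[str]:
    """Return warnings for an internal Proxmox management host."""
    warnings = []

    # 22, 8006 and 3128 are accepted management exposure: no warning.
    # rpcbind is accepted unless it exposes more than the default services.
    if RPCBIND_PORT in facts.open_ports:
        extra = facts.rpc_services - DEFAULT_RPC_SERVICES
        if extra:
            warnings.append(f"additional RPC services exposed: {sorted(extra)}")

    # Only a genuinely missing firewall runtime is a problem.
    if not firewall_runtime_ok(facts.firewall_status):
        warnings.append("pve-firewall runtime is not running")

    return warnings


nuc = HostFacts(
    open_ports={22, 111, 3128, 8006},
    rpc_services={"portmapper", "status"},
    firewall_status="disabled/running",
)
print(proxmox_baseline_warnings(nuc))  # prints []
```

Keeping the allowances limited to internal management hosts matters here: the same ports on an externally reachable host would still deserve a finding.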

3. Resolver truth

The public resolver view still looked unhealthy because Codex was shown as unavailable. Live inspection found:

  • codex_enabled = false
  • codex_bridge_url = ''
  • the local Codex bridge responds, but with HTTP 503 and status auth_required

This is a deliberately disabled state, not a runtime outage. The dashboard logic was changed so that:

  • if codex_enabled=false, Codex and Copilot show as:
    • status: idle
    • detail: In MAGATAMA settings disabled

This is now live on the public endpoint and is the correct interpretation unless the team intentionally re-enables Codex through settings and bridge/gateway auth.
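
A minimal sketch of that decision order, written in Python for illustration only (the real change lives in the TypeScript dashboard code). The function name, its parameters, and the detail strings for the enabled cases are hypothetical. The `idle`, `working`, and `unavailable` statuses and the disabled detail text are taken from this handoff.

```python
from typing import Optional


def secondary_resolver_status(codex_enabled: bool,
                              bridge_http_status: Optional[int]) -> dict:
    """Map settings plus bridge health to the status shown for Codex/Copilot."""
    # The settings flag is checked first: a disabled resolver is idle,
    # whatever the bridge happens to answer.
    if not codex_enabled:
        return {"status": "idle", "detail": "In MAGATAMA settings disabled"}

    # Only an enabled resolver is judged by bridge health.
    if bridge_http_status == 200:
        return {"status": "working", "detail": "bridge healthy"}

    return {
        "status": "unavailable",
        "detail": f"bridge returned {bridge_http_status}",
    }


# The live state from this handoff: disabled in settings, bridge answering 503.
print(secondary_resolver_status(codex_enabled=False, bridge_http_status=503))
```

Checking the flag before the bridge response is the whole point: the 503 auth_required answer only becomes a real problem once the resolver is enabled again.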

Important Files Touched

MAGATAMA repo:

  • packages/core/src/routes/health-builders.ts
  • packages/core/src/routes/health-atlas.ts
  • packages/core/src/learning/auto-fix-scheduler.ts
  • packages/dashboard/src/server.ts
  • packages/code/src/types.ts
  • packages/core/src/routes/health-support.ts
  • scripts/security_atlas_host_audit.py

This sync repo:

  • sync/CURRENT.md
  • sync/history/2026-05-06-magatama-zero-open-findings-and-resolver-truth.md

Evidence Summary

Live evidence gathered during the remediation:

  • local codex bridge health:
    • HTTP 503
    • {"status":"auth_required","configured":true,"provider":"codex-cli"}
  • public active resolvers after dashboard patch:
    • Codex/Copilot show idle, not unavailable
  • live DB query on Erik:
    • SELECT count(*) AS open_findings FROM findings WHERE resolved_at IS NULL;
    • result: 0
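
The open-findings query above counts only rows whose `resolved_at` is NULL, so resolved historical rows no longer inflate the counter. A small sketch against an in-memory SQLite database shows the behaviour. The three-column schema and the sample rows are hypothetical, and the engine and schema of the live database on Erik are not stated in this handoff.

```python
import sqlite3

OPEN_FINDINGS_SQL = (
    "SELECT count(*) AS open_findings FROM findings WHERE resolved_at IS NULL;"
)


def count_open_findings(conn: sqlite3.Connection) -> int:
    """Run the query from the evidence summary and return the single count."""
    return conn.execute(OPEN_FINDINGS_SQL).fetchone()[0]


conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema: only resolved_at matters for the query.
conn.execute(
    "CREATE TABLE findings (id INTEGER PRIMARY KEY, title TEXT, resolved_at TEXT)"
)
conn.executemany(
    "INSERT INTO findings (title, resolved_at) VALUES (?, ?)",
    [
        ("Baseline: NUC Proxmox protection gaps", "2026-05-06"),
        ("sample resolved finding", "2026-05-05"),
    ],
)
print(count_open_findings(conn))  # prints 0: every row is resolved

# A new unresolved finding is counted immediately.
conn.execute("INSERT INTO findings (title, resolved_at) VALUES (?, NULL)", ("new",))
print(count_open_findings(conn))  # prints 1
```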

Remaining Real Next Step

MAGATAMA is now clean from a findings and queue perspective, but telemetry coverage is still incomplete:

  • 52 assets remain discovery/inventory-only without live telemetry
  • they are no longer open findings, but they are the next real operational expansion area

If work resumes on MAGATAMA protection depth, the next correct lane is:

  1. expand live telemetry coverage beyond the current 27 hosts
  2. keep the verified-only training corpus clean
  3. only re-enable Codex/Copilot after gateway or bridge auth is intentionally restored