Rene Fichtmueller bba48d3e84 sync: record magatama atlas rematerialization fix

2026-05-09 08:02:54 +02:00

MAGATAMA Atlas Rematerialization and Stale Resolver Fix

Date: 2026-05-09

Problem

MAGATAMA had fallen back into an untrustworthy state:

Atlas raw sources on Erik still existed and were current:
- security-atlas-audits.json with 3 audits
- security-atlas-snapshot.json with 32 devices
Despite that, open findings in Postgres had collapsed back to 0.
The Atlas UI therefore looked implausibly empty and clean.

The operator requirement was explicit:

this must not silently happen again
MAGATAMA must reflect real protection gaps honestly

Root Cause

Two independent backend problems combined:

buildProtectionProofResponse() read Atlas raw files but did not resync findings from them.
Generic stale-finding auto-resolution in the scheduler treated Atlas-managed findings like ordinary guard findings and resolved them even while the raw Atlas data still reported the underlying gaps.

Code Changes

`packages/core/src/routes/health-builders.ts`

added readAtlasSnapshot()
imported syncAtlasAuditFindings(...)
imported syncAtlasExposureFindings(...)
introduced syncAtlasOperationalFindings(...)
buildProtectionProofResponse() now calls syncAtlasOperationalFindings(...) before building the proof payload

Effect:

normal proof/Atlas reads now rematerialize current Atlas findings from the raw audit/snapshot files
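
The read-path change can be pictured roughly as follows. This is a minimal sketch, not the actual code in health-builders.ts: the data shapes, the in-memory store, and both function bodies are assumptions made for illustration. Only the names syncAtlasOperationalFindings and buildProtectionProofResponse come from the change list above.

```typescript
// Hypothetical shapes: the real Atlas file formats and DB schema are not shown in this note.
interface AtlasRaw {
  audits: { host: string }[];
  devices: { id: string; hasTelemetry: boolean }[];
}

interface Finding {
  source: string;
  subject: string;
  status: "open" | "resolved";
}

// In-memory stand-in for the Postgres findings table, keyed by source and subject.
type FindingStore = Record<string, Finding>;

// Illustrative body only: upsert one open coverage-gap finding per device without telemetry.
// Keying by source and subject keeps repeated calls idempotent.
function syncAtlasOperationalFindings(raw: AtlasRaw, store: FindingStore): void {
  for (const device of raw.devices) {
    if (device.hasTelemetry) continue;
    const key = `atlas-coverage-gap:${device.id}`;
    store[key] = { source: "atlas-coverage-gap", subject: device.id, status: "open" };
  }
}

// Resync first, then build the payload from the store, so the response
// reflects what the raw data currently implies rather than stale DB state.
function buildProtectionProofResponse(raw: AtlasRaw, store: FindingStore) {
  syncAtlasOperationalFindings(raw, store);
  const openFindings = Object.keys(store).filter((k) => store[k].status === "open").length;
  return { auditedHosts: raw.audits.length, openFindings };
}
```

In this sketch a finding that was wrongly marked resolved is reopened on the next read, and calling the builder twice yields the same count.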

`packages/core/src/scheduler.ts`

added:
- ATLAS_MANAGED_FINDING_SOURCES
- isAtlasManagedFindingSource(...)
generic stale resolution now skips:
- atlas-coverage-gap
- atlas-exposure
- atlas-host-audit

Effect:

Atlas-managed findings are no longer erased by the generic guard stale resolver
they stay under their own verification-aware lifecycle
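
A sketch of how such a skip might look. The three source names and the identifiers ATLAS_MANAGED_FINDING_SOURCES and isAtlasManagedFindingSource are taken from the list above; the Finding shape, the resolveStaleFindings function, and the age-based staleness rule are hypothetical stand-ins for the real scheduler logic.

```typescript
// Sources whose findings are owned by the Atlas sync, not by the generic guard lifecycle.
const ATLAS_MANAGED_FINDING_SOURCES = [
  "atlas-coverage-gap",
  "atlas-exposure",
  "atlas-host-audit",
] as const;

function isAtlasManagedFindingSource(source: string): boolean {
  return (ATLAS_MANAGED_FINDING_SOURCES as readonly string[]).includes(source);
}

// Hypothetical finding shape; the real schema is not shown in this note.
interface Finding {
  id: string;
  source: string;
  status: "open" | "resolved";
  lastSeenAt: number; // epoch milliseconds
}

// Assumed staleness rule: an open finding not seen for longer than maxAgeMs is resolved,
// unless its source is Atlas-managed, in which case it is left untouched.
function resolveStaleFindings(findings: Finding[], now: number, maxAgeMs: number): Finding[] {
  return findings.map((f): Finding => {
    if (f.status !== "open") return f;
    if (isAtlasManagedFindingSource(f.source)) return f;
    return now - f.lastSeenAt > maxAgeMs ? { ...f, status: "resolved" } : f;
  });
}
```

Keeping the source list in one constant means a new Atlas finding type only has to be registered in one place to be excluded from generic resolution.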

Live Deployment

Deployed to Erik:

rebuilt @magatama/core
synced:
- /opt/magatama/packages/core/dist/routes/health-builders.js
- /opt/magatama/packages/core/dist/scheduler.js
restarted PM2 app:
- magatama

Live Verification

Before

raw files existed:
- audits: 3
- devices: 32
DB open findings: 0

After protected proof rebuild

triggered an authenticated local request to /api/protection-proof on Erik
DB open findings rematerialized to: 28
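
The "Before" state suggests a simple plausibility check that could flag a repeat of this regression. The sketch below is a heuristic under stated assumptions, not part of the deployed change: the type and function names are hypothetical, and zero open findings alongside non-empty raw data is treated only as suspicious, since a clean state is possible in principle.

```typescript
// Hypothetical sample of the three numbers compared in this section.
interface AtlasStateSample {
  rawAudits: number;
  rawDevices: number;
  dbOpenFindings: number;
}

type AtlasStateVerdict = "ok" | "suspiciously-clean";

// Flags the combination seen before the fix: raw Atlas data present, but no open findings in the DB.
function classifyAtlasState(sample: AtlasStateSample): AtlasStateVerdict {
  const hasRawData = sample.rawAudits > 0 || sample.rawDevices > 0;
  return hasRawData && sample.dbOpenFindings === 0 ? "suspiciously-clean" : "ok";
}
```

With the numbers from this note, the "Before" sample (3 audits, 32 devices, 0 open findings) is classified as suspiciously-clean, and the "After" sample (28 open findings) as ok.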

Public verification

Public MAGATAMA APIs once again expose the real open state:

/api/findings?limit=5
- returns open atlas-coverage-gap findings again
/api/protection-proof
- knownAssets: 57
- hostsWithTelemetry: 22
- assetsWithoutTelemetry: 35
- auditedHosts: 3
- queueBlocked: 28
- switchbladeAssets: 5
- switchbladeRacks: 1
- switchbladeNmsNodes: 5
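
The reported counters are internally consistent: 22 hosts with telemetry plus 35 assets without telemetry equals the 57 known assets. A small check along these lines could be run against the payload; it assumes that the two telemetry counters are meant to partition knownAssets, which the values above are consistent with but which this note does not state explicitly. The interface and function name are hypothetical.

```typescript
// Field names match the /api/protection-proof output listed above; the interface itself is an assumption.
interface ProtectionProofCounters {
  knownAssets: number;
  hostsWithTelemetry: number;
  assetsWithoutTelemetry: number;
}

// Returns human-readable inconsistencies; an empty list means the counters add up.
function checkProofCounters(p: ProtectionProofCounters): string[] {
  const problems: string[] = [];
  const values = [p.knownAssets, p.hostsWithTelemetry, p.assetsWithoutTelemetry];
  if (values.some((v) => !Number.isInteger(v) || v < 0)) {
    problems.push("counters must be non-negative integers");
  }
  const split = p.hostsWithTelemetry + p.assetsWithoutTelemetry;
  if (split !== p.knownAssets) {
    problems.push(`telemetry split ${split} does not match knownAssets ${p.knownAssets}`);
  }
  return problems;
}

const problems = checkProofCounters({
  knownAssets: 57,
  hostsWithTelemetry: 22,
  assetsWithoutTelemetry: 35,
});
console.log(problems.length === 0 ? "counters consistent" : problems.join("; "));
```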

Operational Truth

The major Atlas truthfulness regression is fixed:

Atlas and Findings no longer silently collapse to a fake clean state when raw Atlas data still contains real problems

What remains true:

most currently open Atlas findings are coverage gaps
they represent real missing live telemetry on known assets

Remaining Work

Still not fully closed:

lane-specific RunPod artifact adoption and automatic version switching
further Atlas policy refinement so inventory-only assets can be split more cleanly into:
- actionable operational gaps
- informational inventory/discovery context

Operator Note

If the browser still shows the older empty Atlas state after deployment:

hard refresh:
- Cmd + Shift + R
