MAGATAMA Atlas Rematerialization and Stale Resolver Fix
Date: 2026-05-09
Problem
MAGATAMA had fallen back into an untrustworthy state:
- Atlas raw sources on Erik still existed and were current:
  - security-atlas-audits.json with 3 audits
  - security-atlas-snapshot.json with 32 devices
- but open findings in Postgres had collapsed back to 0
- Atlas UI therefore looked implausibly empty / clean
The operator requirement was explicit:
- this must not silently happen again
- MAGATAMA must reflect real protection gaps honestly
Root Cause
Two independent backend problems combined:
- buildProtectionProofResponse() read Atlas raw files but did not resync findings from them.
- Generic stale finding auto-resolution in the scheduler treated Atlas-managed findings like ordinary guard findings and resolved them too aggressively.
Code Changes
packages/core/src/routes/health-builders.ts
- added readAtlasSnapshot()
- imported syncAtlasAuditFindings(...)
- imported syncAtlasExposureFindings(...)
- introduced syncAtlasOperationalFindings(...)
- buildProtectionProofResponse() now calls that helper before building the proof payload
Effect:
- normal proof/Atlas reads now rematerialize current Atlas findings from the raw audit/snapshot files
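The ordering change described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in health-builders.ts: the real signatures of readAtlasSnapshot() and syncAtlasOperationalFindings(...) are not shown in this note, so the sketch uses hypothetical injected functions (readRaw, syncFindings, buildPayload) and a hypothetical AtlasRawData shape to show only the "resync first, then build" order.

```typescript
// Hypothetical shape for the raw Atlas files; the real schema is not shown in this note.
interface AtlasRawData {
  audits: unknown[];
  devices: unknown[];
}

// Hypothetical dependencies standing in for the real read/sync/build helpers.
interface ProofDeps {
  readRaw: () => Promise<AtlasRawData>;
  // Assumed to return the number of open findings after the sync.
  syncFindings: (raw: AtlasRawData) => Promise<number>;
  buildPayload: (raw: AtlasRawData, openFindings: number) => Record<string, number>;
}

async function buildProofWithResync(
  deps: ProofDeps,
): Promise<Record<string, number>> {
  const raw = await deps.readRaw();
  // Resync before building, so the payload reflects the raw files
  // rather than whatever findings state happened to be left in the DB.
  const openFindings = await deps.syncFindings(raw);
  return deps.buildPayload(raw, openFindings);
}
```

Passing the helpers in as parameters keeps the sketch self-contained; the actual change wires the sync call directly into buildProtectionProofResponse().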
packages/core/src/scheduler.ts
- added:
  - ATLAS_MANAGED_FINDING_SOURCES
  - isAtlasManagedFindingSource(...)
- generic stale resolution now skips:
  - atlas-coverage-gap
  - atlas-exposure
  - atlas-host-audit
Effect:
- Atlas-managed findings are no longer erased by the generic guard stale resolver
- they stay under their own verification-aware lifecycle
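As a rough sketch of the skip logic, assuming the constant is a set of the three source names listed above: ATLAS_MANAGED_FINDING_SOURCES and isAtlasManagedFindingSource are named in this note, but their exact definitions are not, and the Finding shape, selectStaleForAutoResolve, and the maxAgeMs staleness rule are hypothetical stand-ins for the scheduler's real logic.

```typescript
// Source names are taken from this note; representing them as a Set is an assumption.
const ATLAS_MANAGED_FINDING_SOURCES: ReadonlySet<string> = new Set([
  "atlas-coverage-gap",
  "atlas-exposure",
  "atlas-host-audit",
]);

function isAtlasManagedFindingSource(source: string): boolean {
  return ATLAS_MANAGED_FINDING_SOURCES.has(source);
}

// Hypothetical finding shape; the real schema is not shown in this note.
interface Finding {
  id: string;
  source: string;
  lastSeenAt: number; // epoch milliseconds
  status: "open" | "resolved";
}

// Hypothetical helper: returns the ids the generic stale resolver may close.
// Atlas-managed findings are filtered out before the age check is applied.
function selectStaleForAutoResolve(
  findings: Finding[],
  now: number,
  maxAgeMs: number,
): string[] {
  return findings
    .filter((f) => f.status === "open")
    .filter((f) => !isAtlasManagedFindingSource(f.source))
    .filter((f) => now - f.lastSeenAt > maxAgeMs)
    .map((f) => f.id);
}
```

With this shape, an old atlas-coverage-gap finding is never selected, no matter how long ago it was last seen, while an equally old ordinary guard finding still is.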
Live Deployment
Deployed to Erik:
- rebuilt @magatama/core
- synced:
  - /opt/magatama/packages/core/dist/routes/health-builders.js
  - /opt/magatama/packages/core/dist/scheduler.js
- restarted PM2 app: magatama
Live Verification
Before
- raw files existed:
  - audits: 3
  - devices: 32
- DB open findings: 0
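The "Before" numbers show the pattern of this regression: raw Atlas data is present while the DB reports no open findings. A check for that pattern might look like the sketch below. It is a heuristic only, and not part of the deployed fix: zero open findings can be legitimate when the raw data contains no problems. The AtlasStateSummary shape and the function name are hypothetical.

```typescript
// Hypothetical summary of the state observed during verification.
interface AtlasStateSummary {
  rawAudits: number; // entries in security-atlas-audits.json
  rawDevices: number; // entries in security-atlas-snapshot.json
  openFindings: number; // open findings currently in the DB
}

// Flags the combination seen before the fix: raw data exists, yet nothing is open.
// Treat a true result as a prompt for a closer look, not as proof of a regression.
function looksLikeFakeCleanState(s: AtlasStateSummary): boolean {
  const hasRawData = s.rawAudits > 0 || s.rawDevices > 0;
  return hasRawData && s.openFindings === 0;
}

const before: AtlasStateSummary = { rawAudits: 3, rawDevices: 32, openFindings: 0 };
console.log(looksLikeFakeCleanState(before)); // true
```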
After protected proof rebuild
- authenticated local /api/protection-proof trigger on Erik
- DB open findings rematerialized to: 28
Public verification
Public MAGATAMA APIs once again expose the real open state:
- /api/findings?limit=5
  - returns open atlas-coverage-gap findings again
- /api/protection-proof
  - knownAssets: 57
  - hostsWithTelemetry: 22
  - assetsWithoutTelemetry: 35
  - auditedHosts: 3
  - queueBlocked: 28
  - switchbladeAssets: 5
  - switchbladeRacks: 1
  - switchbladeNmsNodes: 5
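The reported telemetry counts are internally consistent: 22 hosts with telemetry plus 35 assets without telemetry gives the 57 known assets. A small sanity check along these lines could be run against the payload. It assumes every known asset is counted exactly once as either with or without telemetry, which this note does not state explicitly. The field names match the payload above; the interface and function names are hypothetical.

```typescript
// Subset of the /api/protection-proof fields quoted above.
interface ProofCounts {
  knownAssets: number;
  hostsWithTelemetry: number;
  assetsWithoutTelemetry: number;
}

// Assumption: each known asset falls into exactly one of the two telemetry buckets.
function telemetryCountsAddUp(p: ProofCounts): boolean {
  return p.hostsWithTelemetry + p.assetsWithoutTelemetry === p.knownAssets;
}

const reported: ProofCounts = {
  knownAssets: 57,
  hostsWithTelemetry: 22,
  assetsWithoutTelemetry: 35,
};
console.log(telemetryCountsAddUp(reported)); // true, since 22 + 35 = 57
```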
Operational Truth
The major Atlas truthfulness regression is fixed:
- Atlas and Findings no longer silently collapse to a fake clean state when raw Atlas data still contains real problems
What remains true:
- most currently open Atlas findings are coverage gaps
- they represent real missing live telemetry on known assets
Remaining Work
Still not fully closed:
- lane-specific RunPod artifact adoption and automatic version switching
- further Atlas policy refinement so inventory-only assets can be split more cleanly into:
- actionable operational gaps
- informational inventory/discovery context
Operator Note
If the browser still shows the older empty Atlas state after deployment:
- hard refresh: Cmd + Shift + R