transceiver-db/sync/history/2026-05-09-magatama-atlas-rematerialization-and-stale-resolver-fix.md

# MAGATAMA Atlas Rematerialization and Stale Resolver Fix

Date: 2026-05-09

## Problem

MAGATAMA had fallen back into an untrustworthy state:

- Atlas raw sources on Erik still existed and were current:
  - `security-atlas-audits.json` with `3` audits
  - `security-atlas-snapshot.json` with `32` devices
- but open findings in Postgres had collapsed back to `0`
- Atlas UI therefore looked implausibly empty / clean

The operator requirement was explicit:

- this must not silently happen again
- MAGATAMA must reflect real protection gaps honestly

## Root Cause

Two independent backend problems combined:

1. `buildProtectionProofResponse()` read Atlas raw files but did not resync findings from them.
2. The scheduler's generic stale-finding auto-resolution treated Atlas-managed findings like ordinary guard findings and resolved them, even though the raw Atlas data still reported the underlying problems.

## Code Changes

### `packages/core/src/routes/health-builders.ts`

- added `readAtlasSnapshot()`
- imported `syncAtlasAuditFindings(...)`
- imported `syncAtlasExposureFindings(...)`
- introduced `syncAtlasOperationalFindings(...)`
- `buildProtectionProofResponse()` now calls that helper before building the proof payload

Effect:

- normal proof/Atlas reads now rematerialize current Atlas findings from the raw audit/snapshot files
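
The sync-before-build ordering described above can be sketched roughly as follows. This is a minimal illustration, not the code in `health-builders.ts`: the data shapes, the in-memory `FindingStore` (a stand-in for the Postgres findings table), and the `...Sketch` function signatures are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real types in @magatama/core are not shown in this note.
interface AtlasAudit {
  host: string;
  issues: string[];
}

interface AtlasSnapshot {
  devices: { id: string; hasTelemetry: boolean }[];
}

interface Finding {
  source: string;
  key: string;
  status: "open" | "resolved";
}

// In-memory stand-in for the findings table (assumption for the sketch).
class FindingStore {
  private readonly byKey = new Map<string, Finding>();

  // Upsert keyed on source + key, so repeated syncs do not create duplicates.
  upsertOpen(source: string, key: string): void {
    this.byKey.set(`${source}:${key}`, { source, key, status: "open" });
  }

  openCount(): number {
    return Array.from(this.byKey.values()).filter((f) => f.status === "open").length;
  }
}

// Rematerialize findings from the raw audit and snapshot data.
function syncAtlasOperationalFindingsSketch(
  store: FindingStore,
  audits: AtlasAudit[],
  snapshot: AtlasSnapshot,
): void {
  for (const audit of audits) {
    for (const issue of audit.issues) {
      store.upsertOpen("atlas-host-audit", `${audit.host}:${issue}`);
    }
  }
  for (const device of snapshot.devices) {
    if (!device.hasTelemetry) {
      store.upsertOpen("atlas-coverage-gap", device.id);
    }
  }
}

// The key point: sync first, then build the payload from the refreshed state.
function buildProtectionProofResponseSketch(
  store: FindingStore,
  audits: AtlasAudit[],
  snapshot: AtlasSnapshot,
): { auditedHosts: number; knownDevices: number; openFindings: number } {
  syncAtlasOperationalFindingsSketch(store, audits, snapshot);
  return {
    auditedHosts: audits.length,
    knownDevices: snapshot.devices.length,
    openFindings: store.openCount(),
  };
}
```

Because the sketch upserts by key, calling the builder repeatedly leaves the open count stable, which is the property a read-triggered resync needs.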

### `packages/core/src/scheduler.ts`

- added:
  - `ATLAS_MANAGED_FINDING_SOURCES`
  - `isAtlasManagedFindingSource(...)`
- generic stale resolution now skips:
  - `atlas-coverage-gap`
  - `atlas-exposure`
  - `atlas-host-audit`

Effect:

- Atlas-managed findings are no longer erased by the generic guard stale resolver
- they stay under their own verification-aware lifecycle
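
A rough sketch of the skip logic, for illustration only. The identifier names `ATLAS_MANAGED_FINDING_SOURCES` and `isAtlasManagedFindingSource` and the three source strings come from this note; the `Set` representation, the `OpenFinding` shape, the age-based staleness rule, and `selectStaleFindingsToResolve` are assumptions and may differ from what `scheduler.ts` actually does.

```typescript
// Sources whose lifecycle is owned by the Atlas sync, not the generic resolver.
const ATLAS_MANAGED_FINDING_SOURCES: ReadonlySet<string> = new Set([
  "atlas-coverage-gap",
  "atlas-exposure",
  "atlas-host-audit",
]);

function isAtlasManagedFindingSource(source: string): boolean {
  return ATLAS_MANAGED_FINDING_SOURCES.has(source);
}

// Hypothetical minimal shape of an open finding row.
interface OpenFinding {
  id: number;
  source: string;
  lastSeenAt: Date;
}

// Returns the ids the generic stale resolver may auto-resolve.
// Atlas-managed findings are filtered out before the age check (assumed rule).
function selectStaleFindingsToResolve(
  findings: OpenFinding[],
  now: Date,
  maxAgeMs: number,
): number[] {
  return findings
    .filter((f) => !isAtlasManagedFindingSource(f.source))
    .filter((f) => now.getTime() - f.lastSeenAt.getTime() > maxAgeMs)
    .map((f) => f.id);
}
```

Filtering by source before the age check keeps the exemption in one place, so adding a further Atlas-managed source only requires extending the set.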

## Live Deployment

Deployed to Erik:

- rebuilt `@magatama/core`
- synced:
  - `/opt/magatama/packages/core/dist/routes/health-builders.js`
  - `/opt/magatama/packages/core/dist/scheduler.js`
- restarted PM2 app:
  - `magatama`

## Live Verification

### Before

- raw files existed:
  - audits: `3`
  - devices: `32`
- DB open findings: `0`

### After protection-proof rebuild

- triggered an authenticated local request to `/api/protection-proof` on Erik
- DB open findings rematerialized to: `28`

### Public verification

Public MAGATAMA APIs once again expose the real open state:

- `/api/findings?limit=5`
  - returns open `atlas-coverage-gap` findings again
- `/api/protection-proof`
  - `knownAssets: 57`
  - `hostsWithTelemetry: 22`
  - `assetsWithoutTelemetry: 35`
  - `auditedHosts: 3`
  - `queueBlocked: 28`
  - `switchbladeAssets: 5`
  - `switchbladeRacks: 1`
  - `switchbladeNmsNodes: 5`
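
As a hedged illustration, the counters above could be checked for plausibility with something like the following. The field names are taken from the payload listed here; the `checkProofPlausibility` helper and both rules are assumptions: that the telemetry split should add up to `knownAssets` (it does for the values above, 22 + 35 = 57), and that assets without telemetry combined with zero open findings is suspicious rather than definitively wrong.

```typescript
// Subset of the proof payload used by the check (field names from the note).
interface ProofCounters {
  knownAssets: number;
  hostsWithTelemetry: number;
  assetsWithoutTelemetry: number;
}

// Returns human-readable warnings; an empty array means nothing looked off.
function checkProofPlausibility(proof: ProofCounters, openFindings: number): string[] {
  const warnings: string[] = [];

  // Assumed invariant: every known asset either has telemetry or does not.
  if (proof.hostsWithTelemetry + proof.assetsWithoutTelemetry !== proof.knownAssets) {
    warnings.push("telemetry split does not add up to knownAssets");
  }

  // The collapse described in this note: gaps exist, yet no finding is open.
  if (proof.assetsWithoutTelemetry > 0 && openFindings === 0) {
    warnings.push("assets lack telemetry but no findings are open");
  }

  return warnings;
}
```

With the verified values (`knownAssets: 57`, `hostsWithTelemetry: 22`, `assetsWithoutTelemetry: 35`) and `28` open findings, the check returns no warnings; with `0` open findings it would flag the empty state.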

## Operational Truth

The major Atlas truthfulness regression is fixed:

- Atlas and Findings no longer silently collapse to a fake clean state when raw Atlas data still contains real problems

What remains true:

- most currently open Atlas findings are coverage gaps
- they represent real missing live telemetry on known assets

## Remaining Work

Still not fully closed:

- lane-specific RunPod artifact adoption and automatic version switching
- further Atlas policy refinement so inventory-only assets can be split more cleanly into:
  - actionable operational gaps
  - informational inventory/discovery context

## Operator Note

If the browser still shows the older empty Atlas state after deployment:

- hard refresh:
  - `Cmd + Shift + R` (macOS)
  - `Ctrl + Shift + R` (Windows/Linux)