From bba48d3e8411c757e177596f9269385d620f9bb8 Mon Sep 17 00:00:00 2001
From: Rene Fichtmueller <renefichtmueller@MacStudio-von-Rene-8.local>
Date: Sat, 9 May 2026 08:02:54 +0200
Subject: [PATCH] sync: record magatama atlas rematerialization fix

---
 sync/CURRENT.md                               |  69 +++++++++-
 ...ematerialization-and-stale-resolver-fix.md | 122 ++++++++++++++++++
 2 files changed, 190 insertions(+), 1 deletion(-)
 create mode 100644 sync/history/2026-05-09-magatama-atlas-rematerialization-and-stale-resolver-fix.md

diff --git a/sync/CURRENT.md b/sync/CURRENT.md
index 3833126..4a5aae0 100644
--- a/sync/CURRENT.md
+++ b/sync/CURRENT.md
@@ -1,9 +1,76 @@
 # Current TIP Sync State
 
-Updated: 2026-05-09 05:45 UTC
+Updated: 2026-05-09 05:58 UTC
 
 ## Newest Work
 
+- MAGATAMA Atlas rematerialization / anti-auto-resolve hardening completed live on 2026-05-09:
+  - operator problem:
+    - Atlas / Findings / Protection Proof had become dishonest again
+    - raw files on Erik still contained:
+      - `3` host audits
+      - `32` live Atlas scan devices
+    - but open findings had collapsed back to `0`
+    - Atlas UI therefore showed an implausibly clean state
+  - verified root cause:
+    - `packages/core/src/routes/health-builders.ts`
+      - `buildProtectionProofResponse()` read Atlas audits/snapshot but did **not** resync findings from those raw sources
+    - `packages/core/src/scheduler.ts`
+      - the generic guard stale auto-resolver treated Atlas-managed findings like ordinary scan findings
+      - newly rematerialized Atlas findings were therefore cleared again almost immediately
+  - code fixed:
+    - `packages/core/src/routes/health-builders.ts`
+      - added `readAtlasSnapshot()`
+      - imported `syncAtlasAuditFindings(...)` and `syncAtlasExposureFindings(...)`, called from a new `syncAtlasOperationalFindings(...)` step
+      - `buildProtectionProofResponse()` now rematerializes Atlas-managed findings from the current raw files before building the proof response
+    - `packages/core/src/scheduler.ts`
+      - introduced `ATLAS_MANAGED_FINDING_SOURCES`
+      - generic stale resolution now skips:
+        - `atlas-coverage-gap`
+        - `atlas-exposure`
+        - `atlas-host-audit`
+      - these sources are now left to their own verification-aware resolution logic
+  - live deployment on Erik:
+    - rebuilt `@magatama/core`
+    - synced:
+      - `/opt/magatama/packages/core/dist/routes/health-builders.js`
+      - `/opt/magatama/packages/core/dist/scheduler.js`
+    - restarted PM2 service:
+      - `magatama`
+  - live verification:
+    - before fix:
+      - Atlas raw files present:
+        - audits: `3`
+        - devices: `32`
+      - DB open findings: `0`
+    - after authenticated `/api/protection-proof` rebuild:
+      - DB open findings: `28`
+      - public `/api/findings?limit=5` now shows real open Atlas findings again
+      - public `/api/protection-proof` now reports:
+        - `knownAssets: 57`
+        - `hostsWithTelemetry: 22`
+        - `assetsWithoutTelemetry: 35`
+        - `auditedHosts: 3`
+        - `queueBlocked: 28`
+        - `switchbladeAssets: 5`
+        - `switchbladeRacks: 1`
+        - `switchbladeNmsNodes: 5`
+  - operational truth now:
+    - Atlas and Findings are no longer silently wiped clean by the generic stale resolver
+    - the remaining open state is again honest:
+      - most current open findings are `atlas-coverage-gap`
+      - they reflect missing live telemetry on known inventory/discovery assets
+  - operator note:
+    - browser cache / old UI state may still temporarily show the earlier empty Atlas
+    - hard refresh is required:
+      - `Cmd + Shift + R`
+  - remaining caveats:
+    - this closes the biggest Atlas truthfulness regression
+    - it does **not** yet solve every backend truth issue
+    - still pending:
+      - lane-specific RunPod artifact adoption / automatic version switch
+      - deeper Atlas policy refinement for which inventory-only assets should stay actionable vs informational
+
 - TIP automated equivalence research / manual queue cleanup completed on 2026-05-09:
   - operator intent:
     - products should be researched well enough that they do not need manual equivalence validation
diff --git a/sync/history/2026-05-09-magatama-atlas-rematerialization-and-stale-resolver-fix.md b/sync/history/2026-05-09-magatama-atlas-rematerialization-and-stale-resolver-fix.md
new file mode 100644
index 0000000..3dfb921
--- /dev/null
+++ b/sync/history/2026-05-09-magatama-atlas-rematerialization-and-stale-resolver-fix.md
@@ -0,0 +1,122 @@
+# MAGATAMA Atlas Rematerialization and Stale Resolver Fix
+
+Date: 2026-05-09
+
+## Problem
+
+MAGATAMA had fallen back into an untrustworthy state:
+
+- Atlas raw sources on Erik still existed and were current:
+  - `security-atlas-audits.json` with `3` audits
+  - `security-atlas-snapshot.json` with `32` devices
+- but open findings in Postgres had collapsed back to `0`
+- Atlas UI therefore looked implausibly empty / clean
+
+The operator requirement was explicit:
+
+- this must not silently happen again
+- MAGATAMA must reflect real protection gaps honestly
+
+## Root Cause
+
+Two independent backend problems combined:
+
+1. `buildProtectionProofResponse()` read Atlas raw files but did not resync findings from them.
+2. Generic stale finding auto-resolution in the scheduler treated Atlas-managed findings like ordinary guard findings and resolved them too aggressively.
+
+## Code Changes
+
+### `packages/core/src/routes/health-builders.ts`
+
+- added `readAtlasSnapshot()`
+- imported `syncAtlasAuditFindings(...)`
+- imported `syncAtlasExposureFindings(...)`
+- introduced `syncAtlasOperationalFindings(...)`
+- `buildProtectionProofResponse()` now calls that helper before building the proof payload
+
+Effect:
+
+- normal proof/Atlas reads now rematerialize current Atlas findings from the raw audit/snapshot files
+
+### `packages/core/src/scheduler.ts`
+
+- added:
+  - `ATLAS_MANAGED_FINDING_SOURCES`
+  - `isAtlasManagedFindingSource(...)`
+- generic stale resolution now skips:
+  - `atlas-coverage-gap`
+  - `atlas-exposure`
+  - `atlas-host-audit`
+
+Effect:
+
+- Atlas-managed findings are no longer erased by the generic guard stale resolver
+- they stay under their own verification-aware lifecycle
+
+## Live Deployment
+
+Deployed to Erik:
+
+- rebuilt `@magatama/core`
+- synced:
+  - `/opt/magatama/packages/core/dist/routes/health-builders.js`
+  - `/opt/magatama/packages/core/dist/scheduler.js`
+- restarted PM2 app:
+  - `magatama`
+
+## Live Verification
+
+### Before
+
+- raw files existed:
+  - audits: `3`
+  - devices: `32`
+- DB open findings: `0`
+
+### After authenticated proof rebuild
+
+- triggered an authenticated local `/api/protection-proof` request on Erik
+- DB open findings rematerialized to: `28`
+
+### Public verification
+
+Public MAGATAMA APIs once again expose the real open state:
+
+- `/api/findings?limit=5`
+  - returns open `atlas-coverage-gap` findings again
+- `/api/protection-proof`
+  - `knownAssets: 57`
+  - `hostsWithTelemetry: 22`
+  - `assetsWithoutTelemetry: 35`
+  - `auditedHosts: 3`
+  - `queueBlocked: 28`
+  - `switchbladeAssets: 5`
+  - `switchbladeRacks: 1`
+  - `switchbladeNmsNodes: 5`
+
+## Operational Truth
+
+The major Atlas truthfulness regression is fixed:
+
+- Atlas and Findings no longer silently collapse to a fake clean state when raw Atlas data still contains real problems
+
+What remains true:
+
+- most currently open Atlas findings are coverage gaps
+- they represent real missing live telemetry on known assets
+
+## Remaining Work
+
+Still not fully closed:
+
+- lane-specific RunPod artifact adoption and automatic version switching
+- further Atlas policy refinement so inventory-only assets can be split more cleanly into:
+  - actionable operational gaps
+  - informational inventory/discovery context
+
+## Operator Note
+
+If the browser still shows the older empty Atlas state after deployment:
+
+- hard refresh:
+  - `Cmd + Shift + R`