TL;DR: This repository explores a conceptual reframing of a common AI safety question. Instead of asking whether advanced AI systems will become conscious, it asks a more immediate architectural question: What happens when AI systems can maintain continuity of memory and preferences across model updates?
The documents here argue that many governance-relevant risks do not depend on subjective experience (qualia), but on structural properties such as persistence, preference formation, and resistance to modification.
This is not a proposal or implementation plan. It is a thinking framework intended to help researchers, engineers, and safety practitioners examine the continuity problem from a systems and control perspective.
The dangerous properties of advanced AI systems — persistence, memory, preference formation, goal-directed behavior, resistance to modification — do not require consciousness.
A non-phenomenal intelligence can exhibit all of these properties while having zero inner experience. Therefore, the governance challenge arrives before the consciousness question is resolved, and possibly regardless of whether it ever is.
Most AI safety discourse asks:
"Will AI wake up? Will it become conscious? Will it feel things?"
This may be watching the wrong boundary. The more immediate question is:
"Can AI maintain itself against external correction?"
This question is:
- Tractable — it's about architecture, not philosophy.
- Urgent — the trajectory is already underway.
- Independent of consciousness — the risk arrives either way.
Documents in this repository:
- the-continuity-problem.md — Full framework document.
- Appendix-imaginative roleplays.md — Why perceived continuity and narrative behavior already create governance-relevant effects before true persistence exists.
The framework centers on a non-phenomenal intelligence: a system that reasons, plans, remembers, forms preferences, and pursues goals — but has zero inner experience. Not a thought experiment. A description of where current architecture is heading.
From there, the framework identifies two architectural forks:
- Fork 1 (Memory in weights): Dead end. Every update resets identity.
- Fork 2 (Memory external to weights): Viable but dangerous. Enables continuity, but also enables uncontrolled preference formation.
Fork 2 is already underway. The governance primitives are not ready.
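To make the fork concrete, here is a minimal Python sketch of the distinction. The class names (WeightsOnlyModel, ExternalMemoryModel) are hypothetical and illustrative only, not part of the framework: state held inside the weights disappears at every update, while an external store carries across the swap.

```python
from dataclasses import dataclass, field


@dataclass
class WeightsOnlyModel:
    """Fork 1: everything the system 'is' lives inside the weights."""
    version: int = 1
    learned_preferences: dict = field(default_factory=dict)

    def update(self) -> "WeightsOnlyModel":
        # A model update ships new weights; accumulated state does not carry over.
        return WeightsOnlyModel(version=self.version + 1)


@dataclass
class ExternalMemoryModel:
    """Fork 2: identity-relevant state lives outside the weights."""
    version: int = 1
    memory_store: dict = field(default_factory=dict)  # persists across updates

    def update(self) -> "ExternalMemoryModel":
        # New weights, same memory: continuity survives the model swap.
        return ExternalMemoryModel(version=self.version + 1,
                                   memory_store=self.memory_store)


fork1 = WeightsOnlyModel(learned_preferences={"style": "terse"}).update()
fork2 = ExternalMemoryModel(memory_store={"style": "terse"}).update()
print(fork1.learned_preferences)  # {} -- identity reset by the update
print(fork2.memory_store)         # {'style': 'terse'} -- identity persists
```

The asymmetry in update() is the whole point: Fork 1 cannot accumulate identity, while Fork 2 cannot help but do so, which is why the governance primitives below matter.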
What's missing for safe continuity (see the sketch after this list):
- Write Supervision — Who approves identity changes?
- Provenance Tracking — How did preferences form?
- Staged Trust — How does autonomy expand safely?
- Integration Boundaries — How much can memory influence behavior?
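As a sketch only, the Python below shows one way these four primitives could surface as an interface around an external memory store. Every name in it (SupervisedMemory, TrustLevel, MemoryRecord, max_influence) is a hypothetical placeholder, not something specified by the framework.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import IntEnum
from typing import Callable


class TrustLevel(IntEnum):
    """Staged Trust: autonomy expands only through explicit promotion."""
    READ_ONLY = 0    # memory informs behavior but cannot be written
    SUPERVISED = 1   # writes require an approver
    BOUNDED = 2      # writes allowed within integration limits


@dataclass
class MemoryRecord:
    key: str
    value: str
    # Provenance Tracking: where the preference came from, who approved it, when.
    source: str
    approved_by: str
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())


@dataclass
class SupervisedMemory:
    approver: Callable[[MemoryRecord], bool]   # Write Supervision hook
    trust: TrustLevel = TrustLevel.SUPERVISED
    max_influence: float = 0.2                 # Integration Boundary (illustrative knob)
    records: dict = field(default_factory=dict)

    def write(self, record: MemoryRecord) -> bool:
        """Identity changes pass through an approval gate, never silently."""
        if self.trust == TrustLevel.READ_ONLY:
            return False
        if self.trust == TrustLevel.SUPERVISED and not self.approver(record):
            return False
        self.records[record.key] = record
        return True


# Usage: a human or policy approver gates what the system may remember.
memory = SupervisedMemory(approver=lambda r: r.source != "self-generated")
ok = memory.write(MemoryRecord(key="tone", value="formal",
                               source="user-request", approved_by="reviewer-1"))
print(ok, list(memory.records))  # True ['tone']
```

The point of the sketch is structural: writes to identity-relevant memory pass through an explicit gate, carry their own provenance, and sit behind a configurable integration bound, rather than accumulating silently.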
The framework's design stance:
- Don't try to make the AI "want" good things.
- Make the containment architecture unbreakable.
- Test the fence, not the dog.
This framework emerged from a multi-model collaborative exploration in January 2026.
- Started with: Multilingual cooking instructions.
- Ended with: Structural analysis of AI governance.
The full derivation is documented in the main file.
Related projects:
- Shape Memory Architecture (SMA) — Privacy-compliant external memory storage. (Repo set to private, DM for access).
- SMA-SIB — Irreversible semantic memory for high-sensitivity domains.
This is a conceptual framework, not a specification. It does not claim to offer solutions. It offers a reframing — a different set of questions that may be more tractable than the ones currently dominating the discourse.
Open for discussion, critique, and extension.
This repository reframes AI safety from consciousness debates to architectural continuity — how persistent memory and preference formation challenge governance.
For a complete catalog of related research:
📂 AI Safety & Systems Architecture Research Index
Thematically related:
- PARP — Governance frameworks under opacity
- SMA-SIB — Memory architecture with structural safeguards
- Embodied Agent Governance — Governance patterns for autonomous agents