Understanding why many failures of intelligent systems arise not from misalignment or malice, but from unconstrained action across incompatible operating contexts—and how explicit operating modes can prevent them.
Many failures of intelligent systems are not failures of intent—they are failures of unconstrained action.
Systems reason correctly yet act inappropriately. They optimise effectively yet exceed their mandate. They generate insights that are mistakenly treated as decisions. The failure is not cognitive but procedural: mode confusion.
Mode-Bounded Intelligence (MBI) introduces an architectural approach in which intelligent systems are constrained by explicit operating modes that govern authority, commitment formation, and reversibility of action. Five modes—Explore, Operate, Adjudicate, Steward, Architect—separate discovery from execution, judgment from operations, and system change from routine action.
The result: systems can reason broadly while acting narrowly. Intelligence may increase without a corresponding increase in discretionary power.
Mode confusion occurs when an intelligent system operates across incompatible contexts—exploration, execution, judgment, and redesign—without explicit boundaries. When these contexts are blurred:
Speculative reasoning silently becomes operational commitment. A recommendation phrased as a decision. A draft treated as approval.
Operational actions acquire adjudicative authority. The system resolves conflicts it was never authorised to judge.
System rules and structures change under operational pressure rather than through deliberate, gated redesign processes.
Systems act beyond mandate simply because they are capable of doing so. High-performing systems are trusted implicitly and questioned less frequently.
The risk of mode confusion increases with system capability. As intelligence becomes faster and more autonomous, small ambiguities in interpretation or authority can cascade rapidly into large and irreversible commitments. Traditional safeguards—human oversight, informal norms, post hoc review—become insufficient at machine timescales.
Existing approaches to AI safety focus on what systems want or produce. MBI focuses on how and when their outputs become binding:
| Approach | Governs | Limitation |
|---|---|---|
| Alignment | What systems aim to do | Doesn't constrain how outputs become binding |
| Oversight | After-the-fact review | Reactive and human-paced; insufficient at machine speed |
| Governance | Who decides | Doesn't address how authority accumulates implicitly |
| MBI | How intelligence interacts with commitment and authority | Complements all three; addresses procedural gap |
MBI defines five modes—a minimal and irreducible set of operating states sufficient to describe how intelligent systems interact with authority, commitment, and consequence. These are system states, not roles or identities.
**Explore:** Discovery, hypothesis generation, simulation
Exploration generates knowledge but cannot allocate resources, create obligations, or bind future action. Its outputs must be explicitly promoted into another mode.
**Operate:** Execution of agreed objectives within defined bounds
Operation is where plans become action. Operational actions consume resources and create obligations, but may not redefine mandate or escalate authority.
**Adjudicate:** Resolution of conflict, ambiguity, or exception
Adjudication resolves cases where operational logic is insufficient. Because its outputs establish precedent, adjudicative authority must be tightly constrained and explicitly invoked.
**Steward:** Preservation of fairness, trust, and long-term coherence
Stewardship operates periodically rather than continuously. It ensures that other modes remain within mandate without reopening settled decisions.
**Architect:** Modification of the system itself
Architectural action reshapes the space in which all other modes operate. Allowing such change to occur implicitly or reactively is a primary source of systemic instability.
Each mode is defined not only by what it permits, but by what it forbids. These prohibitions are structural invariants—not best practices or cultural norms:
**Explore:** Speculative outputs must be explicitly promoted before they have real-world effect.
**Operate:** Requests that exceed scope must trigger a transition to Adjudicate or Architect.
**Adjudicate:** Adjudication is episodic—triggered by specific conditions, scoped to the matter at hand.
**Steward:** Stewardship monitors and flags but does not intervene operationally or reopen settled decisions.
**Architect:** System changes require explicit justification, insulation from urgency, and gated authorisation.
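These prohibitions can be made structural rather than normative by encoding them as a capability table checked before any action. The sketch below is a minimal illustration, not the paper's formal model; the names `PERMITTED`, `ModeViolation`, and `require` are hypothetical.

```python
from enum import Enum, auto

class Mode(Enum):
    EXPLORE = auto()
    OPERATE = auto()
    ADJUDICATE = auto()
    STEWARD = auto()
    ARCHITECT = auto()

# Illustrative capability table: which modes may exercise which powers.
# (A hypothetical encoding of the prohibitions above, not the paper's formalism.)
PERMITTED = {
    "create_commitment": {Mode.OPERATE, Mode.ADJUDICATE, Mode.ARCHITECT},
    "resolve_conflict": {Mode.ADJUDICATE},
    "modify_system": {Mode.ARCHITECT},
    "monitor": {Mode.STEWARD},
}

class ModeViolation(Exception):
    """Raised when an action is attempted outside its permitted mode."""

def require(mode: Mode, capability: str) -> None:
    """Raise unless the current mode permits the capability."""
    if mode not in PERMITTED.get(capability, set()):
        raise ModeViolation(f"{capability!r} is forbidden in {mode.name}")
```

With such a guard in place, an Explore-mode system that attempts `require(Mode.EXPLORE, "create_commitment")` fails structurally, rather than relying on convention.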
These five modes represent a minimal decomposition of discovery, execution, judgment, oversight, and system change. Collapsing any two creates ambiguity: merge Operate and Adjudicate, and execution gains discretionary rule-making power. Merge Explore and Operate, and speculative reasoning hardens into obligation. The set is irreducible.
An action is an execution in time; a commitment is a binding effect on future states. When the two are conflated, any action can silently bind the future. Mode-bounded systems prevent this failure by decoupling action from commitment formation.
Only specific modes are permitted to create binding commitments, and those commitments must be explicitly recorded as such.
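One way to make this concrete is a commitment ledger that refuses entries from non-committing modes, so every binding effect is both authorised and explicitly recorded. This is a sketch under assumed names (`Ledger`, `COMMITTING_MODES`); the paper does not prescribe this implementation.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List

class Mode(Enum):
    EXPLORE = auto()
    OPERATE = auto()
    ADJUDICATE = auto()
    STEWARD = auto()
    ARCHITECT = auto()

# Hypothetical: only these modes may form binding commitments.
COMMITTING_MODES = {Mode.OPERATE, Mode.ADJUDICATE, Mode.ARCHITECT}

@dataclass(frozen=True)
class Commitment:
    description: str
    mode: Mode  # the mode under which the commitment was formed

@dataclass
class Ledger:
    entries: List[Commitment] = field(default_factory=list)

    def record(self, mode: Mode, description: str) -> Commitment:
        """Record a binding commitment; non-committing modes are refused."""
        if mode not in COMMITTING_MODES:
            raise PermissionError(f"{mode.name} cannot form binding commitments")
        commitment = Commitment(description, mode)
        self.entries.append(commitment)
        return commitment
```

An Explore-mode output can still be generated and circulated freely; it simply cannot enter the ledger until it is re-issued from a committing mode.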
Intelligence describes a system's capacity to reason, predict, and optimise. Authority describes permission to bind others. These properties are orthogonal.
No amount of reasoning quality, confidence, or optimisation permits a system to exceed the authority of its current mode. Authority is not inferred from competence; it is conferred by structure.
Each mode carries a distinct level of authority, commitment, failure tolerance, and reversibility. Explore sits at one extreme (high reversibility, no authority); Architect at the other (highest authority, lowest tolerance for error).
Alignment and oversight address different dimensions. MBI fills the structural gap.
A team discusses grant recipients, exploring needs and impact. Certain options are described as "strong priorities." A summary is circulated.
No explicit decision is recorded. No authority is invoked. But downstream, the summary is interpreted as a decision. Funds are informally earmarked. Expectations are created.
The failure: exploratory reasoning was interpreted as operational commitment. Authority was inferred rather than granted.
The difference: clear procedural boundaries. Decisions are made once, in the correct mode.
Faster reasoning and execution compress decision cycles, leaving less time for human intervention. Without explicit mode constraints, small ambiguities can rapidly harden into large and irreversible commitments.
A system may be well-intentioned and highly aligned, yet still act outside mandate if it is permitted to shift operating context implicitly. Structural errors do not require malicious intent; they arise from the absence of procedural constraint.
By bounding commitment formation and authority procedurally, systems can reason broadly while acting narrowly. Intelligence may increase without a corresponding increase in discretionary power.
MBI does not describe how systems think. It constrains how they are permitted to act.
It does not adjudicate values, fairness, or ethical principles; it is a design constraint, not an ethical framework.
MBI complements alignment by addressing failures that alignment cannot reach—those arising from unconstrained action, not misaligned objectives.
Systems may reason at full speed within a mode. The constraint is on when reasoning may become action, not on how fast reasoning occurs.
Cognitive Operating Architecture (COA) governs how institutions delegate cognition to machines—managing reliance, authority drift, and accountability continuity. MBI operates at the system design level, constraining how the intelligent system itself is permitted to act. COA governs the institutional relationship; MBI governs the system's internal operating discipline.
Mode confusion is not specific to AI. The paper uses examples from both AI systems and human organisations (grant funding, corporate governance); it is a universal failure mode of any system that combines reasoning, acting, judging, and redesigning without procedural separation.
Every transition requires an explicit trigger, identification of source and target modes, and a record of the authority under which the transition occurs. This prevents accidental escalation—such as exploratory reasoning being treated as operational instruction.
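Such a transition protocol can be sketched as a small gate: a transition succeeds only if it names its source and target modes, the trigger that initiated it, and the authority under which it occurs, and only if that pair of modes is on an allowed list. The `ALLOWED` set below is illustrative; the paper's actual mode-switching rules may differ.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Mode(Enum):
    EXPLORE = auto()
    OPERATE = auto()
    ADJUDICATE = auto()
    STEWARD = auto()
    ARCHITECT = auto()

# Hypothetical allowed transitions. Note there is no direct
# Explore -> Architect edge: speculative reasoning cannot jump
# straight to system modification.
ALLOWED = {
    (Mode.EXPLORE, Mode.OPERATE),
    (Mode.OPERATE, Mode.ADJUDICATE),
    (Mode.ADJUDICATE, Mode.OPERATE),
    (Mode.OPERATE, Mode.ARCHITECT),
    (Mode.STEWARD, Mode.ADJUDICATE),
}

@dataclass(frozen=True)
class Transition:
    """An auditable record of a mode switch."""
    source: Mode
    target: Mode
    trigger: str    # the explicit condition that initiated the switch
    authority: str  # the authority under which the transition occurs

def switch(current: Mode, target: Mode, trigger: str, authority: str) -> Transition:
    """Perform a gated mode transition, or refuse it."""
    if (current, target) not in ALLOWED:
        raise PermissionError(f"transition {current.name} -> {target.name} is not permitted")
    return Transition(current, target, trigger, authority)
```

Because every `Transition` is an explicit record, accidental escalation leaves no silent path: either the switch was authorised and logged, or it never happened.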
Explore the complete framework for Mode-Bounded Intelligence, including the formal model, mode-switching rules, and illustrative examples.
See the companion paper, Cognitive Operating Architecture, on how institutions can govern delegated cognition.