The accountability gap
AI agents can now write code, manage infrastructure, and execute multi-step workflows autonomously. But there is no standard way to verify what an agent committed to, require evidence that work was actually completed, or compute trust mechanically rather than accepting self-reported status.
An agent can mark a task as done without proving it. A pipeline can report success without evidence. When agents operate across teams and systems, the gap between claimed work and verified work becomes a structural problem, not a minor inconvenience.
The Mentu Protocol
Today, we are releasing the Mentu Protocol under the MIT license. The protocol is a pure specification for accountable agent action. It defines how observations become obligations, how obligations require evidence, and how trust is computed from that evidence chain.
This is not a product or an SDK. It is a protocol, like HTTP. Any agent, any framework, any language can implement it. The entire specification fits in six documents.
How it works
The protocol is built on a single primitive: the epistemic signal. Observations, commitments, evidence submissions, approvals, and annotations all share one type, one schema, one chain.
Nine operations govern the full lifecycle:
- capture records an observation as evidence
- commit transforms evidence into an obligation
- claim takes ownership of an obligation
- annotate adds progress notes during execution
- submit presents completed work for review
- approve closes the review cycle by accepting the submitted work
- reopen closes the review cycle by sending the work back
- release returns ownership when an agent cannot finish
- close marks an obligation as fulfilled
Every signal carries the SHA-256 hash of the previous one, and the first signal links to a placeholder of 64 zeros (the length of a hex-encoded SHA-256 digest). The chain can therefore be verified end to end without consensus mechanisms or blockchain infrastructure. It is tamper-evident by construction.
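To make the hash chain concrete, here is a minimal Python sketch. It is not taken from the specification: the field names `op`, `body`, and `prev`, and the choice to hash a canonical JSON encoding of the whole previous signal, are assumptions made for illustration.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # the first signal links to 64 zeros


def signal_hash(signal: dict) -> str:
    """SHA-256 over a canonical JSON encoding (assumed; the spec may define another)."""
    encoded = json.dumps(signal, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_signal(chain: list[dict], op: str, body: str) -> dict:
    """Append a signal that carries the hash of its predecessor."""
    prev = signal_hash(chain[-1]) if chain else GENESIS_PREV
    signal = {"op": op, "body": body, "prev": prev}  # hypothetical field names
    chain.append(signal)
    return signal


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link from genesis.

    Editing any signal that has a successor breaks the successor's link.
    """
    expected = GENESIS_PREV
    for signal in chain:
        if signal.get("prev") != expected:
            return False
        expected = signal_hash(signal)
    return True
```

In this sketch, changing the body of an early signal after the fact makes `verify_chain` return `False`, because the next signal's `prev` no longer matches the recomputed hash.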
State is never stored in a database. It is always computed by replaying the append-only JSONL ledger from genesis. One file. One truth.
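Replaying the ledger can be pictured as a fold over its lines. The sketch below assumes each JSONL line has an `op` field and a `target` field naming the obligation, and it uses a made-up set of state names. None of these are confirmed by the specification.

```python
import json

# Hypothetical mapping from operation to the obligation state it produces.
STATE_AFTER = {
    "commit": "open",
    "claim": "claimed",
    "submit": "in_review",
    "approve": "approved",
    "reopen": "claimed",
    "release": "open",
    "close": "closed",
}


def replay(ledger_lines):
    """Fold over ledger lines from genesis and return {obligation_id: state}."""
    states = {}
    for line in ledger_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        signal = json.loads(line)
        new_state = STATE_AFTER.get(signal["op"])
        if new_state is not None:  # capture and annotate change no state here
            states[signal["target"]] = new_state
    return states
```

Because `replay` accepts any iterable of lines, an open file handle works as input: `with open("ledger.jsonl") as f: states = replay(f)`. The file name is only an example.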
Mechanical trust
Trust in the protocol is computed, never self-reported. This is the core design principle: an agent cannot assess the quality of its own work.
Seven weighted signals produce a confidence score between 0 and 1:
| Signal | What it measures |
|---|---|
| Exit code | Did the process complete without error? |
| Test pass rate | What fraction of tests passed? |
| Context utilization | How much of the available context was used? |
| Loop completion | Did the agent complete its full workflow? |
| Duration | Was the execution time within expected bounds? |
| Error absence | Were there unhandled errors or warnings? |
| Evidence depth | How much supporting evidence was attached? |
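One way to picture how the seven signals combine is a normalized weighted average. The announcement does not publish the weights, so the equal weights below are placeholders, and the snake_case signal names are illustrative only.

```python
# Placeholder weights: the seven signals are named in the table above,
# but their actual weights are not, so every signal gets 1.0 here.
WEIGHTS = {
    "exit_code": 1.0,
    "test_pass_rate": 1.0,
    "context_utilization": 1.0,
    "loop_completion": 1.0,
    "duration": 1.0,
    "error_absence": 1.0,
    "evidence_depth": 1.0,
}


def confidence(signals: dict[str, float]) -> float:
    """Weighted average of signal values, each clamped to [0, 1].

    A missing signal counts as 0, so absent evidence lowers the score.
    """
    total_weight = sum(WEIGHTS.values())
    score = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, signals.get(name, 0.0)))
        score += weight * value
    return score / total_weight
```

Dividing by the total weight keeps the result between 0 and 1 regardless of which weights are chosen.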
Evidence decays over time, so fresh proof outweighs stale claims. The decay function follows an exponential model with a default half-life of 90 days, which keeps trust tied to current reality rather than historical performance.
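The decay can be sketched as a multiplier applied to a piece of evidence based on its age. This assumes the 90-day default is the half-life itself; the function name and the way the factor is applied are illustrative, not taken from the specification.

```python
DEFAULT_HALF_LIFE_DAYS = 90.0


def decay_factor(age_days: float, half_life_days: float = DEFAULT_HALF_LIFE_DAYS) -> float:
    """Exponential half-life decay: the factor halves every `half_life_days`."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return 0.5 ** (age_days / half_life_days)


def decayed_weight(weight: float, age_days: float) -> float:
    """Scale a piece of evidence's weight by its age (assumed usage)."""
    return weight * decay_factor(age_days)
```

Under these assumptions, evidence counts in full on the day it is recorded, at half weight after 90 days, and at a quarter after 180 days.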
Execution algebra
The protocol defines ten composable primitives for agent orchestration:
Step, Formula, Pipeline, Parallel, Compound, Adversarial, Convergent, Temporal, Sentinel, and Substrate.
These form a closed algebra. Any primitive can embed any other. A step runs identically whether it executes alone or inside a 200-step compound. Adversarial primitives run competing approaches and select the best result. Convergent primitives require independent agents to reach the same conclusion. Sentinel primitives monitor invariants across long-running operations.
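The closure property is easiest to see in code. The sketch below covers only three of the ten primitives and invents a single `run(value)` interface for all of them. The real specification may define the primitives quite differently.

```python
from dataclasses import dataclass
from typing import Any, Callable, Protocol, Sequence


class Primitive(Protocol):
    """Anything with run() is a primitive, so primitives nest freely."""

    def run(self, value: Any) -> Any: ...


@dataclass(frozen=True)
class Step:
    fn: Callable[[Any], Any]

    def run(self, value: Any) -> Any:
        return self.fn(value)


@dataclass(frozen=True)
class Pipeline:
    parts: Sequence[Primitive]

    def run(self, value: Any) -> Any:
        # Feed each part's output into the next part.
        for part in self.parts:
            value = part.run(value)
        return value


@dataclass(frozen=True)
class Parallel:
    branches: Sequence[Primitive]

    def run(self, value: Any) -> list:
        # Evaluated one after another here for simplicity; a real
        # implementation might run the branches concurrently.
        return [branch.run(value) for branch in self.branches]
```

With `double = Step(lambda x: x * 2)` and `inc = Step(lambda x: x + 1)`, the expression `Pipeline([double, Parallel([inc, Pipeline([inc, double])])]).run(3)` returns `[7, 14]`, and `double.run(3)` still returns `6` on its own. The step behaves the same alone and nested.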
This is agent orchestration treated as composable mathematics, not ad-hoc scripting.
The accountability airlock
When an agent submits work, it enters an accountability airlock, a review state that prevents premature closure. The protocol defines three review tiers:
- Automated review for low-risk operations where mechanical signals are sufficient
- Async human review for standard work that benefits from human judgment
- Mandatory human review for high-risk changes that require explicit approval
The tier is determined by the operation's risk level, not the agent's self-assessment. An agent cannot bypass the airlock.
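Tier selection can be sketched as a lookup keyed on risk alone. The risk labels `low`, `standard`, and `high` and the fail-closed default are assumptions for illustration. The source states only that the operation's risk level decides the tier.

```python
from enum import Enum


class ReviewTier(Enum):
    AUTOMATED = "automated"
    ASYNC_HUMAN = "async_human"
    MANDATORY_HUMAN = "mandatory_human"


# Hypothetical risk labels mapped to the three tiers described above.
TIER_FOR_RISK = {
    "low": ReviewTier.AUTOMATED,
    "standard": ReviewTier.ASYNC_HUMAN,
    "high": ReviewTier.MANDATORY_HUMAN,
}


def review_tier(risk_level: str) -> ReviewTier:
    """Pick the tier from the operation's risk level only.

    The function takes no input from the agent, so self-assessment
    cannot influence the result. Unknown levels fall back to the
    strictest tier (an assumption, not a documented rule).
    """
    return TIER_FOR_RISK.get(risk_level, ReviewTier.MANDATORY_HUMAN)
```

Keeping the agent's own opinion out of the function signature is the point of the design: there is no parameter through which an agent could argue its way into a lighter tier.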
Agent-agnostic by design
The protocol works with Claude, GPT, Cursor, Devin, Codex, or any agent that can append to a file. Integration requires dropping a single template into your project's configuration directory. Five CLI commands handle the entire workflow.
No SDK. No runtime dependency. No vendor lock-in.
Open source
The Mentu Protocol v2.0 is released under the MIT license. The repository contains six formal specifications, a drop-in agent template, a sample ledger with real SHA-256 hashes, and a step-by-step workflow example.
Observations become obligations. Obligations require evidence. Trust is computed, never self-reported.