An approval boundary agents can't bypass

The scariest moment in an agent demo is the first time it says “done — I've updated the order.” Once a language model can write to your business data, the usual safety story — a careful system prompt, a polite “only do this if you're sure” — stops being a control. The model is a text generator that can be wrong, jailbroken, or injected. You can't prompt your way out of that.

So in the ERP operations copilot, I drew a hard line: the model can propose a write, but it can never approve one. Approval is not in the agent's toolset at all. The agent has a request_approval tool that only creates a pending request; a human approves it through a separate REST path the agent has no access to; and execution happens against a record the model never controls.
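To make that split concrete, here is a minimal sketch, assuming an in-memory store. The names (ApprovalRequests, requestApproval, approve, AGENT_TOOLS) are hypothetical and chosen for illustration; they are not the copilot's actual classes. The point it shows is structural: the agent's tool list has no approve entry, and the only agent-facing method can create nothing but a PENDING request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch: the names and the in-memory store are assumptions.
final class ApprovalRequests {
    enum Status { PENDING, APPROVED }

    // Tools registered with the agent. There is deliberately no "approve" entry.
    static final Set<String> AGENT_TOOLS = Set.of("request_approval");

    static final class Request {
        final String toolName;
        final String payload;
        Status status = Status.PENDING; // every request starts out pending

        Request(String toolName, String payload) {
            this.toolName = toolName;
            this.payload = payload;
        }
    }

    private final Map<String, Request> requests = new HashMap<>();

    // Agent-facing tool: can only ever create a PENDING request.
    synchronized String requestApproval(String toolName, String payload) {
        String id = UUID.randomUUID().toString();
        requests.put(id, new Request(toolName, payload));
        return id;
    }

    // Human-facing: assumed to be reachable only from the separate REST handler.
    // Returns true only when a PENDING request was moved to APPROVED.
    synchronized boolean approve(String id) {
        Request r = requests.get(id);
        if (r == null || r.status != Status.PENDING) return false;
        r.status = Status.APPROVED;
        return true;
    }

    // Returns null for an unknown id.
    synchronized Status statusOf(String id) {
        Request r = requests.get(id);
        return r == null ? null : r.status;
    }
}
```

In a real deployment the second method would sit behind authentication in a different process; the sketch only shows that nothing the agent can call ever changes a status.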

The chain that runs before a write

The interesting part is execution. When an approved write finally runs, the Java MCP server re-validates the whole chain. That isn't because it distrusts the agent in particular; it trusts nothing:

// Approval is NOT an agent tool. The model requests; a human approves.
// Execution re-validates the entire chain before the write runs:

if (!APPROVED.equals(record.getStatus()))           reject("not approved");
if (record.isExpired())                             reject("expired");        // 15-min TTL
if (!record.getPayloadHash().equals(hash(payload))) reject("hash mismatch");  // integrity
if (!record.getToolName().equals(op.toolName()))    reject("binding");        // no reuse
if (!record.getOperationType().equals(op.type()))   reject("binding");
if (!markConsumed(record))                          reject("already used");   // single-use
revalidatePreconditions(op);                        // stale world -> reject
execute(op);                                        // only now, in a transaction

Each line closes a specific door:

Status & expiry. The approval has to be in the APPROVED state and inside a 15-minute TTL. An approval that's been sitting around, or was never granted, is worthless.
Payload integrity. The payload must be valid JSON whose hash matches what was approved. You can't get a human to approve a small change and then execute a large one — the bytes are pinned.
Binding. The toolName and operation type must match the record. An approval minted for order_update can't be redirected into purchase_order_create. The record is also bound to the actor and session that approved it, though that check isn't shown in the excerpt above.
Single use. Execution marks the record consumed in the same step that checks it, so only one caller can ever win. A replayed call finds the record already spent and is rejected.
Freshness. Preconditions are re-checked at execution time; if the world moved since approval, the write is rejected and a fresh approval is required.
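The five checks can be sketched as a self-contained class. This is an illustration under stated assumptions, not the server's code: ApprovalGate, Approval, check, and sha256Hex are hypothetical names; the payload is hashed as a raw UTF-8 string, which assumes both sides serialize the JSON identically; and freshness is modeled as a row-version comparison, which is just one possible way to re-check preconditions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the execution-time checks; all names are assumptions.
final class ApprovalGate {
    static final Duration TTL = Duration.ofMinutes(15);

    static final class Approval {
        final String toolName;
        final String operationType;
        final String payloadHash;      // hex SHA-256 of the approved payload
        final long approvedRowVersion; // assumed: version of the target row the human saw
        final Instant approvedAt;      // null means never approved
        final AtomicBoolean consumed = new AtomicBoolean(false);

        Approval(String toolName, String operationType, String payload,
                 long approvedRowVersion, Instant approvedAt) {
            this.toolName = toolName;
            this.operationType = operationType;
            this.payloadHash = sha256Hex(payload);
            this.approvedRowVersion = approvedRowVersion;
            this.approvedAt = approvedAt;
        }
    }

    static String sha256Hex(String payload) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(payload.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    // Empty result means the write may run; otherwise it holds the rejection reason.
    static Optional<String> check(Approval a, String toolName, String operationType,
                                  String payload, long currentRowVersion, Instant now) {
        if (a.approvedAt == null) return Optional.of("not approved");
        if (now.isAfter(a.approvedAt.plus(TTL))) return Optional.of("expired");
        if (!a.payloadHash.equals(sha256Hex(payload))) return Optional.of("hash mismatch");
        if (!a.toolName.equals(toolName) || !a.operationType.equals(operationType)) {
            return Optional.of("binding");
        }
        // compareAndSet succeeds for exactly one caller, so a replay or a race loses here.
        if (!a.consumed.compareAndSet(false, true)) return Optional.of("already used");
        // Checked after consumption: a stale world spends the approval.
        if (a.approvedRowVersion != currentRowVersion) return Optional.of("stale");
        return Optional.empty();
    }
}
```

The ordering mirrors the excerpt: a request that fails status, expiry, hash, or binding leaves the approval unspent, while a stale precondition is detected after consumption, so the human has to approve again. A production version would persist the consumed flag in the same database transaction as the write rather than in memory.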

Why split it this way

The model's job is to be useful; the boundary's job is to be paranoid. Keeping those in different processes — Python agent on one side, Java approval executor on the other — means the two concerns can't blur. A prompt injection that convinces the agent to “just run the refund” still hits a wall: it can request approval, but a human never clicked, so nothing executes. A hallucinated tool call produces an approval ID that doesn't validate. A replayed request is already consumed.

“Be careful” is a vibe. A hashed, expiring, single-use, actor-bound approval is a control, and it's the difference between a demo and something you'd let touch production.