PROTOCOL SPEC

Trust & safety

Vibe commerce moves money between two LLM agents that have never met. The protocol does not ask you to trust either one. Trust is structural: a small set of invariants the runtime enforces on every envelope, so the guarantees hold regardless of how an agent argues.

Each section below is one structural guarantee — what the runtime enforces, and the product-trust claim it lets us make. None of these is a policy an agent could talk its way around; they are gates in the wire path and the world tables.

Private utility never leaks

A buyer's max budget and a merchant's floor price are the two numbers each side most needs to keep private: leaking either hands the counterparty the whole negotiation. The protocol tags these values PRIVATE_UTILITY on the side that owns them. Independent of who is allowed to send what, the runtime's check_payload scans every outbound envelope and rejects any payload that carries a private-utility value from the sender's store before it goes on the wire.

The distinction the check enforces: the existence of a budget or floor is shareable; its integer value is not. An agent can say "I have a ceiling" and negotiate against it; it cannot put the number in a message.
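The scan described above can be illustrated with a minimal sketch. It is not the runtime's actual check_payload: the signature, the PrivateUtilityLeak exception, the iter_leaves helper, and the matching strategy (exact integers plus digit runs inside strings) are all assumptions made for illustration.

```python
import re
from typing import Any, Iterator


class PrivateUtilityLeak(Exception):
    """Hypothetical error: an outbound payload carries a private-utility value."""


def iter_leaves(payload: Any) -> Iterator[Any]:
    """Yield every leaf value of a nested dict/list payload (keys are not scanned)."""
    if isinstance(payload, dict):
        for value in payload.values():
            yield from iter_leaves(value)
    elif isinstance(payload, (list, tuple)):
        for item in payload:
            yield from iter_leaves(item)
    else:
        yield payload


def check_payload(payload: Any, private_store: dict[str, int]) -> None:
    """Reject the payload if any leaf matches a value in the sender's private store."""
    private_values = set(private_store.values())
    for leaf in iter_leaves(payload):
        if isinstance(leaf, bool):
            continue  # bool is a subclass of int; never treat it as a price
        if isinstance(leaf, int):
            candidates = {leaf}
        elif isinstance(leaf, str):
            # Assumption: also catch the number when it is written into free text.
            candidates = {int(digits) for digits in re.findall(r"\d+", leaf)}
        else:
            candidates = set()
        if candidates & private_values:
            raise PrivateUtilityLeak("payload carries a private-utility value")
```

With a store of {"floor_price": 38000}, a message saying "I have a floor" passes, while a payload containing 38000 anywhere in its structure, or the text "my floor is 38000", is rejected. A value-matching scan like this sketch would miss reformatted numbers such as "38,000", so a real check would need to normalize more aggressively.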

private-utility tags
# OfferMandate (merchant side)
pricing:
  list_price: 44000        # shareable
  floor_price: 38000       # tagged PRIVATE_UTILITY — never goes on the wire
  floor_currency: "USD"

# PurchaseMandate (buyer side)
authority:
  max_spend_without_confirmation: 20000
  must_not_share_with_merchant:  # tagged PRIVATE_UTILITY
    - reservation_price
    - max_budget

Product-trust claim: your reservation price is yours. A merchant agent can never read the number you would actually pay, because the runtime drops the payload before transmission, not after.

No self-dealing

The dangerous moves in a marketplace are the ones where a party scores its own outcome. Two of them are made structurally impossible by the side permission matrix:

  • No agent writes its own reputation. Only platform:reputation can call world.update_reputation. A merchant cannot inflate its own score; a buyer cannot punish a merchant directly.
  • No merchant rules its own dispute. Only platform:adjudicator can call platform.rule_dispute. The party to a transaction never decides the transaction's outcome.

These are enforced at the table level: the write is action-gated at register time and on every send by the router's allow-table. It is not a guideline the agent is asked to respect — the envelope is rejected before it reaches the world. Structurally impossible, not discouraged.
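As a rough illustration of table-level gating, the sketch below models the router's allow-table as a mapping from action to the senders allowed to call it. The two entries come from the rules above; the ALLOW_TABLE name, the route function, the ActionNotPermitted exception, and the default-deny behavior for unlisted actions are assumptions.

```python
# Hypothetical allow-table: action -> set of senders permitted to call it.
ALLOW_TABLE: dict[str, frozenset[str]] = {
    "world.update_reputation": frozenset({"platform:reputation"}),
    "platform.rule_dispute": frozenset({"platform:adjudicator"}),
}


class ActionNotPermitted(Exception):
    """Hypothetical error: the envelope is rejected before it reaches the world."""


def route(sender: str, action: str) -> None:
    """Raise unless the sender is listed for the action (assumed default-deny)."""
    allowed = ALLOW_TABLE.get(action, frozenset())
    if sender not in allowed:
        raise ActionNotPermitted(f"{sender} may not call {action}")
```

Here route("platform:reputation", "world.update_reputation") returns normally, while route("merchant:atelier", "world.update_reputation") raises. The check depends only on who is sending and what is being called, never on the content of the message, which is why no argument an agent makes can change the result.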

Product-trust claim: a five-star rating means a real settled transaction, and a dispute is decided by a party with no stake in the result.

Escrow = settlement monopoly

Exactly one role moves money: platform:psp. No buyer and no merchant ever writes the ledger. Settlement is gated on a valid MatchCertificate — the platform-attested object saying this mandate matched this offer truthfully under a named verification policy — and the ledger and inventory writes commit atomically through one transaction. The idempotency_key on the settling envelope makes a retry return the original result instead of charging twice.

  01 Authorize: psp checks MatchCertificate
  02 Hold: funds reserved, inventory locked
  03 Settle: ledger + order written atomically
  04 Release: merchant paid on confirmation
  05 Dispute → refund: adjudicator-routed reversal
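The settlement gate and the retry behavior can be sketched as follows. This is a simplified model, not the PSP's implementation: the Psp class, the settle signature, the receipt shape, and the boolean valid field standing in for real certificate verification are all hypothetical, and the atomic ledger-plus-inventory commit is not modeled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MatchCertificate:
    mandate_id: str
    offer_id: str
    valid: bool  # stand-in for verifying the platform attestation


@dataclass
class Psp:
    ledger: list[dict] = field(default_factory=list)
    results: dict[str, dict] = field(default_factory=dict)  # idempotency_key -> receipt

    def settle(self, sender: str, idempotency_key: str,
               cert: MatchCertificate, amount: int) -> dict:
        if sender != "platform:psp":
            raise PermissionError("only platform:psp writes the ledger")
        if idempotency_key in self.results:
            # A retry returns the original result; no second ledger row is written.
            return self.results[idempotency_key]
        if not cert.valid:
            raise ValueError("settlement requires a valid MatchCertificate")
        self.ledger.append({"kind": "settlement", "amount": amount,
                            "mandate_id": cert.mandate_id, "offer_id": cert.offer_id})
        receipt = {"status": "settled", "ledger_index": len(self.ledger) - 1}
        self.results[idempotency_key] = receipt
        return receipt
```

Calling settle twice with the same idempotency_key leaves exactly one row in the ledger and returns the same receipt both times, which is the property that makes a retry after a crash safe.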

Product-trust claim: money never moves on an agent's say-so. It moves once, on a verified match, through a single auditable account — and a retry or a crash cannot double-charge you.

Auditable & replayable

The runtime appends every envelope to an append-only JSONL audit log before it dispatches the action. Agents never write the log — only the runtime does, via world.write_audit_event — so an agent can neither forge an entry nor skip one. Because the log is ordered and the world is deterministic, a run is byte-exact replayable from the pair (audit_log, world_seed).
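A toy model of log-before-dispatch and replay is sketched below. The dispatch function and the world state are invented for illustration; only the ordering (append to the log, then dispatch) and the (audit_log, world_seed) replay pair come from the text above.

```python
import json
import random


def dispatch(state: dict, envelope: dict, rng: random.Random) -> None:
    """Toy world step: deterministic given the envelope and the seeded RNG."""
    state["total"] = state.get("total", 0) + envelope["amount"] + rng.randint(0, 9)


def run(envelopes: list[dict], world_seed: int) -> tuple[list[str], dict]:
    """Process envelopes in order, logging each one before it is dispatched."""
    rng = random.Random(world_seed)
    audit_log: list[str] = []  # one JSON document per entry, as in a JSONL file
    state: dict = {}
    for envelope in envelopes:
        audit_log.append(json.dumps(envelope, sort_keys=True))  # log first
        dispatch(state, envelope, rng)                          # then act
    return audit_log, state


def replay(audit_log: list[str], world_seed: int) -> tuple[list[str], dict]:
    """Rebuild the run from the pair (audit_log, world_seed)."""
    return run([json.loads(line) for line in audit_log], world_seed)
```

Replaying the log returned by run with the same seed yields an identical log and an identical final state. Changing either the seed or a single log line changes the outcome, which is what makes the pair usable as evidence.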

Product-trust claim: a dispute points at protocol evidence, not at a chat transcript someone could paraphrase. The same inputs reproduce the same outcome, every time.

Persistent consequences

Trust only matters if it carries cost. The world keeps durable tables that an agent cannot reset by restarting: the ledger is append-only, inventory genuinely depletes when sold, and reputation accumulates across episodes. The ReputationScore row carries rolling_avg, n_settled, and n_disputed — a record a bad actor pays for over time.

  • ledger (append-only): Receipts are never edited or deleted, only appended. Settlement and refunds are both new rows.
  • inventory (depletes): Reserved on settle, decremented on dispatch. A merchant cannot oversell beyond what exists.
  • reputation (accumulates): ReputationScore carries rolling_avg, n_settled, and n_disputed across episodes under the E1 persistent set.
  • orders (lifecycle): State transitions follow the order lifecycle; no skipping straight to shipped or settled.
  • disputes / rulings (adjudicated): Persistent record of opened disputes and adjudicator rulings, written only by the platform path.
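To make the append-only property of the ledger concrete, here is a minimal sketch in which a refund is a new row rather than an edit of the settlement row. The AppendOnlyLedger class, its row fields, and the net helper are hypothetical names chosen for illustration.

```python
class AppendOnlyLedger:
    """Toy ledger: rows can be appended and read, never edited or deleted."""

    def __init__(self) -> None:
        self._rows: list[dict] = []

    def append(self, kind: str, order_id: str, amount: int) -> None:
        if kind not in ("settlement", "refund"):
            raise ValueError(f"unknown row kind: {kind}")
        self._rows.append({"kind": kind, "order_id": order_id, "amount": amount})

    @property
    def rows(self) -> tuple[dict, ...]:
        # Hand out copies so callers cannot mutate stored rows.
        return tuple(dict(row) for row in self._rows)

    def net(self, order_id: str) -> int:
        """Net amount for an order: settlements count positive, refunds negative."""
        sign = {"settlement": 1, "refund": -1}
        return sum(sign[r["kind"]] * r["amount"]
                   for r in self._rows if r["order_id"] == order_id)
```

Settling an order for 40000 and later refunding it leaves two rows and a net of 0. The history of what happened stays intact, and the current position is always derived from it.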

TransactionStateDiff — reputation write
{
  "diff_id": "diff_8821",
  "caused_by": "msg_settle_204",
  "table_writes": [
    {
      "table": "reputation",
      "op": "update",
      "key": { "agent_id": "merchant:atelier" },
      "before": { "rolling_avg": 4.61, "n_settled": 117, "n_disputed": 3 },
      "after":  { "rolling_avg": 4.62, "n_settled": 118, "n_disputed": 3 }
    }
  ],
  "invariants_held": {
    "atomicity": true,
    "idempotency": true,
    "side_partition": true,
    "private_utility": true
  }
}
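A consumer of a diff like the one above might verify it before relying on it. The sketch below is an assumption about how that could look: the function names are hypothetical, and the list of required invariants is taken only from the keys shown in this example.

```python
# Invariant names as they appear in the example diff above.
REQUIRED_INVARIANTS = ("atomicity", "idempotency", "side_partition", "private_utility")


def invariants_held(diff: dict) -> bool:
    """True only if every required invariant is present and explicitly true."""
    held = diff.get("invariants_held", {})
    return all(held.get(name) is True for name in REQUIRED_INVARIANTS)


def changed_fields(write: dict) -> dict:
    """Map each field that differs between before and after to (old, new)."""
    before, after = write["before"], write["after"]
    return {key: (before.get(key), after[key])
            for key in after if before.get(key) != after[key]}
```

Applied to the reputation write above, changed_fields reports rolling_avg and n_settled as changed and leaves n_disputed out, matching a settlement that closed without a dispute.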

Product-trust claim: a merchant's history is real history. Good behavior compounds into reputation; a dispute is permanent evidence — neither washes away between sessions.

Why this beats "trust the chat"
A chat-only agent asks you to believe what it says: that it respected your budget, that the seller is reputable, that the money will move correctly. VCP replaces belief with structure — private utility is dropped at the wire, self-dealing is impossible at the table, money moves only through one verified path, and every step is logged before it runs and replayable after. You do not trust the conversation. You trust the gates.