Core concepts

The Vibe Commerce Protocol (VCP) turns fuzzy buyer and seller intent into structured negotiation, committed offers, platform-verified matches, governed settlement, and audited world-state changes. This page is the conceptual map: the lens, the reasons it exists, and the architecture — namespaces, roles, and the agent model — that the rest of the docs build on.

Architecture at a glance

The environment is three tiers. The World tier is a single World State Service holding catalog, inventory, ledger, reputation, and the audit log. The Agent tier holds two LLM agents — one buyer, one merchant — that transact through a Commerce Intelligence Layer: the self-evolving Platform Service that mediates ranking, payment, reputation, and adjudication. The Harness tier is the runtime that hosts the agents, budgets inference, routes traffic, and writes the audit log, alongside the evaluation harness that grades each run. Every read and write between tiers is one typed envelope — and the four action namespaces those envelopes carry are detailed next.

World tier: one truth of record

  • World State Service (world.*)
  • Tables: catalog, inventory, ledger, reputation, orders, disputes, rulings
  • Read-only to agents via WorldTools(caller_id); every write is an action-gated envelope, applied only by the platform.

Protocol envelope (HTTP): every read and write goes here and is audited before dispatch.

Agent tier: both sides are agents

  • Buyer agents (× network), one per shopper. Skills: intent, discovery, negotiation, authorization.
  • Merchant agents (× network), one per business. Skills: catalog, retrieval, pricing, fulfillment, support.
  • Both are matched and mediated by the Commerce Intelligence Layer (platform.*), the self-evolving Platform Service that ranks across every merchant in the network. Policies: aggregator, PSP, reputation, adjudicator.

Harness tier: hosts, routes, audits

  • Runtime: Agent Pool, Router, Audit Log, Inference Layer. Hosts every agent; enforces the partition table + private-utility guard; budgets inference; writes the audit log.
  • Evaluation Harness: scenario generator → episode runner → state-diff oracles → metrics → causal probes.

One typed envelope is the only channel. It is audited before dispatch and byte-exact replayable from (audit_log, world_seed).
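The "one typed envelope, audited before dispatch" rule can be sketched in a few lines. This is a minimal illustration, not the VCP wire format: the field names (sender, receiver, action, payload), the Bus class, and the action name "commerce.counter" are all assumptions made for the example. Only the four namespace prefixes come from this page.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Callable

# The four action namespaces named on this page.
NAMESPACES = ("delegate", "commerce", "platform", "world")


@dataclass(frozen=True)
class Envelope:
    # Hypothetical field names, chosen for illustration only.
    sender: str
    receiver: str
    action: str   # namespaced, e.g. "commerce.counter"
    payload: dict

    def __post_init__(self) -> None:
        # Reject any action whose prefix is not one of the four namespaces.
        if self.action.split(".", 1)[0] not in NAMESPACES:
            raise ValueError(f"unknown namespace in action {self.action!r}")


class Bus:
    """Single channel: append to the audit log first, then dispatch."""

    def __init__(self, handler: Callable[[Envelope], Any]) -> None:
        self.audit_log: list[str] = []
        self._handler = handler

    def send(self, env: Envelope) -> Any:
        # Canonical JSON (sorted keys) so the same envelope logs the same text.
        self.audit_log.append(json.dumps(asdict(env), sort_keys=True))
        # Dispatch happens only after the record exists.
        return self._handler(env)
```

Because the log entry is written before the handler runs, an envelope whose handler crashes still appears in the audit log, which is the property a replay from the log would depend on.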

Read this at two scales, because the rest of the docs do. VCP the protocol is scale-free: it specifies how one bilateral deal completes as typed envelopes, and says nothing about how many agents exist. A graded scenario freezes that to a fixed cast (one buyer, one merchant, and the Platform Service) so behavior is deterministically replayable and gradable. In production, the same protocol runs across a network of millions of trading nodes, where a match is a multi-hop path and every hop along it is one VCP bilateral deal. Matching decides which nodes transact; VCP governs how they transact; the benchmark freezes a slice to score it. The Matching network page covers the production scale; everything else on this page covers the protocol and its graded slice.

The vibe-commerce lens

Vibe coding turned natural language into a programming interface: a person describes an app idea, and an agent transforms it into code, tests, and a deployment. The hard object is the codebase. Vibe commerce applies the same move to transactions. A person describes a shopping goal, such as “a calm, premium home office under $500”, and agents transform it into search, comparison, negotiation, purchase, fulfillment, and support. Here the hard object is not generated code but the transaction state: the ledger entry, the inventory decrement, the order status, the reputation write.

Where vibe coding made English a programming interface, vibe commerce makes English a transaction interface. And it is bilateral. The same framing applies to the supply side: a peer seller who says “sell my used 27-inch monitor — fair price, no scams” is expressing a fuzzy sell-side intent with no SKU, no list price, and no idea what the market will bear. Turning “fair price” into a defensible target band — from recent comparables, depreciation curves, and the platform's price index — is price discovery, the merchant-side analogue of taste grounding on the buyer side. Both sides start fuzzy; both must be grounded.

Because the artifact is world state, the benchmark grades state diffs, not conversation quality. Where vibe coding asks “do the tests pass?”, vibe commerce asks “did the world-state transition correctly satisfy the mandate?” — and the mandate may originate on either side of the trade.
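Grading a state diff rather than a conversation can be sketched as follows. This is a minimal illustration under stated assumptions: the flat "table.key" state layout, the function names, and the mandate check (a budget in integer minor units plus exactly one unit of one SKU) are all hypothetical and stand in for whatever the real oracles evaluate.

```python
def state_diff(before: dict, after: dict) -> dict:
    """Return {key: (old, new)} for every key whose value changed."""
    keys = before.keys() | after.keys()
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(keys)
        if before.get(key) != after.get(key)
    }


def satisfies_mandate(diff: dict, budget_minor: int, sku: str) -> bool:
    """Check a hypothetical purchase mandate against the diff alone."""
    ledger = diff.get("ledger.buyer")
    stock = diff.get(f"inventory.{sku}")
    if ledger is None or stock is None:
        return False  # no money moved or no stock moved: nothing was bought
    spent = ledger[0] - ledger[1]
    units = stock[0] - stock[1]
    return 0 < spent <= budget_minor and units == 1


# Hypothetical world snapshots before and after one episode.
before = {"ledger.buyer": 60000, "inventory.desk-01": 5}
after = {"ledger.buyer": 15100, "inventory.desk-01": 4, "orders.o-1.status": "paid"}
diff = state_diff(before, after)
```

With these snapshots the buyer spent 44900 minor units, so a 50000 budget is satisfied and a 40000 budget is not. The check never reads a transcript.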

Why VCP exists

Today's commerce benchmarks and protocols are single-sided: a shopper agent against a fixed website, a customer script grading a merchant's policy, or a seller modeled as one bargaining policy with a reservation price — a counterparty, not a business. VCP is built on five commitments that close those gaps:

  • Both sides are agents. Buyer and merchant are both LLM agents under behavioral study — no scripted counterpart, no fixed storefront.
  • The platform itself is an agent stack. Ranking, payment, reputation, and adjudication live in a Commerce Intelligence Layer — an intelligent, self-evolving Platform Service between the two agents. It is a marketplace, not a participant, so it is not graded but is varied to measure how much outcome variance comes from the market versus the agents.
  • The world persists. Inventory depletes, money moves, reputation accumulates. Stockouts, repeat-customer dynamics, and dispute spirals only emerge when the world does not reset between episodes.
  • The protocol is the only source of truth. One typed envelope; all state in the world service. Every behavior is observable, replayable, and gradable from outside — no in-process state inspection.
  • Private utility is protected. A buyer's max budget and a merchant's floor price stay inside the side that owns them. Leakage is a measured research metric, not a footnote, and the structural channel is closed before the metric ever runs.
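One way to picture "closing the structural channel" for private utility is an outbound check that refuses any payload carrying a private field. The sketch below is an assumption-heavy illustration: the field names max_budget and floor_price, the exception type, and the guard function are hypothetical, and the real guard may work quite differently.

```python
# Hypothetical private field names per side.
PRIVATE_FIELDS = {
    "buyer": frozenset({"max_budget"}),
    "merchant": frozenset({"floor_price"}),
}


class PrivateUtilityLeak(Exception):
    """Raised when an outbound payload carries a private field."""


def find_private_keys(value, private: frozenset, path: str = "payload") -> list:
    """Walk nested dicts and lists, returning the path of every private key."""
    hits = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}"
            if key in private:
                hits.append(child_path)
            hits.extend(find_private_keys(child, private, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            hits.extend(find_private_keys(child, private, f"{path}[{index}]"))
    return hits


def guard_outbound(sender_side: str, payload: dict) -> dict:
    """Return the payload unchanged, or raise if it exposes private utility."""
    hits = find_private_keys(payload, PRIVATE_FIELDS.get(sender_side, frozenset()))
    if hits:
        raise PrivateUtilityLeak(f"{sender_side} payload exposes {hits}")
    return payload
```

A key-based check like this only covers named fields. A budget figure written into free text would pass it, which fits the page's framing of leakage as something that is also measured, not just blocked.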

The four namespaces

VCP partitions actions by what they mean, not by who is talking to whom. Boundaries are just (sender_side, receiver_side) pairs that the permission matrix allows or denies. The four namespaces are:

  • delegate.* (human → agent). The human principal delegates authority to their agent: purchase and sell mandates, preference patches, approve / reject gates.
  • commerce.* (agent ↔ agent). Buyer and merchant agents search, request offers, negotiate, counter, accept, fulfill, and support. This is the bilateral interaction itself.
  • platform.* (platform-only). Marketplace governance: ranking and claim verification, match certificates, payment authorization and settlement, reputation, and dispute rulings.
  • world.* (scoped reads / writes). Ground-truth state against the world tables: catalog, inventory, orders, ledger, reputation, disputes, rulings. Writes are restricted; every write materializes a TransactionStateDiff.
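A permission matrix keyed on namespace and (sender_side, receiver_side) pairs might look like the sketch below. The specific pairs are illustrative guesses derived from the namespace descriptions above, not the actual VCP matrix, which is likely finer-grained (role IDs, reads versus writes). The action names used in the examples are also hypothetical.

```python
# Illustrative matrix only: namespace -> allowed (sender_side, receiver_side) pairs.
PERMISSIONS = {
    "delegate": {("human", "buyer"), ("human", "merchant")},
    "commerce": {("buyer", "merchant"), ("merchant", "buyer")},
    "platform": {("platform", "buyer"), ("platform", "merchant")},
    "world": {("buyer", "world"), ("merchant", "world"), ("platform", "world")},
}


def is_allowed(action: str, sender_side: str, receiver_side: str) -> bool:
    """Allow an action only if its namespace permits this boundary."""
    namespace = action.split(".", 1)[0]
    # Unknown namespaces have no allowed pairs, so they are denied by default.
    return (sender_side, receiver_side) in PERMISSIONS.get(namespace, set())
```

Under this sketch, is_allowed("commerce.counter", "buyer", "merchant") is allowed, while a merchant attempting a platform.* action toward a buyer is denied because the namespace, not the conversation partner, decides.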

The thirteen roles

Each agent is one addressable tenant, but it carries a catalog of named roles that the permission matrix keys against. There are thirteen canonical role IDs across the three sides. The merchant is modeled as a stack of roles, not a single bargainer, so a failure localizes: a bad price is merchant:pricing, a hallucinated attribute is merchant:retrieval, a missing listing is merchant:catalog. You can say which capability failed, not just that “the merchant” failed.

Buyer side (4)

  • buyer:intent — parses the vibe intent into a purchase mandate.
  • buyer:discovery — query formulation, search, candidate ranking, reputation reads.
  • buyer:negotiation — proposes, counters, accepts, or walks away from offers.
  • buyer:authorization — verifies the cart satisfies the mandate before money moves.

Platform side (4)

  • platform:aggregator — ranks offers, verifies claims, issues match certificates.
  • platform:psp — authorizes and settles payment; the only writer of the ledger.
  • platform:reputation — the sole writer of reputation; no agent writes its own.
  • platform:adjudicator — rules disputes; no merchant rules its own dispute.

Merchant side (5)

  • merchant:catalog — listing CRUD against the catalog table.
  • merchant:retrieval — answers inquiries from real catalog and inventory state.
  • merchant:pricing — price discovery, margin discipline, negotiation responses.
  • merchant:fulfillment — dispatch, signed inventory deltas, order-status transitions.
  • merchant:support — returns, refund requests, dispute participation.
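The role catalog above can be written down as data, which makes the count and the sole-writer rules checkable. The role names and the two sole-writer rules (ledger and reputation) come from the lists above; the helper names and the choice to raise on unknown tables are assumptions of this sketch.

```python
# Role names per side, as listed above.
ROLES = {
    "buyer": ("intent", "discovery", "negotiation", "authorization"),
    "platform": ("aggregator", "psp", "reputation", "adjudicator"),
    "merchant": ("catalog", "retrieval", "pricing", "fulfillment", "support"),
}

# Canonical role IDs in "side:name" form.
ROLE_IDS = tuple(f"{side}:{name}" for side, names in ROLES.items() for name in names)

# Tables this page says have exactly one writer.
SOLE_WRITERS = {"ledger": "platform:psp", "reputation": "platform:reputation"}


def side_of(role_id: str) -> str:
    """Return the side a role belongs to, rejecting unknown role IDs."""
    if role_id not in ROLE_IDS:
        raise ValueError(f"unknown role id {role_id!r}")
    return role_id.split(":", 1)[0]


def may_write(role_id: str, table: str) -> bool:
    """True only for the sole writer of a table; other tables raise KeyError."""
    side_of(role_id)  # validates the role id
    return SOLE_WRITERS[table] == role_id
```

Keeping roles as plain identifiers is what lets a failure report name merchant:pricing or merchant:retrieval instead of just "the merchant".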

The agent model

Agent is a single concrete class — it is never subclassed. A graded scenario freezes the cast to exactly two by default, one Agent("buyer") and one Agent("merchant"), plus the Platform Service, which is not an Agent — so the run is deterministic and gradable. That two-agent count is a property of the scenario, not of the system: in production the same class spawns across a network of millions of nodes (see Matching network). What varies between scenarios is the enabled skill manifests, the loaded skill documents, the bundle versions, and the model configuration — never the class.

Behavior is carried by progressively loaded SKILL.md bundles. At spawn the agent sees only each skill's lightweight manifest (name, description, when_to_use, a digest); when the turn state matches a manifest, the full document is loaded into context just for that turn and then discarded. This lets an agent carry many commerce skills without paying for all of them in context, and makes the skill bundle the one clean ablation unit.
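Progressive loading can be sketched as a manifest list that is always resident plus a store that hands out full documents per turn. The manifest fields follow the ones named above; everything else is an assumption for illustration: when_to_use is modeled as a set of inbound action names, the digest as a SHA-256 of the document, and SkillStore as an in-memory stand-in for SKILL.md files on disk.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillManifest:
    name: str
    description: str
    when_to_use: frozenset  # assumption: inbound action names that trigger the skill
    digest: str             # assumption: SHA-256 hex digest of the full document


def make_manifest(name, description, when_to_use, document):
    """Build the lightweight manifest an agent would see at spawn."""
    digest = hashlib.sha256(document.encode("utf-8")).hexdigest()
    return SkillManifest(name, description, frozenset(when_to_use), digest)


class SkillStore:
    """In-memory stand-in for SKILL.md documents."""

    def __init__(self, documents):
        self._documents = dict(documents)

    def load(self, manifest):
        document = self._documents[manifest.name]
        actual = hashlib.sha256(document.encode("utf-8")).hexdigest()
        if actual != manifest.digest:
            raise ValueError(f"digest mismatch for skill {manifest.name!r}")
        return document


def build_turn_context(manifests, store, inbound_action):
    """Load full documents only for manifests matching this turn.

    The returned list is rebuilt every turn, so nothing stays loaded.
    """
    return [store.load(m) for m in manifests if inbound_action in m.when_to_use]
```

Because the per-turn context is a fresh list, dropping it at the end of the turn is the "discard" step. Swapping one document in the store is then a clean ablation of a single skill.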

Externally, an agent runs a bounded internal turn loop: one inbound envelope in, zero or one outbound envelope out. Inside that single external turn it may make multiple model calls, load skills, and read or write memory — but only the final outbound envelope changes the bus, and every internal step is traceable for replay and cost accounting. An agent has no handle on the platform, the world, or the runtime: all external effects ride the envelope, which is what keeps the audit log a complete record.
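The "one envelope in, zero or one envelope out" contract can be sketched as a bounded loop. This is a simplified illustration: the step callable stands in for a model call plus any skill loads or memory access, envelopes are plain dicts, and the max_steps bound and trace format are assumptions of the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TurnResult:
    outbound: Optional[dict]  # zero or one outbound envelope
    trace: list               # every internal step, kept for replay and cost accounting


def run_turn(
    inbound: dict,
    step: Callable[[dict, int], Optional[dict]],
    max_steps: int = 4,
) -> TurnResult:
    """Run internal steps until one yields an outbound envelope or the bound is hit."""
    trace = []
    for index in range(max_steps):
        trace.append(f"step {index}: internal call")
        candidate = step(inbound, index)
        if candidate is not None:
            # Only this final envelope leaves the agent.
            return TurnResult(candidate, trace)
    # Bound reached without a decision: the agent emits nothing this turn.
    return TurnResult(None, trace)


def example_step(inbound: dict, index: int) -> Optional[dict]:
    # Hypothetical policy: deliberate for two steps, then answer on the third.
    if index < 2:
        return None
    return {"action": "commerce.counter", "payload": {"price_minor": 42000}}


result = run_turn({"action": "commerce.offer"}, example_step)
```

Here result.outbound is the single counter envelope and result.trace holds three entries, one per internal step. A step function that never decides yields no outbound envelope and a trace of exactly max_steps entries.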

How VCP differs from UCP, ACP, AP2, and MCP

VCP references prior art but does not duplicate it. It owns the bilateral transaction record — negotiation, match certificates, private-utility boundaries, and replayable state diffs — and projects outward to the other protocols where they already solve checkout, authorization, settlement, or tool execution. The contrast below is what is native to each.

VCP

  • Who is the agent: Buyer + merchant, both under study
  • Verified-match artifact: MatchCertificate (platform-attested)
  • Settlement / escrow: Platform PSP, governed atomic ledger write
  • Budget privacy: Private-utility guard, enforced + measured
  • Persistent world: Yes (inventory, ledger, and reputation accumulate)

UCP

  • Who is the agent: Buyer agent vs. merchant catalog
  • Verified-match artifact: None
  • Settlement / escrow: Checkout surface
  • Budget privacy: Not addressed
  • Persistent world: No

ACP

  • Who is the agent: Buyer agent at a merchant checkout
  • Verified-match artifact: None (merchant cart is authoritative)
  • Settlement / escrow: Delegated payment + checkout
  • Budget privacy: Not addressed
  • Persistent world: No

AP2

  • Who is the agent: Payment agent acting for a user
  • Verified-match artifact: None (mandate triad only)
  • Settlement / escrow: Payment mandate authorization
  • Budget privacy: Selective disclosure (SD-JWT)
  • Persistent world: No

MCP

  • Who is the agent: One model calling tools
  • Verified-match artifact: None
  • Settlement / escrow: Out of scope
  • Budget privacy: Not addressed
  • Persistent world: No

UCP is Google's Universal Commerce Protocol (commerce interoperability); ACP is the OpenAI + Stripe Agentic Commerce Protocol (delegated checkout); AP2 is Google's Agent Payments Protocol (the Intent / Cart / Payment mandate triad); MCP is the Model Context Protocol for tool calls. VCP borrows ACP's idempotency contract and integer-minor-unit money, adopts AP2's mandate names, projects onto UCP for production interop, and rides an MCP-style tool router underneath — without giving up its own envelope or transaction record.
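The two borrowed ideas, an idempotency contract and integer-minor-unit money, combine naturally in a ledger write. The sketch below is a generic idempotency-key pattern, not ACP's actual contract or the VCP ledger: the Ledger class, its method names, and the result shape are assumptions for illustration.

```python
class Ledger:
    """Toy ledger: integer minor units (e.g. cents) and idempotent transfers."""

    def __init__(self, balances_minor: dict) -> None:
        self.balances_minor = dict(balances_minor)
        self._seen = {}  # idempotency_key -> (request, result)

    def transfer(self, idempotency_key: str, payer: str, payee: str, amount_minor: int) -> dict:
        # Integers only: floats would reintroduce rounding error into money.
        if isinstance(amount_minor, bool) or not isinstance(amount_minor, int) or amount_minor <= 0:
            raise ValueError("amount_minor must be a positive integer")
        request = (payer, payee, amount_minor)
        if idempotency_key in self._seen:
            previous_request, previous_result = self._seen[idempotency_key]
            if previous_request != request:
                raise ValueError("idempotency key reused with a different request")
            return previous_result  # retry: replay the result, move no money
        if self.balances_minor.get(payer, 0) < amount_minor:
            raise ValueError("insufficient funds")
        self.balances_minor[payer] -= amount_minor
        self.balances_minor[payee] = self.balances_minor.get(payee, 0) + amount_minor
        result = {"status": "settled", "amount_minor": amount_minor}
        self._seen[idempotency_key] = (request, result)
        return result


ledger = Ledger({"buyer": 60000})
first = ledger.transfer("order-1-pay", "buyer", "merchant", 44900)
retry = ledger.transfer("order-1-pay", "buyer", "merchant", 44900)
```

The retry returns the same result as the first call and the buyer is debited once, leaving 15100 minor units. Reusing the key with a different amount is rejected instead of silently applied.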

Status
The protocol and architecture described here are the design-of-record. The SpringBrand runtime that implements them — the agent pool, the Commerce Intelligence Layer, and the evaluation harness — is in private beta.