Focus.AI Press
Cornwall, Connecticut

Issue No. 01
Spring 2026

A Quarterly on Agentic Interfaces · Est. 2026

The Feature · In Six Chapters

After the Chat Log

Six views on how the agent shows its work — from the treaty rooms of Anthropic and OpenAI to the constrained palette of Vercel Labs and the dissent of Mountain View.

Contents · Spring 2026

In this issue.

§00 From the Editor · a box running out of room
§I The Treaty · on MCP Apps
§II The Garden · on MCP-UI
§III The Storefront · on the OpenAI Apps SDK
§IV The Constrained Palette · on json-render
§V The Native Tongue · on A2UI
§VI The Wire · on AG-UI
§∞ The Stack · a synthesis

From the Editor · Spring 2026

The chat log is a box, and the box is running out of room.

For three years the conversational window has served as the default surface for everything we have asked of large language models. Questions in, tokens out, the rest inferred. It was good enough because we asked it small things.

That arrangement has quietly broken. We are asking agents to file tickets, render dashboards, book travel, draft contracts, model outcomes. A box is the wrong shape for any of it. So six different proposals have arrived in the last six months, each with a different answer to the same question: what should the agent be allowed to show?

This issue is a field guide to those six. Their authors sit at different tables. Two of them — Anthropic and OpenAI — signed a treaty. One of them — Google — walked. One — Vercel — is building a framework around the whole thing. One — CopilotKit — is asking a harder question nobody else has named. And one — MCP-UI — had been doing the work before any of them noticed.

Read them as dialects of the same argument. The surface is where the product finally happens.

— Surface Editorial

§ I On MCP Apps · SEP-1865

The Treaty.

Two laboratories that agree on almost nothing agreed on an iframe. The result is the first official MCP extension, and also its limits.

Anthropic · OpenAI · MCP-UI · Nov 2025 – Jan 2026

On November 21, 2025, something unusual happened in AI infrastructure. Two of the loudest competitors in the space — Anthropic and OpenAI — co-authored a specification. Alongside the maintainers of MCP-UI, a community project that had been quietly doing the work for over a year, they proposed SEP-1865, the MCP Apps Extension. Two months later it shipped as the first official extension to the Model Context Protocol.

The mechanics are modest on purpose. A server declares UI resources under a new ui:// URI scheme. Tools reference those resources by metadata. When the model calls a tool, the host renders the declared HTML in a sandboxed iframe, and the component talks back to the host using the same JSON-RPC the rest of MCP already uses — carried over postMessage, auditable, loggable. Text-only fallback is required. The initial spec supports one content type: HTML. External URLs, remote DOM, and native widgets are explicitly deferred.
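A minimal TypeScript sketch of those mechanics, offered as an illustration and not as the SEP's text: a tool that points at a ui:// resource, a host that falls back to text when no valid resource is declared, and a JSON-RPC 2.0 request of the kind a sandboxed component could hand to postMessage. Only the ui:// scheme and the JSON-RPC 2.0 envelope come from the description above. The field name `uiResourceUri`, the tool name, the method name `ui/example-action`, and every helper are assumptions made for this example.

```typescript
// Illustrative only: names below are hypothetical, not taken from SEP-1865.

// A JSON-RPC 2.0 request envelope (this shape is standard JSON-RPC).
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Hypothetical tool declaration that references a UI resource by metadata.
interface ToolDeclaration {
  name: string;
  description: string;
  uiResourceUri?: string; // assumed field name
}

// True when a URI uses the ui:// scheme and names something after it.
function isUiResourceUri(uri: string): boolean {
  return /^ui:\/\/[^\s]+$/.test(uri);
}

// Text-only fallback is required, so choose the iframe only for valid ui:// URIs.
function chooseRendering(tool: ToolDeclaration): "iframe" | "text" {
  return tool.uiResourceUri !== undefined && isUiResourceUri(tool.uiResourceUri)
    ? "iframe"
    : "text";
}

let nextRequestId = 1;

// Build the message a sandboxed component would pass to postMessage.
function buildRequest(method: string, params?: Record<string, unknown>): JsonRpcRequest {
  const request: JsonRpcRequest = { jsonrpc: "2.0", id: nextRequestId++, method };
  if (params !== undefined) request.params = params;
  return request;
}

// Host-side check before acting on anything that arrives from the iframe.
function isJsonRpcRequest(value: unknown): value is JsonRpcRequest {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return v.jsonrpc === "2.0" && typeof v.id === "number" && typeof v.method === "string";
}

const dashboardTool: ToolDeclaration = {
  name: "show_dashboard",
  description: "Render a sales dashboard",
  uiResourceUri: "ui://example/dashboard",
};

const uiMessage = buildRequest("ui/example-action", { region: "EMEA" });
console.log(chooseRendering(dashboardTool), JSON.stringify(uiMessage));
```

Because the envelope is plain JSON-RPC, the host can log and audit every message with the tooling it already uses for the rest of MCP.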

This is the treaty's genius and its limit. Iframes are the lowest common denominator every host already understands. Sandboxing is the security story everyone can agree to. JSON-RPC is already carried by every MCP SDK, which means UI components can be built with the same toolchain as servers. Nothing novel, everything compatible. The MCP-UI client SDK remains the recommended host-side framework, so adopters of the earlier pattern don't have to migrate so much as upgrade.

What the treaty doesn't do: solve the interesting problems. It punts on how the UI and the agent coordinate state beyond a tool call. It says nothing about how a second agent in a multi-agent system should read or reason about what's on screen. It assumes HTML — which means a pre-bundled frontend build, or a prompt that produces one.

But a boring standard that ships is worth a thousand elegant ones that don't. Claude, ChatGPT, Goose, and VS Code have shipped support. JetBrains has signaled interest. For the tool author who wants their dashboard to render in every major client without writing four versions, the treaty is the ground under their feet.

§ II On MCP-UI · The Community Working Group

The Garden.

Standards bodies do not invent patterns. They ratify them. Before the treaty, there was a repository — and a Discord channel doing the real work.

Ido Salomon · Liad Yosef · #ui-wg · 2024 – present

Before there was a standard there was a repository. MCP-UI — stewarded by Ido Salomon and Liad Yosef, and the moderators of the #ui-wg channel on the MCP Contributors Discord — has been doing the work of interactive UIs in MCP since well before Anthropic and OpenAI sat down to write a spec about it. The design pattern, the bi-directional communication model, the HTML / external URL / remote DOM content types — all of it was incubated in public.

The adoption list reads like a who's-who of developer-tools pragmatists: Postman, Shopify, Hugging Face, ElevenLabs, Goose. These are not early adopters looking for novelty; they are platforms with enough users to care whether something works. That they converged on MCP-UI's approach independently is what made the official extension possible.

The result is an unusual credit line on the official announcement: the authors include MCP core maintainers at OpenAI and Anthropic and the MCP-UI creators, treated as peers. The SEP explicitly names MCP-UI's embedded-resource and custom-message-protocol approaches as predecessor patterns whose message types can be translated to a subset of the new JSON-RPC messages. The extension is not a replacement. Migration is intended to be straightforward. The MCP-UI Client SDK is, per the January 2026 announcement, the recommended framework for hosts adopting MCP Apps.

There is a larger argument here about how infrastructure gets made in this moment. The fashionable take is that AI standards are being wrestled into shape by the largest labs with the largest incentives. The MCP Apps story is the counter-example: a small community project, given time and use, produced the patterns that the large labs needed and then co-authored the ratification. The working group did not dissolve when the standard shipped; it became the maintenance body.

If you are a builder trying to read the ecosystem, the lesson is not that community wins over corporate. It is that patterns earn their shape in production first, and then they get written down. The Garden is still there, and the spec is downstream of it.

§ III On the Apps SDK · OpenAI

The Storefront.

MCP is the backbone; the real product is distribution. Every interoperable standard is also a listings agreement — and this one runs through ChatGPT.

OpenAI · ChatGPT · DevDay 2025 · Nov 2025

OpenAI's Apps SDK, launched in the same November week as the MCP Apps proposal, is at first glance another iteration of the UI-in-chat idea. Look again and the real product becomes visible: distribution.

MCP is the backbone — tools exposed over JSON-RPC, components declared as embedded resources, the model calling tools during a conversation. All of that is shared with the broader ecosystem. The Apps SDK's bet is what happens on top. Discovery is first-party: the model consumes tool metadata the same way it consumes first-party connectors, which means natural-language discovery and launcher ranking. Conversation awareness is built in: structured content and component state flow through the turn, so the model can refer to IDs, render a component a second time, reason over results. Because MCP is self-describing, the connector works on both web and mobile ChatGPT without client-specific code.

The word storefront is not in any marketing deck, but it is the honest frame. ChatGPT is a surface with hundreds of millions of users and a model that can act as a recommendation engine. An MCP server wired into that surface is not a tool; it is a listing. The model is the aisle sign.

This explains the peace treaty of November. A standardized MCP Apps extension is strictly better for OpenAI than a fork: the same server that runs in Claude, Goose, and VS Code can be promoted inside ChatGPT with no migration cost to the developer, and — critically — with no migration cost for OpenAI's model, which now reasons about third-party tools the way it reasons about its own. The spec makes the listings interoperable. The storefront is where the attention is.

The open question, which is not an MCP question but a policy question: who gets promoted, and by what rules? Apple spent two decades building the answer for iOS. The conversational surface is ten months into the same journey, and the ranking is currently performed by a language model with opaque priors. Developers who have lived through App Store reviews already know which chapter this is.

§ IV On json-render · Vercel Labs

The Constrained Palette.

Don't let the model write HTML. Let it choose from a catalog. Zod is the guardrail; JSON is the wire format; the renderer is wherever you need it.

Vercel Labs · Guillermo Rauch · Apache 2.0 · Jan 2026

Vercel Labs' json-render, released open-source in January 2026, takes a different view of the iframe question: don't ship an HTML document at all. Ship a JSON specification constrained to a component catalog, and let the host render it natively.

The mechanics are best described as a three-layer stack. The developer defines a catalog — the set of components and actions the AI is permitted to use — in Zod, which doubles as a schema for the LLM and as a type contract for the renderer. The LLM emits a flat JSON tree of typed elements that reference only catalog entries. A Renderer component, in React, Vue, Svelte, Solid, React Native, or others, maps those elements to concrete implementations. There is a shadcn catalog of 36 pre-built components for teams that want a running start. There are renderers for PDF, HTML email, Remotion video, Satori images, and React Three Fiber. One JSON spec, many surfaces.
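The guardrail is easier to see in code. What follows is a sketch of the idea, not json-render's API: the real library defines its catalog in Zod, while this example substitutes hand-written prop checks so that it runs with no dependencies. The spec shape (a `root` id plus a flat `elements` map), the component names, and `validateSpec` are all assumptions made for illustration.

```typescript
// Illustrative only: this is not json-render's API. Hand-written prop
// checks stand in for the Zod schemas a real catalog would use.

type Props = Record<string, unknown>;
type PropCheck = (props: Props) => boolean;

// The catalog: the only components the model is permitted to reference.
const catalog = new Map<string, PropCheck>([
  ["Card", (p) => typeof p.title === "string"],
  ["Metric", (p) => typeof p.label === "string" && typeof p.value === "number"],
]);

// Hypothetical flat spec: elements keyed by id, children listed by id.
interface SpecElement {
  type: string;
  props: Props;
  children?: string[];
}
interface Spec {
  root: string;
  elements: Record<string, SpecElement>;
}

// Own-property test, so keys such as "constructor" cannot slip through.
const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Collect every reason the spec falls outside the catalog.
function validateSpec(spec: Spec): string[] {
  const errors: string[] = [];
  if (!hasOwn(spec.elements, spec.root)) errors.push(`missing root: ${spec.root}`);
  for (const [id, el] of Object.entries(spec.elements)) {
    const check = catalog.get(el.type);
    if (check === undefined) errors.push(`${id}: unknown component ${el.type}`);
    else if (!check(el.props)) errors.push(`${id}: invalid props for ${el.type}`);
    for (const child of el.children ?? []) {
      if (!hasOwn(spec.elements, child)) errors.push(`${id}: missing child ${child}`);
    }
  }
  return errors;
}

// A spec the model might emit: data only, nothing executable.
const goodSpec: Spec = {
  root: "card",
  elements: {
    card: { type: "Card", props: { title: "Revenue" }, children: ["total"] },
    total: { type: "Metric", props: { label: "Q1", value: 1200 } },
  },
};

// A spec that reaches for something the catalog never offered.
const badSpec: Spec = {
  root: "card",
  elements: {
    card: { type: "Card", props: { title: "Revenue" }, children: ["inject"] },
    inject: { type: "Script", props: { src: "https://example.com/x.js" } },
  },
};

console.log(validateSpec(goodSpec));
console.log(validateSpec(badSpec));
```

The rejected spec never reaches a renderer. The host decides what counts as a component before any output is drawn, and the model can only choose from that list.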

The framework plugs the AI directly into the rendering layer. The grander claim — that websites might one day compose themselves on request — is the kind of statement that ages either well or very poorly.

Guillermo Rauch, Vercel's CEO, has been unusually direct about the ambition: the framework "plugs the AI directly into the rendering layer." A demonstration on a low-parameter local model was, per his telling, the thing that convinced him.

Two things to notice. First, the catalog-plus-Zod structure is a guardrail against the central risk of generative UI: the model emitting code it should not be allowed to emit. What the LLM generates is data, not executable output. This is the same argument A2UI makes from a different direction (see Chapter V).

Second, json-render and MCP Apps are not competitors. The repository ships an examples/mcp showing a pre-bundled React app served as a single HTML resource via MCP Apps, with json-render running inside the iframe. The spec is the transport; the framework is the payload. If you are in the Vercel ecosystem and want to ship generative UI today, this is the shortest path.

The caveat: json-render is pre-1.0, currently v0.14. The wire format will move. For prototypes, fine. For bets with a long shelf life, budget for migration.

§ V On A2UI · Google

The Native Tongue.

Mountain View's dissent. The iframe is the wrong unit. Send a blueprint of native components and let the host render in its own voice.

Google · Gemini Enterprise · Opal · Dec 2025

Google's A2UI, announced in December 2025, is the most interesting dissent in this issue. Its premise: the iframe is the wrong unit.

The A2UI argument runs like this. An iframe contains an opaque HTML payload, rendered by code the host did not write, in a sandbox. That sandbox is a security boundary, which is good, but it is also a legibility boundary. The host application cannot style the rendered content to match its own design system. The host's accessibility tree cannot reach inside. Another agent in a multi-agent system cannot parse what is on screen — it can only know that something is. The model loses sight of its own output.

A2UI's alternative: send a blueprint of native components. A JSON payload describes which of the client's existing components to compose, with what props, bound to what data. The client (a React app, a Flutter app, a native iOS app) renders using its own component library. The agent's output inherits the host's styling. Accessibility comes for free from the host's implementation. And another agent, parsing the same payload, can see what was rendered as structured information, not as an opaque frame.
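A sketch of the blueprint idea, under stated assumptions: the payload shape below is invented for this example and is not A2UI's schema, and the two "hosts" render to strings where a real client would render to React, Flutter, or UIKit. It shows one payload rendered in two voices, and a second agent reading the same payload as structure.

```typescript
// Illustrative only: this payload shape is hypothetical, not A2UI's schema.

interface BlueprintNode {
  component: string;               // name of a component the host already has
  props: Record<string, string>;   // literal props
  bind?: Record<string, string>;   // prop name -> key in the data model
  children?: BlueprintNode[];
}

type DataModel = Record<string, string>;
type HostComponent = (props: Record<string, string>, children: string[]) => string;

// Resolve bound props against the data model, then merge over literal props.
function resolveProps(node: BlueprintNode, data: DataModel): Record<string, string> {
  const resolved: Record<string, string> = { ...node.props };
  for (const [prop, key] of Object.entries(node.bind ?? {})) {
    resolved[prop] = data[key] ?? "";
  }
  return resolved;
}

// The host renders with its own library; unknown components degrade to a marker.
function renderBlueprint(
  node: BlueprintNode,
  data: DataModel,
  host: Map<string, HostComponent>,
): string {
  const children = (node.children ?? []).map((child) => renderBlueprint(child, data, host));
  const impl = host.get(node.component);
  return impl ? impl(resolveProps(node, data), children) : `[unsupported: ${node.component}]`;
}

// A second agent can read the same payload as structure, not as pixels.
function listComponents(node: BlueprintNode): string[] {
  const names = [node.component];
  for (const child of node.children ?? []) names.push(...listComponents(child));
  return names;
}

// Two hosts, two voices, one blueprint.
const webHost = new Map<string, HostComponent>([
  ["Column", (_props, kids) => `<div>${kids.join("")}</div>`],
  ["Heading", (props) => `<h2>${props.text}</h2>`],
  ["Text", (props) => `<p>${props.text}</p>`],
]);
const terminalHost = new Map<string, HostComponent>([
  ["Column", (_props, kids) => kids.join("\n")],
  ["Heading", (props) => props.text.toUpperCase()],
  ["Text", (props) => props.text],
]);

const blueprint: BlueprintNode = {
  component: "Column",
  props: {},
  children: [
    { component: "Heading", props: { text: "Top customers" } },
    { component: "Text", props: {}, bind: { text: "region" } },
  ],
};
const dataModel: DataModel = { region: "EMEA" };

console.log(renderBlueprint(blueprint, dataModel, webHost));
console.log(renderBlueprint(blueprint, dataModel, terminalHost));
console.log(listComponents(blueprint));
```

The styling lives entirely in the host maps. The payload carries no markup, which is why both the host and a second agent can read it.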

The transport question is deliberately open. A2UI payloads travel over the Agent-to-Agent protocol (A2A), over AG-UI (see Chapter VI), and in principle over REST. Google has announced a bridge to MCP Apps: rather than serving HTML resources, an MCP server could serve A2UI component blueprints, and the host could render them natively. The version 0.9 draft explicitly enumerates MCP, A2A, AG-UI, SSE, and WebSocket as supported transports. Whether any of them will be production-ready on the timeline A2UI needs is the harder question.

The early surfaces are inside Google — Opal, Gemini Enterprise. The pre-1.0 churn is real: the jump from 0.8 to 0.9 renamed all four message types. A declarative-UI protocol carries the risk that every implementation has to ship the same rename.

But the conceptual move is right. In a world of many agents and many host surfaces, opacity is technical debt. The native tongue is the one everyone in the room already speaks.

§ VI On AG-UI · CopilotKit

The Wire.

Not what to render. How to keep them in sync. The only proposal in this issue that treats state, not surface, as the question.

CopilotKit · StateDelta · v0.0.45 · 2025 – present

If MCP Apps and A2UI argue about what to render and where to draw the sandbox, AG-UI, the protocol incubated by CopilotKit, argues that neither of them is solving the harder problem: keeping the agent and the interface in continuous, bidirectional agreement.

Consider a dashboard generated by a tool call. The user filters by region. The agent needs to know this filter exists, because the next question — "Show me the top customers" — is conditional on it. Under MCP Apps, the UI can call back into the host via tool calls, but that is discrete and turn-based. Under json-render, the UI renders the model's output; the return path is not formally specified. What AG-UI describes is a stream: the agent emits state deltas that the UI applies; the UI emits state deltas the agent consumes. The two share a model.
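The dashboard example can be sketched in a few lines. This is an illustration of the shared-model idea only: the delta shape (`set` and `unset` keys), the in-memory `DeltaStream` standing in for a WebSocket or SSE connection, and the `scopeQuery` helper are all assumptions, not AG-UI's wire format.

```typescript
// Illustrative only: this delta shape is hypothetical, not AG-UI's wire format.

type SharedState = Record<string, unknown>;

interface StateDelta {
  source: "agent" | "ui";
  set?: Record<string, unknown>; // keys to add or overwrite
  unset?: string[];              // keys to remove
}

// Apply one delta without mutating the previous state.
function applyDelta(state: SharedState, delta: StateDelta): SharedState {
  const next: SharedState = { ...state, ...(delta.set ?? {}) };
  for (const key of delta.unset ?? []) delete next[key];
  return next;
}

// Stand-in for the network stream: every subscriber sees every delta, in order.
class DeltaStream {
  private listeners: Array<(delta: StateDelta) => void> = [];
  subscribe(listener: (delta: StateDelta) => void): void {
    this.listeners.push(listener);
  }
  publish(delta: StateDelta): void {
    for (const listener of this.listeners) listener(delta);
  }
}

const stream = new DeltaStream();
let agentView: SharedState = {};
let uiView: SharedState = {};
stream.subscribe((delta) => { agentView = applyDelta(agentView, delta); });
stream.subscribe((delta) => { uiView = applyDelta(uiView, delta); });

// The agent's tool call produces the dashboard.
stream.publish({ source: "agent", set: { view: "dashboard", region: "all" } });
// The user filters by region; the UI reports it as a delta, not as a new turn.
stream.publish({ source: "ui", set: { region: "EMEA" } });

// The next question can be scoped by the filter the agent has already seen.
function scopeQuery(question: string, state: SharedState): string {
  return typeof state.region === "string" && state.region !== "all"
    ? `${question} (region: ${state.region})`
    : question;
}

console.log(scopeQuery("Show me the top customers", agentView));
```

Both replicas apply the same deltas in the same order, so neither side has to ask the other what the screen looks like.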

The protocol's native primitives are StateDelta messages over WebSocket or Server-Sent Events. The reference framework is CopilotKit's, but the specification is meant to be independent. Pre-1.0, like most of the stack in this issue, but with real adoption among teams who have run into the state-sync problem and decided it deserves its own layer.

The cleanest framing appears in a March 2026 essay analyzing the Declarative Generative UI stack: json-render solves what to render; AG-UI solves how to coordinate. The frameworks and protocols are not in competition. They carry different layers. A production agent interface, at maturity, will have all of them — a declarative payload format (json-render or A2UI), a transport (MCP Apps, A2A, AG-UI), a distribution surface (ChatGPT, Claude, VS Code, a custom host).

The open question for AG-UI is whether it outgrows its origin. A protocol authored primarily by one company (CopilotKit contributes the majority of commits) needs to attract independent implementers to become infrastructure. MCP Apps reached that threshold through the treaty. A2UI is reaching it through Gemini Enterprise adoption. AG-UI is doing it the hardest way: by being correct about a problem nobody else has named yet.

§ ∞ A Synthesis

The Stack.

Strip the brand names and the affiliations and there are three layers. The question for a builder is which one you want to own.

There is a payload layer — the format of the UI description itself. HTML in an iframe (MCP Apps). Constrained JSON from a catalog (json-render). Native component blueprints (A2UI). The choice determines what the model generates, what the host renders, and who holds the styling.

There is a transport layer — how the payload and its state move. MCP's JSON-RPC over postMessage (MCP Apps). Agent-to-Agent protocol (A2A). Bidirectional state deltas (AG-UI). The choice determines how the agent and the interface stay in sync, and whether the interface can talk back.

There is a distribution layer — the surface the user sees. ChatGPT, Claude, Goose, VS Code, Cursor, a custom host. The choice determines who owns the user.

For a tool author, the practical question is which layer to bet on. A conservative bet: MCP Apps as transport, plain HTML as payload, every host as distribution. Ships today. Works everywhere. Bounded in what it can do.

A more ambitious bet: json-render or A2UI as payload, MCP Apps as transport, a known surface as distribution. Richer UIs, more work, pre-1.0 exposure.

The most ambitious bet: all three layers, with AG-UI or equivalent keeping the state honest. This is the agentic app runtime that the MCP Apps blog post gestured at and then stopped short of describing. It does not exist yet as a unified thing. Someone will ship it.
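The three bets can be written down as data, which makes the pre-1.0 exposure of each one easy to read off. This is a bookkeeping sketch, not a runtime: the type names, the `preReleaseExposure` helper, and the specific picks inside each bet (json-render for the second, A2UI for the third, ChatGPT as the "known surface") are assumptions chosen for illustration. The pre-1.0 set reflects only what this issue reports.

```typescript
// Illustrative only: a vocabulary for the three layers, drawn from this issue.

type Payload = "html" | "json-render" | "a2ui";
type Transport = "mcp-apps" | "a2a" | "ag-ui";

interface StackBet {
  label: string;
  payload: Payload;
  transport: Transport;
  stateSync?: "ag-ui";    // optional coordination layer
  distribution: string[]; // hosts the bet targets
}

// Pieces this issue describes as pre-1.0.
const preOneDotZero = new Set<string>(["json-render", "a2ui", "ag-ui"]);

// List the pre-1.0 pieces a bet depends on, without duplicates.
function preReleaseExposure(bet: StackBet): string[] {
  const parts: string[] = [bet.payload, bet.transport];
  if (bet.stateSync !== undefined) parts.push(bet.stateSync);
  return parts.filter((part, i) => preOneDotZero.has(part) && parts.indexOf(part) === i);
}

const bets: StackBet[] = [
  {
    label: "conservative",
    payload: "html",
    transport: "mcp-apps",
    distribution: ["ChatGPT", "Claude", "Goose", "VS Code"],
  },
  {
    label: "more ambitious",
    payload: "json-render", // A2UI would be the other option
    transport: "mcp-apps",
    distribution: ["ChatGPT"], // one known surface, picked as an example
  },
  {
    label: "most ambitious",
    payload: "a2ui",
    transport: "mcp-apps",
    stateSync: "ag-ui",
    distribution: ["custom host"],
  },
];

for (const bet of bets) {
  console.log(bet.label, preReleaseExposure(bet));
}
```

Read top to bottom, the list of pre-1.0 dependencies grows with the ambition of the bet. That growth is the migration budget the earlier chapters warn about.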

For the rest of us: the chat log is no longer the ceiling. The conversation is now a surface onto which other surfaces can compose. The question each of these proposals is answering, in its own dialect, is the same one: what should the agent be allowed to show?

The answers differ. That they differ is a sign the question is worth asking.

⁂
