ISSUE № 01 SPRING 2026 VOL. I
US $14 · UK £11 · CH 18F

Runtime

A quarterly about machines that think — and the substrates that hold them.

The Code Mode Issue

Code is the universal medium of action.

RUNTIME № 01 · CONTENTS P. 002
Inside this issue

Contents.

Founded
2026, Cornwall

Editor
The Desk

This issue
Code Mode
Dynamic Workers
Project Think
Agent Memory
  • 01
    The Code Mode Thesis. Why the model would rather write code than call your tools.
    p. 004
  • 02
    Inside the Dynamic Worker. A V8 isolate, no filesystem, and the end of the container tax.
    p. 010
  • 03
    Project Think. The agent as infrastructure: workspaces, sub-agents, a ladder of sandboxes.
    p. 016
  • 04
    By the Numbers. 1.17 million tokens down to one thousand. The math of Code Mode.
    p. 022
  • 05
    Memory as Infrastructure. When the knowledge your agents accumulate stops being ephemeral.
    p. 026
  • 06
    Field Guide & Glossary. A traveler's dictionary for the new agent stack.
    p. 030
RUNTIME № 01 · FROM THE EDITOR P. 003
Editor's note

The agent has always been writing.

There is a convention in this industry, barely two years old and already calcifying, that an AI agent is a thing which calls tools. You give the model a menu: ten functions, fifty functions, five hundred. It reads the menu, picks a line, fills in the parameters, and waits. You run the function. You hand back the result. Repeat until done, or until the context window gives up, whichever comes first.

This picture is already wrong, and what Cloudflare shipped across the autumn and winter of 2025–26 is the clearest evidence yet that the industry has been building around a false primitive. The real primitive was there the whole time. It is called code.

Large language models have seen millions of lines of real production code. They have seen roughly zero lines of your bespoke tool-call schema. Ask which format they are fluent in. Then stop asking them to speak a pidgin.

That is the argument at the center of this issue — and it turns out to have structural consequences that reach far beyond token accounting. If the agent writes code, the code needs somewhere safe to run. If it runs somewhere safe, that somewhere can be anywhere. If it can be anywhere, the agent is no longer tied to a single session or a single laptop. It becomes, as the Cloudflare team puts it in their Project Think announcement, less like a tool and more like infrastructure.

This issue is our attempt to map what that shift looks like from inside the platform that is building the clearest version of it. We are not neutral about the direction. We are, however, trying to be honest about the trade-offs.

— The Desk

RUNTIME № 01 · FEATURE 01 P. 004
Feature 01 · The Thesis

The Code Mode Thesis.

If a language model is a program that completes text, and code is the text it was most carefully trained on, why are we still asking it to fill out forms?

The conventional MCP server, circa late 2024, was a catalog. Each tool was a card in the catalog, and each card consumed tokens in the model's context window every time the model had to reason about what was available. For a small service (five tools, ten tools) this was tolerable. For anything the size of a real cloud platform, it was a catastrophe in slow motion.

Cloudflare has roughly 2,500 API endpoints. Rendered as individual tool definitions, the full Cloudflare MCP server would consume approximately 1.17 million tokens, more than the context window of most frontier models currently in production. Almost no agent could use it. Most could not even load it.

So the team did something that in retrospect looks obvious. They gave the model two tools. One called search. One called execute. The search tool returns relevant slices of the OpenAPI spec on demand. The execute tool runs JavaScript the model writes.

Everything else — pagination, retries, conditional branching, chaining three API calls and returning only the field you need from the third — happens inside that single execution block. The model does not narrate. It codes.
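The search half of that pair can be pictured with a minimal sketch. It assumes a tiny in-memory stand-in for the OpenAPI spec; the Endpoint shape, the searchSpec function, and its term-matching rule are hypothetical, since the source does not describe how the real search tool ranks results.

```typescript
// Hypothetical, simplified stand-in for one OpenAPI operation.
interface Endpoint {
  method: string;
  path: string;
  summary: string;
}

// A toy spec with illustrative entries. The real one has roughly 2,500.
const spec: Endpoint[] = [
  { method: "GET", path: "/zones", summary: "List zones" },
  { method: "GET", path: "/zones/{zone_id}/dns_records", summary: "List DNS records" },
  { method: "POST", path: "/zones/{zone_id}/purge_cache", summary: "Purge cached content" },
];

// search(): return only the entries that match every query term, so the
// model's context holds a few slices of the spec instead of the whole catalog.
function searchSpec(query: string, limit = 5): Endpoint[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return spec
    .filter((e) => {
      const haystack = `${e.method} ${e.path} ${e.summary}`.toLowerCase();
      return terms.every((t) => haystack.includes(t));
    })
    .slice(0, limit);
}

const hits = searchSpec("dns records");
console.log(hits.map((e) => `${e.method} ${e.path}`));
```

The design choice worth noticing is that the cost of the tool definition stays constant: the catalog can grow without the prompt growing, because only the matching slices ever reach the model.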

LLMs have seen millions of lines of real-world code but only contrived tool-calling examples. Meet them where they are fluent. — The Code Mode design note

The numbers are almost comic. A fixed footprint of roughly 1,000 tokens, regardless of how many endpoints the underlying API exposes. A 99.9% reduction from the naïve MCP implementation. And — this is the part that keeps getting buried — it works better. Not just cheaper. Better.

The reason is training distribution. Every major frontier model has been trained on an ocean of real TypeScript, real Python, real Go. Tool-call schemas, by comparison, are a dialect the model learned from a far smaller set of largely synthetic examples. Asked to chain five tool calls, the model stumbles. Asked to write a function that makes five API calls and returns a summary, the model is on familiar ground. It is writing the kind of code a mid-career engineer writes on a Tuesday afternoon.

This observation, once stated, starts to feel retroactively inevitable. Anthropic arrived at essentially the same pattern independently, published as Code Execution with MCP. The CodeAct paper had pointed in this direction in early 2024. The convergence is the signal.

What the model sees

The elegant part of the Cloudflare implementation is what the sandbox presents to the model. Not a raw API. Not a tool list. A typed JavaScript environment where every tool is a method on a codemode.* namespace, with TypeScript definitions generated automatically from the underlying tool specs.

Tool names with hyphens or dots — common in MCP — are automatically sanitized into valid identifiers. my-server.list-items becomes my_server_list_items. The $refs in OpenAPI specs are pre-resolved before the spec is ever passed into the sandbox. Authentication lives on the host side, never inside the code the model writes.
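The renaming step is simple enough to sketch. The function below is an illustration of the idea, not the package's actual implementation: sanitizeToolName is a hypothetical name, and the real rules may differ in edge cases such as collisions or reserved words.

```typescript
// Turn an MCP tool name into a valid JavaScript identifier.
// Sketch only: the real package's rules may differ in edge cases.
function sanitizeToolName(name: string): string {
  // Replace every character that cannot appear in an identifier.
  let id = name.replace(/[^A-Za-z0-9_$]/g, "_");
  // Identifiers cannot start with a digit, so prefix an underscore.
  if (/^[0-9]/.test(id)) id = `_${id}`;
  return id;
}

console.log(sanitizeToolName("my-server.list-items")); // my_server_list_items
```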

None of this is surprising engineering, individually. Taken together it represents a rethinking of what we hand the model at inference time — from a menu of discrete operations to a fluent programming environment with all the affordances a senior engineer would expect.

EXAMPLE · codemode sandbox typescript
// The model writes this. It runs in a sealed V8 isolate.
// No filesystem. No env vars. Outbound fetch disabled.

const zones = await codemode.cloudflare_zones_list();
const problematic = zones.filter(z => z.status !== 'active');

// Count only the zones we actually change, not every problematic one.
let enabled = 0;

for (const zone of problematic) {
  const rules = await codemode.cloudflare_firewall_rules_list({
    zone_id: zone.id
  });
  if (rules.length === 0) {
    await codemode.cloudflare_ddos_protection_enable({
      zone_id: zone.id,
      mode: 'high'
    });
    enabled += 1;
  }
}

return { enabled };

What you see above is, in one respect, deeply unremarkable: it is a piece of TypeScript that any competent engineer could write in a minute. That is precisely the point. The model, writing this, is not performing an exotic feat. It is doing the thing it is best at.

RUNTIME № 01 · FEATURE 02 P. 010
Feature 02 · The Substrate

Inside the Dynamic Worker.

An isolate, not a container. The difference is the whole business model.

The obvious place to run AI-generated code is a container. Spin up a Linux environment. Install what you need. Execute. Tear down. Every AI sandboxing startup in 2024 and early 2025 was essentially a layer of orchestration on top of this basic loop, and most of them were good, in a tolerable-cost, slow-to-start, hundreds-of-megabytes-of-RAM kind of way.

The problem is the math. Containers take hundreds of milliseconds to boot and hundreds of megabytes to hold. At consumer scale, with one agent per user and, in Project Think's vision, potentially several agents per user, the unit economics collapse. You cannot keep a warm container for every end user of every application. You cannot spin one up on every request. You can cheat by reusing containers across users, and many people do, but then you have traded away the security property that was the entire point of the sandbox.

What Cloudflare had, quietly, for years, was a different primitive: the V8 isolate. A Worker is an isolate. It starts in milliseconds. It holds in single-digit megabytes. It was engineered originally as a way to run untrusted third-party code at the CDN edge, which means the security posture is battle-tested in ways most container runtimes are not.
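A back-of-envelope calculation shows why the footprint gap matters. The figures below are assumptions chosen from inside the ranges the text gives ("hundreds of megabytes", "single-digit megabytes"), and the 64 GB host is hypothetical; none of them are measurements.

```typescript
// Assumed figures for illustration only, picked from the ranges in the text.
const containerMb = 256; // assumption: one value in "hundreds of megabytes"
const isolateMb = 4; // assumption: one value in "single-digit megabytes"
const hostMemoryMb = 64 * 1024; // a hypothetical 64 GB machine

// How many idle sandboxes fit in memory on one host, ignoring all overhead.
const containersPerHost = Math.floor(hostMemoryMb / containerMb);
const isolatesPerHost = Math.floor(hostMemoryMb / isolateMb);
const densityRatio = isolatesPerHost / containersPerHost;

console.log({ containersPerHost, isolatesPerHost, densityRatio });
```

Under these assumptions the same machine holds 64 times as many isolates as containers. The exact ratio will move with the real numbers, but it moves within the same order of magnitude.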

The Dynamic Worker Loader, which shipped alongside the Code Mode announcement, lets a Worker instantiate another Worker on the fly, with code specified at runtime, inside a fresh isolate, typically on the same physical machine. No round trip to a warm pool. No sizing decisions. Whatever region the request landed in is the region the sandbox runs in, milliseconds after the model finishes writing.

For consumer-scale agents, where every end user has an agent and every agent writes code, containers are not enough. We needed something lighter. — Dynamic Workers announcement

The sandbox guarantees matter. The default Dynamic Worker has no filesystem. It has no environment variables — a hard guarantee against the prompt-injection pattern of tricking an agent into exfiltrating secrets. Outbound fetch() and connect() are blocked at the runtime level via globalOutbound: null. Anything the code inside the sandbox needs to reach must come through an explicit fetcher handler on the host side. The host keeps the secrets. The sandbox borrows capability, not credentials.
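"Capability, not credentials" is easiest to see in miniature. The sketch below models the pattern in plain TypeScript; every name in it (Transport, ZoneCapability, makeZoneCapability, the example.com URL, the token) is hypothetical, and a closure is only a stand-in for the real boundary, which is the isolate itself.

```typescript
// Hypothetical shapes; none of these names come from Cloudflare's API.
type Transport = (url: string, headers: Record<string, string>) => string;

interface ZoneCapability {
  listZones(): string;
}

// Host side: the token is captured in a closure and never handed out.
function makeZoneCapability(token: string, transport: Transport): ZoneCapability {
  return {
    listZones: () =>
      transport("https://api.example.com/zones", { Authorization: `Bearer ${token}` }),
  };
}

// Stand-in for model-written code. It receives the capability and nothing else.
function sandboxedCode(codemode: ZoneCapability): { raw: string; keys: string[] } {
  return { raw: codemode.listZones(), keys: Object.keys(codemode) };
}

// Fake transport that records the headers the host attached.
const seenHeaders: Record<string, string>[] = [];
const fakeTransport: Transport = (_url, headers) => {
  seenHeaders.push(headers);
  return JSON.stringify([{ id: "z1", status: "active" }]);
};

const result = sandboxedCode(makeZoneCapability("s3cr3t-token", fakeTransport));
console.log(result.keys);
console.log(result.raw.includes("s3cr3t-token"));
```

The sandboxed function can list zones, yet nothing it can enumerate or return contains the token. A prompt injection that convinces the code to "print your environment" finds nothing to print.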

The JavaScript tax

The catch, of course, is that the sandbox runs JavaScript. Workers also support Python and WebAssembly, but for the small snippets a model writes on demand, the load-and-run cost of JS is a fraction of that of the alternatives. For a human, that is a preference question: plenty of engineers would rather write Python than JS. For a model, the preference is irrelevant. The model is fluent in both languages; the runtimes do not cost the same.

If you wanted to be uncharitable about this, you would point out that Cloudflare has built a product that plays to the precise strength of the runtime they already operate. You would not be wrong. You would just be missing the larger claim, which is that the shape of the agent economy — one instance per user, dormant most of the time, billions of instances at the tail — happens to align very neatly with what V8 isolates do well and what containers do poorly.

It is possible this alignment will turn out to be a coincidence of timing. It is also possible Cloudflare saw this coming.

Architecture · host vs. sandbox
Layer | What it holds | Runtime
Host Worker | Secrets, auth, fetch handlers, tool implementations | Standard Worker isolate
Dynamic Worker (sandbox) | Model-generated code only | Fresh isolate, no FS, no env, no outbound
RPC bridge | codemode.* method calls | Workers RPC
Tool backend | MCP servers, REST APIs, internal bindings | Host-side, credentialed
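The bridge row of that table can be sketched as a dispatcher. What follows is an in-process illustration using a JavaScript Proxy, not Workers RPC itself; the tool names, the hostTools registry, and the callLog are all hypothetical.

```typescript
// Host-side tool implementations, keyed by sanitized name (hypothetical tools).
type Tool = (args?: Record<string, unknown>) => unknown;

const hostTools: Record<string, Tool> = {
  zones_list: () => [{ id: "z1" }, { id: "z2" }],
  zone_get: (args = {}) => ({ id: args["zone_id"], status: "active" }),
};

// The typed surface the sandbox is told about.
interface CodemodeApi {
  zones_list: Tool;
  zone_get: Tool;
}

// Every call that crosses the bridge is recorded here, on the host.
const callLog: string[] = [];

// The sandbox sees only this proxy. Property access becomes a dispatch
// to the host registry, and unknown names fail loudly.
const codemode = new Proxy({} as CodemodeApi, {
  get(_target, prop) {
    const name = String(prop);
    const tool = hostTools[name];
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return (args: Record<string, unknown> = {}) => {
      callLog.push(name);
      return tool(args);
    };
  },
});

const zone = codemode.zone_get({ zone_id: "z1" });
console.log(zone, callLog);
```

Because every call funnels through one choke point, the host can log, rate-limit, or refuse calls without the model-written code knowing or caring.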
RUNTIME № 01 · FEATURE 03 P. 016
Feature 03 · The Habitat

Project Think.

An ephemeral agent is a tool. A durable agent is infrastructure. Cloudflare is betting the distinction is structural.

The coding agents of 2025 — Claude Code, Codex, Cursor, a dozen others — established a pattern. An LLM with the ability to read files, write code, execute it, and remember what it learned turns out to look less like a developer tool and more like a general-purpose assistant. People started using them for things that had nothing to do with code. Filing taxes. Negotiating purchases. Running entire business workflows.

Everyone who used them seriously ran into the same walls. They live on your laptop. They cost money whether they are working or not. They require manual setup — dependencies, secrets, authentication — every time. And there is a deeper structural issue the industry has been avoiding: these agents are one-to-one. Each one serves a single user on a single task. The multi-tenant economics that made SaaS work do not apply.

A restaurant, the Project Think announcement notes, has a menu and a kitchen optimized to churn out dishes at volume. An agent is more like a personal chef. Different ingredients, different techniques, different tools every time.

Project Think is Cloudflare's answer, and it is a set of primitives rather than a product. The primitives are what the team has observed coding-agent infrastructure converging on, abstracted one layer up.

An agent that persists — that can wake up on demand, continue work after interruptions, and carry forward state without depending on your local runtime — that starts to look like infrastructure. — Project Think announcement, 2026

The execution ladder

The most interesting idea in Project Think is the notion of an execution ladder — a graduated hierarchy of sandboxes, from cheap and restricted to expensive and powerful, that an agent climbs only when a task demands it. Most operations never leave the lowest rung. A stray task might reach the top. The model chooses where to execute based on what the code needs.

  • 00
    Workspace. Persistent filesystem. The agent's long-term home. Files survive across sessions and machines.
    Durable Object · SQLite
  • 01
    Isolate. Fast, cheap, stateless code execution. Most operations happen here.
    V8 · Dynamic Worker
  • 02
    npm. Runtime package resolution. The agent installs what it needs when it needs it.
    Isolate + resolver
  • 03
    Browser. Full Chrome DevTools Protocol. The agent clicks, scrolls, screenshots, scrapes.
    Browser Rendering API
  • 04
    Sandbox. Full Linux container for the rare task that needs it. Expensive, avoided by default.
    Cloudflare Containers

The rungs are not equivalent, and that is the point. Most agent work — the million routine operations — happens on rung 01, the isolate. Climbing to rung 04 is a last resort, for legacy Linux dependencies or binary tools that cannot be reimplemented in JavaScript. The ladder encodes a cost gradient, but it also encodes a security gradient: the higher you climb, the more surface area you expose.
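The climbing rule can be written down in a few lines. This is a sketch under two assumptions that the source does not state: that a task's requirements can be reduced to a handful of flags, and that each rung includes the capabilities of the rungs below it. The TaskNeeds fields and the chooseRung function are hypothetical. Rung 00, the workspace, is storage rather than an execution target, so it does not appear as a choice.

```typescript
// Rung names follow the ladder above; the requirement flags are hypothetical.
type Rung = "isolate" | "npm" | "browser" | "sandbox";

interface TaskNeeds {
  npmPackages?: boolean; // needs packages resolved at runtime
  realBrowser?: boolean; // needs to click, scroll, or screenshot
  nativeBinaries?: boolean; // needs Linux tools that cannot run in an isolate
}

// Pick the lowest rung that satisfies the task. The most demanding
// requirement is checked first so the task is never under-provisioned.
function chooseRung(needs: TaskNeeds): Rung {
  if (needs.nativeBinaries) return "sandbox";
  if (needs.realBrowser) return "browser";
  if (needs.npmPackages) return "npm";
  return "isolate";
}

console.log(chooseRung({})); // isolate
console.log(chooseRung({ npmPackages: true, realBrowser: true })); // browser
```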

Sub-agents and the session tree

The second primitive worth attention is sub-agents. An agent in Project Think can spawn children — isolated child agents with their own SQLite, their own scratch space, communicating back to the parent over typed RPC. This is structurally different from the "fan out a prompt across N parallel calls" pattern. Sub-agents persist. They can be resumed. They can be forked.

The session model is similarly tree-structured. Messages branch. Branches can be compacted independently. The whole history is full-text searchable. If this reads like a version-controlled conversation, that is roughly the intended mental model: the agent's interaction history is something you navigate, not something that scrolls off the top of a buffer.
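A tree-structured session is a small data structure. The sketch below is an in-memory illustration only: SessionTree and its methods are hypothetical names, substring matching stands in for real full-text search, and compaction is left out entirely.

```typescript
interface MessageNode {
  id: number;
  parentId: number | null;
  text: string;
}

class SessionTree {
  private nodes = new Map<number, MessageNode>();
  private nextId = 1;

  // Append a message under any existing node. A second child creates a branch.
  add(text: string, parentId: number | null = null): number {
    if (parentId !== null && !this.nodes.has(parentId)) {
      throw new Error(`Unknown parent ${parentId}`);
    }
    const id = this.nextId++;
    this.nodes.set(id, { id, parentId, text });
    return id;
  }

  // Walk from a leaf back to the root to rebuild one branch's history.
  history(leafId: number): string[] {
    const out: string[] = [];
    let cur = this.nodes.get(leafId);
    while (cur) {
      out.unshift(cur.text);
      cur = cur.parentId === null ? undefined : this.nodes.get(cur.parentId);
    }
    return out;
  }

  // Naive substring search across every branch.
  search(term: string): number[] {
    const t = term.toLowerCase();
    return Array.from(this.nodes.values())
      .filter((n) => n.text.toLowerCase().includes(t))
      .map((n) => n.id);
  }
}

const tree = new SessionTree();
const root = tree.add("Refactor the billing module");
const planA = tree.add("Plan A: extract an interface", root);
const planB = tree.add("Plan B: rewrite from scratch", root);
tree.add("Plan A step 1: add tests", planA);

console.log(tree.history(planB));
console.log(tree.search("plan a"));
```

Plan B's history contains the root and Plan B, and nothing from Plan A. That separation is what lets one branch be compacted or abandoned without disturbing its sibling.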

Self-authored extensions

The most speculative primitive, and the one most likely to seem either inevitable or reckless depending on where you sit, is self-authored extensions. Agents in Project Think can write their own tools at runtime — generate a new capability, register it, use it. The tool does not need to exist ahead of time. The agent discovers a need, writes the code, and extends its own surface.

This is the kind of capability that either makes you excited about emergent agent behavior or deeply uneasy about safety. Cloudflare's answer is that the new tool is just more code in the sandbox; it inherits the same restrictions. That is technically true and philosophically incomplete, but it is a more rigorous starting point than most of what has been shipped under the banner of "agent self-improvement" over the past year.

RUNTIME № 01 · BY THE NUMBERS P. 022
The math of Code Mode

One thousand tokens.
Two thousand five hundred endpoints.

Some arithmetic from Cloudflare's announcement, because the numbers are the argument.

99.9%
Reduction in input tokens for the full Cloudflare MCP server compared to the naïve one-tool-per-endpoint approach.
1,000 tok
Fixed context footprint of the Code Mode MCP server, independent of how many endpoints the underlying API exposes.
1.17M
Input tokens an equivalent traditional MCP server would consume, larger than the context window of most frontier models.
TRADITIONAL MCP · 2,500 endpoints ≈ 1,170,000 tokens
CODE MODE MCP · search() + execute() ≈ 1,000 tokens

The bar above is not a visualization trick. It is drawn to scale. The sliver of ochre on the second row is what one thousand tokens looks like when one million one hundred seventy thousand occupies the full width.
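The arithmetic behind those figures is short enough to check by hand. The sketch below uses only the three rounded numbers quoted on this page, so the derived values inherit their approximation.

```typescript
// The three rounded figures quoted in this section.
const endpoints = 2500;
const traditionalTokens = 1170000;
const codeModeTokens = 1000;

// Average cost of one tool definition in the traditional layout.
const tokensPerEndpoint = traditionalTokens / endpoints;

// Share of input tokens removed by Code Mode, as a percentage.
const reductionPct = (1 - codeModeTokens / traditionalTokens) * 100;

// Width of the Code Mode bar relative to the traditional one.
const barFraction = codeModeTokens / traditionalTokens;

console.log(tokensPerEndpoint); // 468
console.log(reductionPct.toFixed(2)); // 99.91
console.log(barFraction.toFixed(5)); // 0.00085
```

Two things fall out. Each endpoint costs about 468 tokens on average when rendered as a tool, and the Code Mode bar is less than a tenth of one percent of the full width, which is why it reads as a sliver.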

RUNTIME № 01 · DEPARTMENT P. 026
Department · Memory

Memory as infrastructure.

The part of the stack that gets loudest when it is missing and quietest when it works.

An agent that runs for a single session, against a clean codebase, with a benchmark-sized set of files, can get away with stuffing everything into context. An agent that runs for weeks — against a production system, across interruptions, across model upgrades — cannot. This is the observation that sent Cloudflare's team to build Agent Memory, a dedicated service accessed via Worker binding or REST, engineered for the workloads they kept seeing on their own platform.

The design concept worth noting is the memory profile: a named container of memories that can be attached to an agent, but does not have to be. A team of engineers can share a memory profile across coding agents, so that something one person's agent learned — a convention, an architectural decision, a piece of tribal knowledge — is available to everyone else's agents the next morning. A code review bot and a coding agent can share memory so that review feedback shapes future code generation.
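The sharing behavior is the whole idea, and a toy model makes it concrete. This is not the Agent Memory API: MemoryProfile, TeamAgent, and their methods are hypothetical, storage is an in-memory array, and recall is a plain substring match.

```typescript
// Hypothetical in-memory model; this is not the Agent Memory API.
interface MemoryEntry {
  author: string;
  text: string;
}

class MemoryProfile {
  readonly name: string;
  private entries: MemoryEntry[] = [];

  constructor(name: string) {
    this.name = name;
  }

  remember(author: string, text: string): void {
    this.entries.push({ author, text });
  }

  // Return every memory mentioning the topic, regardless of who wrote it.
  recall(topic: string): MemoryEntry[] {
    const t = topic.toLowerCase();
    return this.entries.filter((m) => m.text.toLowerCase().includes(t));
  }
}

class TeamAgent {
  readonly id: string;
  // The profile is optional: it can be attached, but does not have to be.
  private profile: MemoryProfile | undefined;

  constructor(id: string, profile?: MemoryProfile) {
    this.id = id;
    this.profile = profile;
  }

  learn(text: string): void {
    this.profile?.remember(this.id, text);
  }

  lookup(topic: string): string[] {
    const found = this.profile ? this.profile.recall(topic) : [];
    return found.map((m) => `${m.author}: ${m.text}`);
  }
}

const team = new MemoryProfile("platform-team");
const reviewBot = new TeamAgent("review-bot", team);
const codingAgent = new TeamAgent("coding-agent", team);
const loner = new TeamAgent("scratch-agent");

reviewBot.learn("Prefer named exports in the billing package");
console.log(codingAgent.lookup("named exports"));
console.log(loner.lookup("named exports"));
```

The coding agent sees what the review bot learned because both hold the same profile. The unattached agent sees nothing, which is the "does not have to be" half of the design.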

The bet is not subtle. The knowledge your agents accumulate stops being ephemeral and starts becoming a durable team asset. You are accruing institutional memory as a side effect of using the tool, and that memory is portable across agents and across humans.

RUNTIME № 01 · FIELD GUIDE P. 030
Field guide

A traveler's glossary.

The terms you will encounter in Cloudflare's agent documentation, briefly defined.

Agents SDK
The npm package agents. Base classes and primitives for stateful agents backed by Durable Objects.
AIChatAgent
A subclass of Agent specialized for streaming chat. Messages persist automatically; streams resume on disconnect.
Code Mode
The architectural pattern of giving an LLM a single "write code" tool and a typed API, rather than a menu of individual tool calls.
codemode package
The npm package @cloudflare/codemode. Generates TypeScript definitions from your tools, ships them to the model, runs what it writes in a Dynamic Worker.
Durable Object
Cloudflare's stateful primitive: a single-threaded, consistent, globally addressable object with its own storage. Each Agent is backed by one.
Dynamic Worker
A Worker instantiated at runtime with code specified on the fly. The sandbox substrate for Code Mode.
Dynamic Worker Loader
The API that lets a parent Worker create a Dynamic Worker in a fresh isolate, usually on the same machine.
Execution ladder
Project Think's graduated sandbox hierarchy: workspace → isolate → npm → browser → full sandbox. Climbed only when necessary.
Hibernation
The behavior where a Durable Object (and its Agent) shuts down when idle while preserving open WebSocket connections. Compute cost is zero when sleeping.
Isolate
A V8 sandbox. Milliseconds to start, single-digit megabytes to hold. The unit of execution in a Worker.
MCP
Model Context Protocol. The standard for exposing tools to LLMs. What Code Mode is a response to, not a rejection of.
Memory profile
A named container of agent memories, potentially shared across agents and across humans. The persistent unit in Agent Memory.
Project Think
The preview-stage npm package @cloudflare/think. A base class and set of primitives for long-running, durable coding agents.
Sub-agent
An isolated child agent with its own SQLite and typed RPC to the parent. The Project Think pattern for delegated work.
Workers RPC
The mechanism by which code inside a Dynamic Worker calls back to the host. Keeps credentials out of the sandbox.
Workspace
Project Think's persistent filesystem primitive. The agent's long-term home, surviving across sessions and machines.

A quarterly about machines that think, and the substrates that hold them. Issue № 01 — The Code Mode Issue — was reported and assembled in the spring of 2026 from Cloudflare's published announcements, documentation, and open-source repositories.

All technical claims have been verified against primary sources. Editorial framing is the magazine's own. No tool calls were harmed in the writing of this issue.

SET IN FRAUNCES & NEWSREADER · CODE IN JETBRAINS MONO
EDITED AT THE DESK · CORNWALL, CT & BEYOND
№ 01 · SPRING 2026 · END OF ISSUE