What MCP Won't Tell You: Five Primitives for Production MCP Infrastructure

Tiago Gimenes
April 6, 2026

MCP gives you three things: Client, Server, and Transport. The spec is clean. The SDK works. But the moment you try to proxy a server, aggregate tools from five upstreams, or sandbox user code - you're writing hundreds of lines of glue that the protocol never anticipated.

Anthropic recently published "Code Execution with MCP", showing that letting LLMs write code against MCP tools inside a sandbox can cut token usage by 98.7%. Compelling architecture. But they published a concept, not a library.

We've been running these patterns in production as part of deco CMS's open-source MCP control plane. We extracted five primitives into @decocms/mcp-utils (source). Here's what the MCP spec leaves out - and the code that fills the gap.

  • 5 composable primitives
  • 98.7% token reduction with sandbox
  • ~20 lines for a full control plane

The Five Primitives

Each primitive solves one specific problem the MCP SDK doesn't address. They're small, composable, and extracted from production use - not designed in a vacuum.

1. createBridgeTransportPair()

Zero-overhead in-process IPC. Connect a Client and Server in the same process without spinning up sockets or stdio pipes.

2. createServerFromClient()

Turn any Client into a Server. Proxy a remote MCP server, add auth, re-expose it on a different transport, or compose it into a larger system.

3. WrapperTransport + composeTransport()

Transport middleware. Intercept MCP traffic for logging, auth injection, rate-limiting, or rewriting requests with a composable pipeline.

4. GatewayClient

Multi-server aggregation. Collect tools from N upstream servers, namespace them, and route calls to the correct origin.

5. runCodeWithTools()

Sandboxed code execution. Let LLMs write code that calls tools programmatically in a QuickJS sandbox - the pattern behind Anthropic's 98.7% token reduction.


1. createBridgeTransportPair() - Zero-Overhead In-Process IPC

Problem: You have a Client and Server in the same process. The MCP SDK's transports assume a process or network boundary - stdio, SSE, WebSocket. Spinning up a socket pair for in-process communication is wasteful.

import { createBridgeTransportPair } from "@decocms/mcp-utils";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { Server } from "@modelcontextprotocol/sdk/server/index.js";

const { client: clientTransport, server: serverTransport } = createBridgeTransportPair();

const server = new Server({ name: "my-server", version: "1.0.0" }, { capabilities: { tools: {} } });
const client = new Client({ name: "my-client", version: "1.0.0" });

await server.connect(serverTransport);
await client.connect(clientTransport);

const tools = await client.listTools();
Design note

Messages are passed by reference using microtask scheduling - no serialization, no sockets. Microtask scheduling avoids re-entrancy bugs that plague synchronous in-process message passing.
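
To make the design note concrete, here is a minimal sketch of a bridge pair that hands messages over by reference on a microtask. It is not the library's implementation: MiniTransport, BridgeEnd, and createMiniBridgePair are hypothetical names, and MiniTransport is a cut-down stand-in for the SDK's Transport interface.

```typescript
// Hypothetical, cut-down stand-in for the SDK's Transport interface.
type Message = Record<string, unknown>;

interface MiniTransport {
  onmessage?: (msg: Message) => void;
  onclose?: () => void;
  start(): Promise<void>;
  send(msg: Message): Promise<void>;
  close(): Promise<void>;
}

class BridgeEnd implements MiniTransport {
  onmessage?: (msg: Message) => void;
  onclose?: () => void;
  peer?: BridgeEnd;
  private closed = false;

  async start(): Promise<void> {
    // Nothing to open: both ends live in the same process.
  }

  async send(msg: Message): Promise<void> {
    if (this.closed) throw new Error("transport closed");
    const peer = this.peer;
    // Deliver on a microtask so the receiver's handler never runs inside
    // the sender's call stack. The object is passed as-is (no JSON copy).
    Promise.resolve().then(() => {
      if (peer && !peer.closed) peer.onmessage?.(msg);
    });
  }

  async close(): Promise<void> {
    if (this.closed) return; // also stops the mutual close() recursion
    this.closed = true;
    this.onclose?.();
    await this.peer?.close();
  }
}

function createMiniBridgePair(): { client: MiniTransport; server: MiniTransport } {
  const client = new BridgeEnd();
  const server = new BridgeEnd();
  client.peer = server;
  server.peer = client;
  return { client, server };
}
```

Because delivery is deferred, a handler that replies from inside onmessage cannot re-enter the sender while it is still in the middle of send(). The trade-off of passing by reference is that both sides see the same object, so neither side should mutate a message after sending it.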


2. createServerFromClient() - Turn Any Client into a Server

Problem: You need to proxy a remote MCP server - add auth, re-expose it on a different transport, or compose it into a larger system. The SDK has no concept of "wrap this client as a server."

import { createServerFromClient, createBridgeTransportPair } from "@decocms/mcp-utils";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";

// Connect to an upstream MCP server
const upstreamClient = new Client({ name: "upstream", version: "1.0.0" });
await upstreamClient.connect(upstreamTransport);

// Expose it as a new server
const proxyServer = createServerFromClient(upstreamClient, {
  name: "my-proxy",
  version: "1.0.0",
});

// Connect the proxy to any transport (SSE, stdio, bridge, etc.)
await proxyServer.connect(downstreamTransport);
Design note

Delegates listTools, callTool, listResources, readResource, listPrompts, and getPrompt. Strips outputSchema from forwarded tools - proxies shouldn't validate, that's the origin server's job.
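
The outputSchema stripping described above can be pictured as a small transformation over the upstream tool list. This is a sketch under assumptions: ToolDefinition is a simplified local shape and stripOutputSchema is a hypothetical helper name, not the library's internals.

```typescript
// Simplified local shape of an MCP tool definition (assumption: only the
// fields this sketch touches).
interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
  outputSchema?: Record<string, unknown>;
}

// Return copies of the tools without outputSchema, leaving everything else
// untouched, so the proxy advertises nothing it would then have to validate.
function stripOutputSchema(tools: ToolDefinition[]): ToolDefinition[] {
  return tools.map(({ outputSchema: _dropped, ...rest }) => rest);
}
```

The input objects are not mutated, so the upstream client's cached tool list stays intact.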


3. WrapperTransport + composeTransport() - Transport Middleware

Problem: You need to intercept MCP traffic for logging, auth injection, rate-limiting, or rewriting requests. The MCP SDK treats transports as opaque - there's no middleware model.

import { composeTransport, WrapperTransport } from "@decocms/mcp-utils";
import type { Transport } from "@modelcontextprotocol/sdk/shared/transport.js";
import type { JSONRPCMessage } from "@modelcontextprotocol/sdk/types.js";

class LoggingTransport extends WrapperTransport {
  protected handleIncomingMessage(msg: JSONRPCMessage) {
    console.log("←", msg);
    super.handleIncomingMessage(msg);
  }

  protected handleOutgoingMessage(msg: JSONRPCMessage) {
    console.log("→", msg);
    return super.handleOutgoingMessage(msg);
  }
}

class AuthTransport extends WrapperTransport {
  constructor(inner: Transport, private token: string) {
    super(inner);
  }

  protected async handleOutgoingMessage(msg: JSONRPCMessage) {
    // inject auth headers, rewrite requests, etc.
    return super.handleOutgoingMessage(msg);
  }
}

// Compose middlewares - messages flow through logging, then auth
const transport = composeTransport(
  baseTransport,
  (t) => new LoggingTransport(t),
  (t) => new AuthTransport(t, "my-token"),
);
Design note

Left-to-right composition, like Express middleware. Override handleOutgoingMessage (client → server) and/or handleIncomingMessage (server → client). The helper methods isRequest() and isResponse() are available for filtering.
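
For intuition about what filtering on message kind involves, here is a standalone sketch of such guards based on the JSON-RPC 2.0 rules: a request carries a method and an id, a notification carries a method without an id, and a response carries an id plus a result or an error. The function names mirror the helpers mentioned above, but these are hypothetical free functions over a simplified local type, not the library's methods.

```typescript
// Simplified local JSON-RPC 2.0 message shape (assumption for this sketch).
type JsonRpcMessage = {
  jsonrpc: "2.0";
  id?: string | number;
  method?: string;
  params?: unknown;
  result?: unknown;
  error?: { code: number; message: string; data?: unknown };
};

// Request: has a method and expects a reply (has an id).
function isRequest(msg: JsonRpcMessage): boolean {
  return typeof msg.method === "string" && msg.id !== undefined;
}

// Notification: has a method but no id, so no reply is expected.
function isNotification(msg: JsonRpcMessage): boolean {
  return typeof msg.method === "string" && msg.id === undefined;
}

// Response: no method, an id, and either a result or an error.
function isResponse(msg: JsonRpcMessage): boolean {
  return (
    msg.method === undefined &&
    msg.id !== undefined &&
    ("result" in msg || "error" in msg)
  );
}

// Example filter: produce a one-line summary suitable for a log line.
function summarize(msg: JsonRpcMessage): string {
  if (isRequest(msg)) return `request ${msg.method} #${msg.id}`;
  if (isNotification(msg)) return `notification ${msg.method}`;
  if (isResponse(msg)) return `${msg.error ? "error" : "result"} for #${msg.id}`;
  return "unknown";
}
```

A logging middleware could call something like summarize() instead of dumping whole messages, and an auth middleware could use isRequest() to leave responses and notifications untouched.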


4. GatewayClient - Multi-Server Aggregation

Problem: Your agent needs tools from N servers. Loading all tool definitions into the LLM's context bloats token usage - this is exactly the problem Anthropic identified. But before you can optimize tool discovery, you need a way to aggregate and route across multiple upstreams.

import { GatewayClient } from "@decocms/mcp-utils/aggregate";

const gateway = new GatewayClient({
  slack: { client: slackClient },
  google: { client: googleClient },
  github: { client: () => connectToGithub() }, // lazy - connected on first use
});

// List tools from all upstream servers
const { tools } = await gateway.listTools();

// Call a tool - automatically routed to the correct upstream
const result = await gateway.callTool({
  name: "slack_send_message", // namespaced: "{key}_{tool}"
  arguments: { channel: "#general", text: "Hello!" },
});

Per-client allowlists let you control exactly what each upstream exposes:

const gateway = new GatewayClient({
  slack: {
    client: slackClient,
    tools: ["send_message", "list_channels"], // only expose these
  },
  github: {
    client: () => connectToGithub(),
    resources: ["repo://main"],               // only expose this resource
  },
});
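
The allowlist and the lazy client factory in the snippets above could work roughly as follows. This is a sketch of the two ideas, assuming name-based matching for tools; applyAllowlist, lazy, and Factory are hypothetical names and not part of the library's API.

```typescript
type Named = { name: string };

// Keep only the items whose name is on the allowlist.
// No allowlist means "expose everything".
function applyAllowlist<T extends Named>(items: T[], allow?: string[]): T[] {
  if (!allow) return items;
  const allowed = new Set(allow);
  return items.filter((item) => allowed.has(item.name));
}

type Factory<T> = () => T | Promise<T>;

// Accept either a ready value or a factory. The factory runs at most once,
// on first use, and every later call gets the same cached promise.
// (Assumes T itself is not a function type, which holds for client objects.)
function lazy<T extends object>(source: T | Factory<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      const value: T | Promise<T> =
        typeof source === "function" ? (source as Factory<T>)() : (source as T);
      cached = new Promise<T>((resolve) => resolve(value));
    }
    return cached;
  };
}
```

Caching the promise rather than the resolved value means two concurrent first calls share one connection attempt instead of racing to create two.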

  • Lazy Init: factory functions called on first use, results cached
  • Auto-Pagination: fetches all pages from upstream clients automatically
  • Namespacing: tools and prompts prefixed with client key (e.g. slack_send_message)
  • Routing: callTool/readResource/getPrompt routed to the correct upstream
  • Caching: list results cached; call refresh() to invalidate
  • Selection: per-client allowlists for tools, resources, and prompts

Anthropic tie-back

Their "progressive disclosure" file-tree pattern solves tool discovery - how the LLM finds relevant tools. GatewayClient solves tool aggregation and routing - how the infrastructure collects and dispatches across upstreams. Complementary patterns that work together.
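
The namespacing and routing behaviour described in this section can be sketched with a lookup table built at list time. This is an illustration under assumptions, not GatewayClient's actual code: Route, buildRouteTable, and resolveRoute are hypothetical names, and the "{key}_{tool}" format is taken from the example above.

```typescript
interface Route {
  upstream: string; // client key, e.g. "slack"
  tool: string;     // original tool name on that upstream, e.g. "send_message"
}

// Build the namespaced-name -> origin table while listing tools.
function buildRouteTable(toolsByUpstream: Record<string, string[]>): Map<string, Route> {
  const routes = new Map<string, Route>();
  for (const [upstream, tools] of Object.entries(toolsByUpstream)) {
    for (const tool of tools) {
      const namespaced = `${upstream}_${tool}`;
      if (routes.has(namespaced)) {
        throw new Error(`duplicate namespaced tool: ${namespaced}`);
      }
      routes.set(namespaced, { upstream, tool });
    }
  }
  return routes;
}

// Resolve a namespaced call back to its upstream and original tool name.
function resolveRoute(routes: Map<string, Route>, namespaced: string): Route {
  const route = routes.get(namespaced);
  if (!route) throw new Error(`unknown tool: ${namespaced}`);
  return route;
}
```

A table is used instead of splitting the name at the first underscore because keys and tool names can both contain underscores: key "my_db" with tool "query" and key "my" with tool "db_query" would produce the same namespaced name, which the duplicate check above surfaces instead of silently misrouting.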


5. runCodeWithTools() - Sandboxed Code Execution

Problem: Multi-turn tool calls push every intermediate result through the model's context. Anthropic's key insight is to let LLMs write code that calls tools programmatically, filtering and transforming data in a sandbox instead of burning tokens on those round trips. Their post reported a 98.7% reduction in token usage. We ship this as a function.

import { runCodeWithTools } from "@decocms/mcp-utils/sandbox";

const result = await runCodeWithTools({
  code: `export default async (tools) => {
    const items = await tools.list_items({});
    return items.filter(i => i.status === "active");
  }`,
  client: mcpClient,
  timeoutMs: 5000,
});

console.log(result.returnValue);  // filtered items
console.log(result.consoleLogs);  // captured console.log/warn/error calls

For lower-level control, runCode lets you inject arbitrary tool functions:

import { runCode } from "@decocms/mcp-utils/sandbox";

const result = await runCode({
  code: `export default async (tools) => {
    const data = await tools.fetch_data({ query: "active" });
    console.log("Found", data.length, "items");
    return data;
  }`,
  tools: {
    fetch_data: async (args) => fetchFromDatabase(args.query),
  },
  timeoutMs: 10_000,
  memoryLimitBytes: 16 * 1024 * 1024, // 16 MB
  stackSizeBytes: 256 * 1024,          // 256 KB
});
Design note

QuickJS over V8 isolates. Compiles to WASM, runs anywhere (Node, Deno, edge workers, browsers). Deterministic memory limits with no FFI escape hatches. The sandbox literally cannot access the host beyond the tool functions you inject.
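
One way to picture the relationship between runCodeWithTools and runCode above is an adapter that turns a client's callTool into the plain tools object the sandboxed code receives. This is a guess at the shape of the idea, not the library's code: ToolCallingClient, ToolFn, and toolsFromClient are hypothetical names, and any unwrapping of MCP results into plain values (as the list_items example implies) is left out.

```typescript
// Minimal client surface this sketch needs (assumption).
interface ToolCallingClient {
  callTool(req: { name: string; arguments: Record<string, unknown> }): Promise<unknown>;
}

type ToolFn = (args: Record<string, unknown>) => Promise<unknown>;

// Build one async function per tool name; each forwards to client.callTool.
// Only these functions would be injected into the sandbox, so the guest
// code can reach the host through nothing else.
function toolsFromClient(
  client: ToolCallingClient,
  toolNames: string[],
): Record<string, ToolFn> {
  const tools: Record<string, ToolFn> = {};
  for (const name of toolNames) {
    tools[name] = (args) => client.callTool({ name, arguments: args });
  }
  return tools;
}
```

Under this reading, runCode is the general entry point that takes any such object, and runCodeWithTools is the convenience form that derives it from an MCP client.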


Composition - Where It Clicks

The five primitives are designed to snap together. Here's a full MCP control plane in ~20 lines:

import { GatewayClient } from "@decocms/mcp-utils/aggregate";
import { createServerFromClient, composeTransport, createBridgeTransportPair } from "@decocms/mcp-utils";

// 1. Aggregate multiple upstreams
const gateway = new GatewayClient({
  slack: { client: slackClient },
  github: { client: () => connectToGithub() },
  db: { client: dbClient, tools: ["query", "list_tables"] },
});

// 2. Expose as a server
const server = createServerFromClient(gateway, {
  name: "my-gateway",
  version: "1.0.0",
});

// 3. Add middleware and connect (LoggingTransport and AuthTransport are the classes from section 3)
const transport = composeTransport(
  baseTransport,
  (t) => new LoggingTransport(t),
  (t) => new AuthTransport(t, userToken),
);

await server.connect(transport);

GatewayClient aggregates tools from three upstreams. createServerFromClient wraps the gateway as a standard MCP server. composeTransport layers on logging and auth. The result is a full MCP gateway with auth, observability, and multi-server routing - built from three of the five primitives.


How This Relates to Anthropic's Architecture


Anthropic's approach (agent-centric)

  • Generate file-tree of TypeScript stubs for tool discovery
  • LLM reads stubs to discover capabilities
  • LLM writes code against the tools
  • Sandbox executes the generated code
  • LLM orchestrates everything through generated code

@decocms/mcp-utils (infrastructure-centric)

  • GatewayClient aggregates and routes across upstreams
  • Transport middleware handles auth, logging, rate-limiting
  • Sandbox provides the code execution layer
  • LLM just calls tools or writes sandbox code
  • Infrastructure does the heavy lifting

These approaches are complementary, not competing. You could use @decocms/mcp-utils to build the infrastructure that Anthropic's architecture sits on top of: GatewayClient aggregates your upstreams, transport middleware adds auth and observability, and runCodeWithTools provides the sandbox execution layer their architecture requires.


Get Started

MIT licensed. Extracted from production use in deco CMS's MCP Mesh - an open-source control plane for managing AI agent access to tools at scale.

npm install @decocms/mcp-utils @modelcontextprotocol/sdk

For sandbox support:

npm install quickjs-emscripten-core @jitl/quickjs-wasmfile-release-sync

MCP is a protocol, not a framework. These are the missing pieces.
