Architecture
How MCPG is built — a thin protocol-authority host, an operator-defined backend layer, and a plugin system that loads every cross-cutting concern across a stable binary boundary.
MCPG is a Model Context Protocol Gateway: a protocol-authority gateway that sits between MCP clients (agents, IDEs, orchestrators) and the downstream systems those clients want to reach. It is one Rust runtime — roughly 107,000 lines — that owns the MCP protocol, resolves identity, authorizes every call, dispatches to a backend, and records what happened.
This page is the mental model. For the exact config keys, see the configuration reference. For a plain-language intro, start with What is MCPG?.
The one-sentence version
The gateway core is a thin host. Everything with a policy, an upstream, or a side-effect is a plugin loaded across a stable binary boundary — and the only thing the core insists on owning is MCP protocol legality.
That single design choice explains most of what follows. The gateway does not hard-wire a backend, an authorization engine, or an audit format. It provides a plugin framework (loading, capability enforcement, lifecycle) and a dispatch pipeline (the ordered places where plugins run during a request), and it asserts protocol authority over the wire.
What the core owns
The host keeps a deliberately small set of responsibilities:
- MCP protocol handling: JSON-RPC 2.0 over Streamable HTTP + SSE (and stdio). The gateway negotiates both the 2025-11-25 release revision and the DRAFT-2026-v1 draft revision on the same listener. See Protocol versions.
- Session lifecycle: creation, SSE streaming, resumption, termination (for the stateful revision; the modern revision is stateless).
- Identity resolution: establishing who the caller is, fail-closed.
- Pre-dispatch authorization: a trust-level floor plus CEL expressions, before any backend runs.
- Capability registration and discovery: the operator-declared tools, prompts, resources, and resource templates that show up in tools/list.
- Execution dispatch: routing the call to the right backend.
- Pipeline orchestration: multi-step tool calls with suspension and resumption.
- Distributed coordination: session / pipeline / task / subscription state, and delivery / cancellation buses.
- Observability: structured logs, Prometheus metrics, OpenTelemetry traces.
Everything else — the backends, the policy engines, the audit sinks, the identity providers, the payment gates — is a plugin.
The request flow
A single tools/call traverses a fixed pipeline. The shape is the load-bearing
idea; the exact stage list lives in the
security model article.
Client HTTP request
        │
        ▼
HTTP transport          parse headers, validate Origin/Accept,
                        resolve identity (OIDC → JWKS → plugin →
                        header → anonymous), build RequestContext
        │
        ▼
Gateway runtime         session create/load/validate,
                        route the operation, match the capability
        │
        ▼
Plugin chain (pre)      tool-gate chain: policy, payment, guardrails,
                        rate-limit, custom; first Deny/Challenge wins
        │ Allow
        ▼
Execution dispatcher    route to the backend by kind; execute one of
                        27 backend kinds, or run a multi-step pipeline
        │
        ▼
Plugin chain (post)     post-dispatch gates + result transforms
                        (PII masking, schema migration)
        │
        ▼
Gateway runtime         wrap in a JSON-RPC response;
                        stream via SSE with replayable event IDs
The two plugin chains — pre and post — are where governance happens. A backend never sees a request the policy gate denied.
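The "first Deny/Challenge wins" rule of the pre-dispatch chain can be sketched in a few lines of Python. This is an illustration only, not MCPG's plugin API: the `Verdict`, `Decision`, `run_pre_chain`, and `dispatch` names, and the two toy gates, are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable


class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    CHALLENGE = "challenge"


@dataclass(frozen=True)
class Decision:
    verdict: Verdict
    gate: str
    reason: str = ""


# Hypothetical gate shape: a callable that inspects the request.
Gate = Callable[[dict], Decision]


def run_pre_chain(gates: Iterable[Gate], request: dict) -> Decision:
    """Run gates in order; the first non-Allow decision short-circuits."""
    for gate in gates:
        decision = gate(request)
        if decision.verdict is not Verdict.ALLOW:
            return decision
    return Decision(Verdict.ALLOW, gate="chain")


def policy_gate(request: dict) -> Decision:
    # Toy policy: one tool name is blocked outright.
    if request.get("tool") == "drop_database":
        return Decision(Verdict.DENY, "policy", "tool is blocked")
    return Decision(Verdict.ALLOW, "policy")


def payment_gate(request: dict) -> Decision:
    # Toy payment check: unpaid calls are challenged, not denied.
    if not request.get("paid", False):
        return Decision(Verdict.CHALLENGE, "payment", "payment required")
    return Decision(Verdict.ALLOW, "payment")


def dispatch(request: dict, backend: Callable[[dict], str]) -> str:
    """Call the backend only when every gate allowed the request."""
    decision = run_pre_chain([policy_gate, payment_gate], request)
    if decision.verdict is not Verdict.ALLOW:
        return f"{decision.verdict.value}:{decision.gate}"
    return backend(request)
```

Because `dispatch` returns before calling `backend` on any non-Allow decision, a denied or challenged request never reaches the backend in this sketch.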
Three layers, three audiences
1. The protocol layer (the host)
The host is the part that speaks MCP. It validates protocol legality so backends don't have to: a backend proposes an outcome, and the gateway decides whether that outcome is a legal MCP response. This is why MCPG is a gateway and not a proxy — it owns the protocol contract with the client.
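One way to picture "backends propose, the gateway validates" is a function that turns a backend's proposed outcome into a JSON-RPC 2.0 response, or replaces it with an internal error when the proposal is not legal. The sketch below relies only on two rules: a JSON-RPC 2.0 response carries exactly one of `result` or `error`, and an MCP tool result carries a `content` list. The function names and the shape of the `proposed` dict are assumptions for illustration; the real host checks far more than this.

```python
INTERNAL_ERROR = -32603  # JSON-RPC 2.0 reserved code for internal errors


def _error(request_id, message: str) -> dict:
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": INTERNAL_ERROR, "message": message},
    }


def to_jsonrpc_response(request_id, proposed: dict) -> dict:
    """Accept a backend's proposed outcome only if it is a legal response."""
    has_result = "result" in proposed
    has_error = "error" in proposed
    if has_result == has_error:  # both present, or neither
        return _error(request_id, "backend proposed an illegal outcome")

    if has_error:
        err = proposed["error"]
        legal = (
            isinstance(err, dict)
            and type(err.get("code")) is int
            and isinstance(err.get("message"), str)
        )
        if not legal:
            return _error(request_id, "backend proposed a malformed error")
        return {
            "jsonrpc": "2.0",
            "id": request_id,
            "error": {"code": err["code"], "message": err["message"]},
        }

    result = proposed["result"]
    if not isinstance(result, dict) or not isinstance(result.get("content"), list):
        return _error(request_id, "tool result must carry a content list")
    return {"jsonrpc": "2.0", "id": request_id, "result": result}
```

The client only ever sees the return value of `to_jsonrpc_response`, so a misbehaving backend cannot put an illegal message on the wire.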
2. The backend layer (the product)
Every downstream integration is an operator-defined backend. There is no
code to write for a new integration — you declare a binding in config, and it
surfaces as an MCP tool, prompt, resource, or resource template. Each binding
picks an implementation with a nested backend.kind: discriminator.
There are 27 backend kinds:
- 10 general-purpose: http, command, nats, grpc, graphql, kafka, sql, openapi, mock, and pipeline.
- 17 LLM/media: the {openai, azure_openai, anthropic, gemini, compat, stability} providers across chat, embedding, image, tts, and stt surfaces (e.g. openai_chat, anthropic_chat, gemini_embedding).
A pipeline backend composes several steps into one tool call, drawing from
18 pipeline step kinds — including suspending steps (elicitation,
sampling, roots_list, gather) that pause the call, ask the client a
question, and resume on the answer. The full taxonomy lives in the
configuration reference.
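The pause-ask-resume behavior of a suspending step maps naturally onto a Python generator. The sketch below is a conceptual model, not how the Rust runtime implements pipelines: `Suspend`, `PipelineRun`, and the `refund_pipeline` steps are hypothetical names, and the order lookup is a stand-in for a real backend step.

```python
from dataclasses import dataclass
from typing import Any, Generator


@dataclass
class Suspend:
    kind: str    # e.g. "elicitation"
    prompt: str  # the question sent to the client


def refund_pipeline(args: dict) -> Generator[Suspend, Any, dict]:
    # Step 1: stand-in for a backend lookup.
    order = {"id": args["order_id"], "amount": 40}
    # Step 2: suspending step; execution pauses here until resume().
    answer = yield Suspend(
        "elicitation", f"Refund {order['amount']} for order {order['id']}?"
    )
    # Step 3: continue with the client's answer.
    if answer != "yes":
        return {"status": "cancelled"}
    return {"status": "refunded", "amount": order["amount"]}


class PipelineRun:
    """Drives a pipeline until it suspends or finishes."""

    def __init__(self, gen: Generator[Suspend, Any, dict]):
        self._gen = gen
        self.pending = None  # the Suspend we are waiting on, if any
        self.result = None   # the final result once the pipeline ends
        self._advance(lambda: next(self._gen))

    def resume(self, answer: Any) -> None:
        if self.pending is None:
            raise RuntimeError("pipeline is not suspended")
        self._advance(lambda: self._gen.send(answer))

    def _advance(self, step) -> None:
        try:
            self.pending = step()
        except StopIteration as done:
            self.pending = None
            self.result = done.value
```

A caller would create a `PipelineRun`, forward `run.pending.prompt` to the client, and call `run.resume(answer)` when the reply arrives. In a clustered deployment the suspended state would have to live in shared storage rather than in a local generator, which this sketch does not attempt.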
Every binding carries an MCP descriptor (name, description, input schema),
governance controls (minimum_trust, cel_allow_if), and optional JSON Schema
validation — so authorization and validation are properties of the binding, not
afterthoughts.
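To make "properties of the binding" concrete, here is a rough Python model of one binding and the checks it implies. Only `backend.kind`, `minimum_trust`, and `cel_allow_if` come from this page; the other field names, the trust-level names, and the `authorize` function are assumptions, and the real key layout is in the configuration reference. CEL is not evaluated here: the `cel_eval` parameter is a stand-in the caller supplies.

```python
from typing import Callable, Tuple

# Hypothetical trust levels, ordered from lowest to highest.
TRUST_ORDER = ["anonymous", "authenticated", "verified"]

binding = {
    "name": "lookup_order",
    "description": "Fetch one order by id",
    "input_schema": {"type": "object", "required": ["order_id"]},
    "minimum_trust": "authenticated",
    "cel_allow_if": 'identity.team == "support"',  # shown for shape only
    "backend": {"kind": "http"},
}


def authorize(
    binding: dict,
    identity: dict,
    arguments: dict,
    cel_eval: Callable[[str, dict, dict], bool],
) -> Tuple[bool, str]:
    """Apply the binding's own controls before any backend runs."""
    # 1. Trust floor: the caller must meet minimum_trust.
    floor = TRUST_ORDER.index(binding["minimum_trust"])
    if TRUST_ORDER.index(identity["trust"]) < floor:
        return False, "below minimum_trust"
    # 2. Policy expression, delegated to the supplied evaluator.
    if not cel_eval(binding["cel_allow_if"], identity, arguments):
        return False, "cel_allow_if is false"
    # 3. Minimal stand-in for JSON Schema validation: required keys only.
    required = binding["input_schema"].get("required", [])
    missing = [key for key in required if key not in arguments]
    if missing:
        return False, f"missing arguments: {missing}"
    return True, "ok"
```

Since every check reads from the binding itself, adding a new tool means declaring these fields rather than writing new authorization code.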
3. The plugin layer (the extension surface)
All cross-cutting behavior loads as plugins across a C-stable binary boundary. Backends themselves are plugins; so are policy engines, audit sinks, identity providers, reliability gates, and payment flows. The plugin model — the ABI, the entity kinds, the capability and trust model — has its own page: Plugins and the plugin protocol.
The mcpg binary has no direct dependency on async-nats, rdkafka, or
redis for backend or transport code. Those live entirely in plugin crates,
loaded at startup and dispatched through traits. The gateway ships as a
customizable host, not a monolith.
State and clustering
A single deployment runs on one cluster backend, selected by cluster.kind
(single_node, redis, nats, consul, or etcd). Every stateful capability
— sessions, pipelines, tasks, subscriptions, and the delivery / cancellation
buses — inherits its key-value or pub/sub primitive from that backend by
default. Operators who want finer control can override a single capability's
store or bus without touching the rest.
- single_node (the default) keeps everything in memory (or on disk), which is all a single-instance deployment needs.
- redis / nats add the shared key-value store and pub/sub that let multiple gateway instances coordinate behind a load balancer.
This is what makes MCPG horizontally scalable without making the simple case heavy. See Kubernetes install for the HA story.
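The inherit-by-default, override-per-capability rule is easy to model. In the Python sketch below, `cluster.kind` and the capability names come from this page, while the `overrides` key and the `resolve_stores` function are hypothetical; the actual override syntax is documented in the configuration reference.

```python
CAPABILITIES = ("sessions", "pipelines", "tasks", "subscriptions")


def resolve_stores(config: dict) -> dict:
    """Return the backend each stateful capability ends up using."""
    # Every capability inherits cluster.kind unless explicitly overridden.
    default_kind = config.get("cluster", {}).get("kind", "single_node")
    overrides = config.get("overrides", {})  # hypothetical key name
    unknown = set(overrides) - set(CAPABILITIES)
    if unknown:
        raise ValueError(f"unknown capabilities: {sorted(unknown)}")
    return {cap: overrides.get(cap, default_kind) for cap in CAPABILITIES}
```

With an empty config every capability resolves to `single_node`. Setting `cluster.kind` to `redis` and overriding only `tasks` moves that one capability and leaves the rest on the cluster default.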
Single deployment, optional fleet
MCPG is a single-deployment product — one gateway instance (or HA cluster) per environment. There is no multi-tenant runtime overlay baked into the core; soft multi-tenancy is an operator-level concern layered on top (see Multi-tenant deployments).
For teams running many gateways, the surrounding surfaces fill the gaps:
- A control plane for fleet management — org/workspace/environment hierarchy, instance enrollment, versioned plugin sets, and a tamper-evident audit ledger.
- A Kubernetes operator with 8 CRDs (apiVersion: mcpg.dev/v1alpha2) for declarative install and lifecycle. See Kubernetes install.
Design principles
A few decisions recur throughout the system. They're worth holding in mind:
- Backends are the product. Integrations are operator-defined config, not code plugins you write.
- Protocol authority. The gateway owns MCP legality; backends propose, the gateway validates.
- Fail closed. Identity-verification failures, policy denials, and provider errors all deny by default.
- CEL for policy. Authorization runs on CEL expressions — not Rego, not a bespoke DSL.
- Explicit taxonomy. 27 backend kinds and 18 pipeline step kinds are an enumerated, reviewed set — adding one requires a clear reason.
- Cluster-backbone state. One cluster.kind selects the primitive every capability inherits, with per-capability overrides for the exceptions.
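The fail-closed principle is easiest to see in identity resolution. The sketch below assumes one particular reading: a resolver that finds no credential steps aside, but a credential that is present and fails verification ends the request instead of falling through to anonymous. The resolver functions, header names, and the `VerificationFailed` exception are all hypothetical, and the token check is a toy stand-in for real OIDC verification.

```python
class VerificationFailed(Exception):
    """A credential was presented but could not be verified."""


def oidc_resolver(request: dict):
    header = request["headers"].get("authorization")
    if header is None:
        return None  # no credential: let the next resolver try
    if header != "Bearer good":  # toy stand-in for token verification
        raise VerificationFailed("bearer token rejected")
    return {"subject": "alice"}


def header_resolver(request: dict):
    user = request["headers"].get("x-user")  # hypothetical header name
    return {"subject": user} if user else None


RESOLVERS = [("oidc", oidc_resolver), ("header", header_resolver)]


def resolve_identity(request: dict, resolvers) -> dict:
    """Try resolvers in order; fall back to anonymous only if none apply."""
    for name, resolver in resolvers:
        identity = resolver(request)  # may raise VerificationFailed
        if identity is not None:
            return {"source": name, **identity}
    return {"source": "anonymous", "subject": None}


def handle(request: dict, resolvers=RESOLVERS) -> dict:
    try:
        identity = resolve_identity(request, resolvers)
    except VerificationFailed as exc:
        # Fail closed: a bad credential denies the call outright.
        return {"status": 401, "error": str(exc)}
    return {"status": 200, "identity": identity}
```

The design point is that a failure and an absence are different outcomes: only the absence of every credential yields an anonymous caller, and that caller still has to clear the trust floor on each binding.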
Where to go next
- Governance model — the access → policy → approvals → audit chain every call flows through.
- Plugins and the plugin protocol — how the extension surface actually works.
- Protocol versions — running two MCP revisions on one listener.
- Configuration reference — every key, from the live schema.