Architecture

How MCPG is built — a thin protocol-authority host, an operator-defined backend layer, and a plugin system that loads every cross-cutting concern across a stable binary boundary.

MCPG is a Model Context Protocol Gateway: a protocol-authority gateway that sits between MCP clients (agents, IDEs, orchestrators) and the downstream systems those clients want to reach. It is one Rust runtime — roughly 107,000 lines — that owns the MCP protocol, resolves identity, authorizes every call, dispatches to a backend, and records what happened.

This page is the mental model. For the exact config keys, see the configuration reference. For a plain-language intro, start with What is MCPG?.

The one-sentence version

The gateway core is a thin host. Everything with a policy, an upstream, or a side-effect is a plugin loaded across a stable binary boundary — and the only thing the core insists on owning is MCP protocol legality.

That single design choice explains most of what follows. The gateway does not hard-wire a backend, an authorization engine, or an audit format. It provides a plugin framework (loading, capability enforcement, lifecycle) and a dispatch pipeline (the ordered places where plugins run during a request), and it asserts protocol authority over the wire.

What the core owns

The host keeps a deliberately small set of responsibilities:

  • MCP protocol handling — JSON-RPC 2.0 over Streamable HTTP + SSE (and stdio). The gateway negotiates both the 2025-11-25 release revision and the DRAFT-2026-v1 draft revision on the same listener. See Protocol versions.
  • Session lifecycle — creation, SSE streaming, resumption, termination (for the stateful revision; the modern revision is stateless).
  • Identity resolution — establishing who the caller is, fail-closed.
  • Pre-dispatch authorization — a trust-level floor plus CEL expressions, before any backend runs.
  • Capability registration and discovery — the operator-declared tools, prompts, resources, and resource templates that show up in tools/list.
  • Execution dispatch — routing the call to the right backend.
  • Pipeline orchestration — multi-step tool calls with suspension and resumption.
  • Distributed coordination — session / pipeline / task / subscription state, and delivery / cancellation buses.
  • Observability — structured logs, Prometheus metrics, OpenTelemetry traces.

Everything else — the backends, the policy engines, the audit sinks, the identity providers, the payment gates — is a plugin.
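To make "identity resolution, fail-closed" concrete, here is a minimal sketch in Rust. It assumes the sources are tried in the order the request flow on this page lists (OIDC → JWKS → plugin → header → anonymous), and that a credential which is presented but fails verification denies the request rather than falling through to a weaker source. The `Attempt` and `Resolution` types and the `resolve` function are hypothetical names for illustration, not MCPG's actual API.

```rust
/// Outcome of asking one identity source about a request (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum Attempt {
    /// No credential of this kind was presented; try the next source.
    NotPresented,
    /// A credential was presented and verified; carries the subject.
    Verified(String),
    /// A credential was presented but failed verification.
    Failed(String),
}

/// Final decision for the request (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum Resolution {
    Caller(String),
    Anonymous,
    Denied(String),
}

/// Walk the sources in order. A verification failure denies immediately
/// instead of falling through to a weaker source (assumed fail-closed rule).
fn resolve(attempts: &[(&str, Attempt)]) -> Resolution {
    for (source, attempt) in attempts {
        match attempt {
            Attempt::NotPresented => continue,
            Attempt::Verified(subject) => {
                return Resolution::Caller(format!("{source}:{subject}"));
            }
            Attempt::Failed(reason) => {
                return Resolution::Denied(format!("{source}: {reason}"));
            }
        }
    }
    // Only a request that presented no credentials at all ends up anonymous.
    Resolution::Anonymous
}
```

The point of the sketch is the asymmetry: "nothing presented" moves on to the next source, while "presented and invalid" stops the request, so a bad token can never be downgraded into an anonymous call.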

The request flow

A single tools/call traverses a fixed pipeline. The shape is the load-bearing idea; the exact stage list lives in the security model article.

Client HTTP request
  │
  ▼
HTTP transport            parse headers, validate Origin/Accept,
                          resolve identity (OIDC → JWKS → plugin →
                          header → anonymous), build RequestContext
  │
  ▼
Gateway runtime           session create/load/validate,
                          route the operation, match the capability
  │
  ▼
Plugin chain (pre)        tool-gate chain: policy, payment, guardrails,
                          rate-limit, custom — first Deny/Challenge wins
  │  Allow
  ▼
Execution dispatcher      route to the backend by kind; execute one of
                          27 backend kinds, or run a multi-step pipeline
  │
  ▼
Plugin chain (post)       post-dispatch gates + result transforms
                          (PII masking, schema migration)
  │
  ▼
Gateway runtime           wrap in a JSON-RPC response;
                          stream via SSE with replayable event IDs

The two plugin chains — pre and post — are where governance happens. A backend never sees a request the policy gate denied.
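The "first Deny/Challenge wins" rule of the pre-dispatch chain can be sketched as a short-circuiting loop. This is an illustration only: the `Verdict`, `Request`, and `Gate` types, the three example gates, and their thresholds are all assumptions made up for the example, not MCPG's plugin interface.

```rust
/// What a gate can say about a request (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum Verdict {
    Allow,
    Deny(String),
    Challenge(String),
}

/// The few request facts the example gates look at (hypothetical fields).
struct Request {
    tool: String,
    trust_level: u8,
    calls_this_minute: u32,
}

/// A gate is modelled as a plain function from a request to a verdict.
type Gate = fn(&Request) -> Verdict;

fn policy_gate(req: &Request) -> Verdict {
    if req.trust_level >= 2 {
        Verdict::Allow
    } else {
        Verdict::Deny("trust too low".to_string())
    }
}

fn payment_gate(req: &Request) -> Verdict {
    if req.tool == "premium_report" {
        Verdict::Challenge("payment required".to_string())
    } else {
        Verdict::Allow
    }
}

fn rate_limit_gate(req: &Request) -> Verdict {
    if req.calls_this_minute > 60 {
        Verdict::Deny("rate limit exceeded".to_string())
    } else {
        Verdict::Allow
    }
}

/// Run the gates in order and stop at the first verdict that is not Allow.
/// Later gates, and the backend, never run for a denied request.
fn run_pre_chain(gates: &[Gate], req: &Request) -> Verdict {
    for gate in gates {
        let verdict = gate(req);
        if verdict != Verdict::Allow {
            return verdict;
        }
    }
    Verdict::Allow
}
```

With the chain `[policy_gate, payment_gate, rate_limit_gate]`, a low-trust caller asking for `premium_report` is denied by the policy gate, and the payment and rate-limit gates are never consulted.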

Three layers, three audiences

1. The protocol layer (the host)

The host is the part that speaks MCP. It validates protocol legality so backends don't have to: a backend proposes an outcome, and the gateway decides whether that outcome is a legal MCP response. This is why MCPG is a gateway and not a proxy — it owns the protocol contract with the client.

2. The backend layer (the product)

Every downstream integration is an operator-defined backend. There is no code to write for a new integration — you declare a binding in config, and it surfaces as an MCP tool, prompt, resource, or resource template. Each binding picks an implementation with a nested backend.kind: discriminator.

There are 27 backend kinds:

  • 10 general-purpose: http, command, nats, grpc, graphql, kafka, sql, openapi, mock, and pipeline.
  • 17 LLM/media: the {openai, azure_openai, anthropic, gemini, compat, stability} providers across chat, embedding, image, tts, and stt surfaces (e.g. openai_chat, anthropic_chat, gemini_embedding).

A pipeline backend composes several steps into one tool call, drawing from 18 pipeline step kinds — including suspending steps (elicitation, sampling, roots_list, gather) that pause the call, ask the client a question, and resume on the answer. The full taxonomy lives in the configuration reference.
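The suspend-and-resume behaviour of a pipeline can be pictured as a small state machine. The sketch below is a simplification under stated assumptions: it models only two step kinds (a computing step and an elicitation-style suspending step), and the names `Step`, `Progress`, `drive`, and `greet` are hypothetical. The real step taxonomy and persistence format are not shown on this page.

```rust
/// One step of a hypothetical pipeline.
enum Step {
    /// Runs to completion, reading earlier outputs and producing a new one.
    Compute(fn(&[String]) -> String),
    /// Suspends the call until the client answers the prompt.
    Elicit(&'static str),
}

/// How far the pipeline got.
#[derive(Debug, PartialEq)]
enum Progress {
    /// Paused at `step_index`; `outputs` is the state to keep until resumption.
    Suspended { step_index: usize, prompt: String, outputs: Vec<String> },
    /// Every step ran; carries the accumulated outputs.
    Done(Vec<String>),
}

/// Run steps from `start`. On resumption, `answer` holds the client's reply
/// to the suspending step at `start`; on a fresh call it is `None`.
fn drive(
    steps: &[Step],
    start: usize,
    mut outputs: Vec<String>,
    mut answer: Option<String>,
) -> Progress {
    for (index, step) in steps.iter().enumerate().skip(start) {
        match step {
            Step::Compute(run) => {
                let out = run(&outputs);
                outputs.push(out);
            }
            Step::Elicit(prompt) => match answer.take() {
                // The client's reply becomes this step's output.
                Some(reply) => outputs.push(reply),
                // No reply yet: hand back everything needed to resume later.
                None => {
                    return Progress::Suspended {
                        step_index: index,
                        prompt: prompt.to_string(),
                        outputs,
                    };
                }
            },
        }
    }
    Progress::Done(outputs)
}

/// Example computing step: greet whoever the previous step named.
fn greet(outputs: &[String]) -> String {
    format!("hello, {}", outputs.last().map(String::as_str).unwrap_or("stranger"))
}
```

A first call to `drive` with no answer stops at the elicitation step and returns `Suspended`. A second call with the saved `step_index`, the saved `outputs`, and the client's reply picks up where the first one left off.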

Every binding carries an MCP descriptor (name, description, input schema), governance controls (minimum_trust, cel_allow_if), and optional JSON Schema validation — so authorization and validation are properties of the binding, not afterthoughts.
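One way to picture a binding is as a record that carries its backend kind and its governance controls together. The sketch below is illustrative and makes several assumptions: only four of the 27 kinds are listed, `minimum_trust` is modelled as a small integer, and the `cel_allow_if` expression is replaced by a plain Rust predicate because evaluating CEL is out of scope here. The `Binding`, `Caller`, `parse_kind`, and `authorize` names are hypothetical.

```rust
/// A few of the backend kinds named on this page; the real set has 27.
#[derive(Debug, PartialEq)]
enum BackendKind {
    Http,
    Sql,
    Pipeline,
    OpenaiChat,
}

/// Map a `backend.kind` discriminator string to a variant.
fn parse_kind(kind: &str) -> Option<BackendKind> {
    match kind {
        "http" => Some(BackendKind::Http),
        "sql" => Some(BackendKind::Sql),
        "pipeline" => Some(BackendKind::Pipeline),
        "openai_chat" => Some(BackendKind::OpenaiChat),
        _ => None, // unknown kinds are rejected, never guessed
    }
}

/// Caller facts used by the example checks (hypothetical fields).
struct Caller {
    trust: u8,
    team: String,
}

/// A binding with its governance controls attached (hypothetical shape).
struct Binding {
    name: String,
    kind: BackendKind,
    /// Assumed to be numeric for this sketch.
    minimum_trust: u8,
    /// Stand-in for `cel_allow_if`: a Rust predicate, not a CEL expression.
    allow_if: fn(&Caller) -> bool,
}

/// Pre-dispatch check: trust floor first, then the allow-if predicate.
fn authorize(binding: &Binding, caller: &Caller) -> Result<(), String> {
    if caller.trust < binding.minimum_trust {
        return Err(format!(
            "{}: trust {} is below the floor {}",
            binding.name, caller.trust, binding.minimum_trust
        ));
    }
    if !(binding.allow_if)(caller) {
        return Err(format!("{}: allow-if predicate rejected the caller", binding.name));
    }
    Ok(())
}

/// Example predicate: only the data team may call this tool.
fn is_data_team(caller: &Caller) -> bool {
    caller.team == "data"
}
```

Because the checks hang off the binding itself, there is no way in this model to dispatch to a backend without first passing through `authorize`.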

3. The plugin layer (the extension surface)

All cross-cutting behavior loads as plugins across a C-stable binary boundary. Backends themselves are plugins; so are policy engines, audit sinks, identity providers, reliability gates, and payment flows. The plugin model — the ABI, the entity kinds, the capability and trust model — has its own page: Plugins and the plugin protocol.

The mcpg binary has no direct dependency on async-nats, rdkafka, or redis for backend or transport code. Those live entirely in plugin crates, loaded at startup and dispatched through traits. The gateway ships as a customizable host, not a monolith.

State and clustering

A single deployment runs on one cluster backend, selected by cluster.kind (single_node, redis, nats, consul, or etcd). Every stateful capability — sessions, pipelines, tasks, subscriptions, and the delivery / cancellation buses — inherits its key-value or pub/sub primitive from that backend by default. Operators who want finer control can override a single capability's store or bus without touching the rest.

  • single_node (the default) keeps everything in memory (or on disk), which is all a single-instance deployment needs.
  • redis / nats add the shared key-value store and pub/sub that let multiple gateway instances coordinate behind a load balancer.
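The "inherit by default, override per capability" rule amounts to a lookup with a fallback. The following sketch assumes a simple in-memory representation; `ClusterConfig`, `Capability`, and the `with_override` and `backend_for` methods are hypothetical names, and whether a given mix of backends is actually supported is a question for the configuration reference.

```rust
use std::collections::HashMap;

/// The cluster backends listed for `cluster.kind`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ClusterKind {
    SingleNode,
    Redis,
    Nats,
    Consul,
    Etcd,
}

/// Stateful capabilities that inherit from the cluster backend.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Capability {
    Sessions,
    Pipelines,
    Tasks,
    Subscriptions,
}

/// Deployment-wide default plus optional per-capability overrides.
struct ClusterConfig {
    kind: ClusterKind,
    overrides: HashMap<Capability, ClusterKind>,
}

impl ClusterConfig {
    fn new(kind: ClusterKind) -> Self {
        Self { kind, overrides: HashMap::new() }
    }

    /// Override a single capability without touching the rest.
    fn with_override(mut self, capability: Capability, kind: ClusterKind) -> Self {
        self.overrides.insert(capability, kind);
        self
    }

    /// Use the override if one exists, otherwise the cluster-wide default.
    fn backend_for(&self, capability: Capability) -> ClusterKind {
        self.overrides.get(&capability).copied().unwrap_or(self.kind)
    }
}
```

A config built with `ClusterConfig::new(ClusterKind::Redis).with_override(Capability::Sessions, ClusterKind::Nats)` resolves sessions to `Nats` and every other capability to `Redis`.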

This is what makes MCPG horizontally scalable without making the simple case heavy. See Kubernetes install for the HA story.

Single deployment, optional fleet

MCPG is a single-deployment product — one gateway instance (or HA cluster) per environment. There is no multi-tenant runtime overlay baked into the core; soft multi-tenancy is an operator-level concern layered on top (see Multi-tenant deployments).

For teams running many gateways, the surrounding surfaces fill the gaps:

  • A control plane for fleet management — org/workspace/environment hierarchy, instance enrollment, versioned plugin sets, and a tamper-evident audit ledger.
  • A Kubernetes operator with 8 CRDs (apiVersion: mcpg.dev/v1alpha2) for declarative install and lifecycle. See Kubernetes install.

Design principles

A few decisions recur throughout the system. They're worth holding in mind:

  1. Backends are the product. Integrations are operator-defined config, not code plugins you write.
  2. Protocol authority. The gateway owns MCP legality; backends propose, the gateway validates.
  3. Fail closed. Identity-verification failures, policy denials, and provider errors all deny by default.
  4. CEL for policy. Authorization runs on CEL expressions — not Rego, not a bespoke DSL.
  5. Explicit taxonomy. 27 backend kinds and 18 pipeline step kinds are an enumerated, reviewed set — adding one requires a clear reason.
  6. Cluster-backbone state. One cluster.kind selects the primitive every capability inherits, with per-capability overrides for the exceptions.

Where to go next