Deployment topologies

The four shapes a production MCPG fleet lands on — single-node, single-instance with shared state, multi-replica HA via Redis or NATS, and air-gapped — with the config that distinguishes each.

MCPG ships one binary that scales from a laptop to a multi-replica HA fleet. The shape you run is decided by two config blocks: cluster (which coordinator backs shared state) and governance.access (whether requests are authenticated). Everything else — bindings, audit, observability — is the same across topologies.

Each topology below maps to a canonical template you can generate and validate locally. The templates are kept in lockstep with the live AppConfig schema by CI, so they always boot. For the full key-by-key config reference, see Configuration reference.

Generate a starting template

bash

mcpg-config-init --template <T> --output ./config.yaml
mcpg-config-check ./config.yaml

<T> is one of dev-single-node, production-single-redis, production-redis-cluster, production-nats-cluster, air-gapped, or multi-tenant. Treat each as a base layer: copy it and edit in place, or keep it untouched and layer environment overrides on top via multi-file config:

bash

MCPG_CONFIG=./config.yaml:./local-overrides.yaml mcpg

The override file only declares the fields it changes; the base merges in the rest.
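
As a minimal sketch, a local-overrides.yaml layered over the production-single-redis template might contain nothing but the two values below. Both keys appear in the examples on this page; the values themselves are hypothetical. How list-valued fields such as allowed_origins merge is not stated here, so check Configuration reference before overriding one.

```yaml
# local-overrides.yaml (hypothetical contents)
# Only the fields that differ from the base are declared.
gateway:
  server:
    bind_address: "127.0.0.1:9797"   # hypothetical local port

cluster:
  key_prefix: "mcpg:staging:"        # hypothetical prefix; kind and url come from the base
```

Everything not listed (TLS paths, governance, the Redis URL) is expected to come from ./config.yaml unchanged.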

The four shapes

Topology	Template	Coordinator	Auth	Use when
Single node	`dev-single-node`	`single_node` (built-in)	none	Local dev, CI, demos on loopback.
Single instance	`production-single-redis`	`redis`	OIDC	One pod, but state survives restarts and the pod drains cleanly behind a load balancer.
Multi-replica HA	`production-redis-cluster` / `production-nats-cluster`	`redis` / `nats`	OIDC	N replicas behind an LB; any replica answers any session.
Air-gapped	`air-gapped`	`single_node` (or internal `redis`)	static JWKS	Zero-outbound networks; pre-staged plugins + local IdP.

Single node

The default. cluster.kind: single_node installs the in-process coordinator — sessions, tasks, and pub/sub all live in the gateway's memory. No external dependency, no network hop. This is the right choice for local development and for genuinely single-instance workloads that can tolerate losing in-flight session state on restart.

yaml

gateway:
  server:
    bind_address: "127.0.0.1:8787"
    allowed_origins: []

cluster:
  kind: single_node

mcp:
  capabilities:
    tools:
      - name: dev.mock.echo
        description: Echo a JSON value back as the tool result.
        backend:
          kind: mock
          response: { ok: true }

Anonymous identity is acceptable on loopback only. As soon as the listener binds to a non-loopback address such as 0.0.0.0, wire governance.access (see Identity setup).

Single instance with shared state

Same one-pod deployment, but the cluster primitives are externalised to Redis. The win is durability: capability state (sessions, tasks, subscriptions) survives a restart, so the pod can be upgraded while the load balancer drains its connections, without dropping sessions.

yaml

gateway:
  server:
    bind_address: "0.0.0.0:8787"
    allowed_origins:
      - "https://gateway.example.com"
    tls:
      cert_path: "/etc/mcpg/certs/server.crt"
      key_path: "/etc/mcpg/certs/server.key"

cluster:
  kind: redis
  url: "${env.MCPG_REDIS_URL}"
  key_prefix: "mcpg:prod:"

governance:
  access:
    oidc_oauth:
      providers:
        - issuer: "https://example.okta.com"
          audiences: ["mcpg-gateway"]
          verification:
            kind: oidc_jwks
            allowed_algs: ["RS256"]
  audit:
    enabled: true
    required: true
    on_failure: fail_closed
    sinks:
      - kind: dev.mcpg.builtin.audit.local-file
        config: { path: "/var/log/mcpg/audit.log" }

Capabilities inherit the Redis connection from the cluster block automatically — you do not re-declare per-capability store: / bus: overrides unless you deliberately want a capability to run in-process. See Clustering for the full per-backend key list and the inheritance model.

Multi-replica HA

The production shape. Run N gateway pods behind a load balancer; every replica coordinates through the same external backend, so a session opened on replica A can be answered — and a server-initiated notifications/cancelled delivered — by replica B. Pick Redis or NATS by what your platform already runs:

Redis — lower operational overhead if you already run it. KV + pub/sub + leader election in one component.
NATS JetStream — pick when NATS is already in your stack. JetStream provides KV (state), the pub/sub bus, and leases in one cluster.

What changes from single-instance is that the delivery and cancellation buses inherit the cluster's pub/sub primitive. You do not override them — leaving delivery.bus and cancellation.bus unset is what makes a notifications/cancelled published on one replica reach the active SSE stream on another. Pinning them to kind: memory would silently break cross-replica delivery.

yaml

cluster:
  kind: redis
  url: "${env.MCPG_REDIS_URL}"
  key_prefix: "mcpg:prod:"

# delivery + cancellation intentionally omitted — they inherit the
# cluster's Redis pub/sub so server-initiated messages cross replicas.
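
For contrast, the misconfiguration warned about above would look roughly like the fragment below. The top-level placement of delivery and cancellation is an assumption read off the dotted paths delivery.bus and cancellation.bus; only kind: memory is taken from the text. Check Clustering for the exact nesting before relying on it.

```yaml
# Anti-pattern for multi-replica deployments: do not ship this.
# Assumption: delivery and cancellation sit at the top level of the config.
cluster:
  kind: redis
  url: "${env.MCPG_REDIS_URL}"
  key_prefix: "mcpg:prod:"

delivery:
  bus:
    kind: memory   # pins delivery to this process; other replicas never see it
cancellation:
  bus:
    kind: memory   # a cancellation published here stays on this replica
```

Removing both blocks restores inheritance from the cluster's pub/sub.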

On Kubernetes this is a Deployment with replicas: N, an HPA, and a PodDisruptionBudget. The audit ledger is written per-replica to a host-mounted path; a log-shipping sidecar (Fluent Bit, Vector, Filebeat) forwards each replica's file to a central SIEM so the trail is greppable across all pods. See Kubernetes operator for the Helm-driven version and Clustering for the backend config.
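
A minimal sketch of that shape in plain manifests, for orientation only. The resource names, image reference, Secret, and ConfigMap are all hypothetical, and the sketch assumes the container's entrypoint runs mcpg and reads MCPG_CONFIG. The HPA and the log-shipping sidecar are omitted for brevity.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcpg-gateway                     # hypothetical name
spec:
  replicas: 3                            # the "N" in the text
  selector:
    matchLabels:
      app: mcpg-gateway
  template:
    metadata:
      labels:
        app: mcpg-gateway
    spec:
      containers:
        - name: gateway
          image: registry.example.com/mcpg:1.0.0   # hypothetical image
          ports:
            - containerPort: 8787
          env:
            - name: MCPG_CONFIG
              value: /etc/mcpg/config.yaml
            - name: MCPG_REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: mcpg-redis       # hypothetical Secret
                  key: url
          volumeMounts:
            - name: config
              mountPath: /etc/mcpg/config.yaml
              subPath: config.yaml
              readOnly: true
            - name: audit
              mountPath: /var/log/mcpg   # per-replica audit file lands here
      volumes:
        - name: config
          configMap:
            name: mcpg-config            # hypothetical ConfigMap holding config.yaml
        - name: audit
          hostPath:
            path: /var/log/mcpg
            type: DirectoryOrCreate
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mcpg-gateway
spec:
  minAvailable: 2                        # keep two replicas up during voluntary disruptions
  selector:
    matchLabels:
      app: mcpg-gateway
```

With three replicas and minAvailable: 2, a node drain evicts at most one gateway pod at a time.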

Air-gapped

Zero-outbound: no OCI plugin pulls, no remote OIDC discovery, no remote audit sink. Everything that would normally call out is disabled or pointed at a local artefact.

Plugins are pre-staged on disk and referenced via source.path (not source.oci).
Identity uses a static, pre-staged JWKS file rather than remote discovery.
Storage and audit are file-backed.
cluster.kind: single_node avoids network dependencies — swap to an internal redis only if the sealed network runs its own Redis.
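
A partial sketch of how those choices look in config, reusing only keys that appear in the earlier examples on this page. The plugin source.path entries and the static JWKS settings are left as comments because their exact keys are not shown here; read them from the generated air-gapped template.

```yaml
# Partial air-gapped sketch, not a complete config.
cluster:
  kind: single_node          # no coordinator reachable over the network

governance:
  # access: static, pre-staged JWKS file; keys omitted (see the air-gapped template)
  audit:
    enabled: true
    sinks:
      - kind: dev.mcpg.builtin.audit.local-file   # file-backed, no remote sink
        config: { path: "/var/log/mcpg/audit.log" }

# plugins: referenced via source.path on local disk, never source.oci
```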

For the operator path, the air-gap story extends to an in-cluster OCI mirror (MCPGPluginMirror) and an offline Sigstore trust root — see Kubernetes operator.

Validating before you ship

Every config is checked against the live AppConfig schema:

bash

mcpg-config-check ./config.yaml
# ✓ ./config.yaml: valid (3 bindings, audit on, observability on, cluster: redis)

mcpg-config-check validates the shape of the gateway config only. Per-plugin connection config (Redis URL scheme, NATS server URLs, cluster lease TTLs) is parsed and validated by each plugin at boot, because the cluster block flows through to the coordinator plugin verbatim. A config can therefore pass the check and still fail at startup on a bad backend setting. See Clustering for the exact per-backend keys.

What's next

Clustering — per-backend coordinator config (Redis / NATS / Consul / etcd)
Observability — metrics, traces, logs, and the binding prober
Kubernetes operator — Helm install + the eight CRDs
Configuration reference — every config key