Clustering
How MCPG replicas share state. The cluster coordinator selection (single_node, redis, nats, consul, etcd), the real per-backend config keys, and the primitive-inheritance model that makes single-node work out of the box.
A multi-replica MCPG fleet needs shared state: which replica owns a session, where to fan
out a notifications/cancelled message, who holds a lease. The top-level cluster block selects
one coordinator backend that provides all of it. This page documents the real per-backend
config keys: the cluster block flows verbatim to the coordinator plugin, which validates it
at boot.
The coordinator model
The cluster coordinator is the unified backbone for multi-instance state and coordination. A single backend internally provides four primitives:
- KeyValueStore — session / task / subscription / pipeline state.
- PubSub — delivery + cancellation fan-out across replicas.
- Lease — leader election and distributed locks (fencing tokens).
- Watch — peer presence and change notification.
Capabilities (sessions, pipelines, tasks, subscriptions, delivery, cancellation) inherit
those primitives by default. That inheritance is the whole trick: because every capability
defaults to "use the cluster coordinator," a single-node deployment works with no extra
config — cluster.kind: single_node installs an in-process coordinator and the capabilities
transparently use its in-memory primitives.
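The inheritance rule can be pictured with a small model. This is an illustrative sketch only, not MCPG's implementation: the names `InMemoryCoordinator`, `Capability`, and `resolve_store` are hypothetical and exist only to show how "inherit unless overridden" behaves.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class InMemoryCoordinator:
    """Hypothetical stand-in for the in-process single_node coordinator."""
    kv: Dict[str, str] = field(default_factory=dict)  # KeyValueStore primitive


@dataclass
class Capability:
    """Hypothetical capability (sessions, tasks, ...) with an optional override."""
    name: str
    store_override: Optional[Dict[str, str]] = None  # None means "inherit"

    def resolve_store(self, coordinator: InMemoryCoordinator) -> Dict[str, str]:
        # With no override, fall through to the coordinator's primitive.
        if self.store_override is None:
            return coordinator.kv
        return self.store_override


coordinator = InMemoryCoordinator()
sessions = Capability("sessions")                 # inherits
tasks = Capability("tasks")                       # inherits
pinned = Capability("pinned", store_override={})  # uses its own private store

sessions.resolve_store(coordinator)["session:abc"] = "replica-1"
shared = "session:abc" in tasks.resolve_store(coordinator)
isolated = "session:abc" not in pinned.resolve_store(coordinator)
```

Both inheriting capabilities see the same entry because they resolve to the same coordinator primitive, while the overridden one does not.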
Selecting a backend
cluster:
  kind: redis # single_node | redis | nats | consul | etcd
  # ...kind-specific keys below
| kind | Plugin id | When to pick |
|---|---|---|
| single_node (default) | built-in, no plugin | Single instance. In-process; no external dependency. |
| redis | dev.mcpg.cluster.redis | You already run Redis. Lowest operational overhead. |
| nats | dev.mcpg.cluster.nats | NATS is already in your stack. JetStream KV + pub/sub + leases. |
| consul | dev.mcpg.cluster.consul | You run Consul for service discovery / KV. |
| etcd | dev.mcpg.cluster.etcd | You run etcd (e.g. alongside Kubernetes' own). |
single_node is the default when cluster is omitted — it ignores every other field in the
block. Any other kind maps to a dev.mcpg.cluster.<kind> cdylib plugin that must be
declared under the top-level plugins[] array. The inline cluster.* fields are the single
source of truth for the coordinator's runtime config — they replace any config: block on the
matching plugins[] entry, so you keep the cdylib location (source.oci / source.path) in
plugins[] and the operational knobs in cluster.
If cluster.kind selects a plugin id with no matching plugins[] entry, the gateway fails
fast at boot.
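The kind-to-plugin mapping and the fail-fast rule can be approximated in a pre-flight script. This is a sketch under stated assumptions, not the gateway's own check: it assumes each plugins[] entry names its plugin under an `id` key, and the function names and the example path are hypothetical.

```python
from typing import Any, Dict, Optional


def required_plugin_id(kind: str) -> Optional[str]:
    """Return the plugin id a cluster kind needs, or None for single_node."""
    if kind == "single_node":
        return None
    return f"dev.mcpg.cluster.{kind}"


def check_cluster_plugin(config: Dict[str, Any]) -> None:
    """Raise ValueError when cluster.kind has no matching plugins[] entry."""
    # An omitted cluster block means single_node.
    kind = config.get("cluster", {}).get("kind", "single_node")
    needed = required_plugin_id(kind)
    if needed is None:
        return
    # ASSUMPTION: each plugins[] entry carries its plugin id under "id".
    declared = {entry.get("id") for entry in config.get("plugins", [])}
    if needed not in declared:
        raise ValueError(
            f"cluster.kind={kind!r} needs a plugins[] entry for {needed}"
        )


# Hypothetical parsed config; the source.path value is a placeholder.
ok_config = {
    "cluster": {"kind": "redis"},
    "plugins": [
        {"id": "dev.mcpg.cluster.redis",
         "source": {"path": "/opt/mcpg/plugins/cluster-redis.so"}},
    ],
}
check_cluster_plugin(ok_config)  # passes silently
```

Running a check like this in CI surfaces a missing plugins[] entry before the gateway refuses to boot.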
A note on validation layers
mcpg-config-check validates the gateway's AppConfig shape. The cluster block's
kind-specific fields are passed through as an opaque JSON map and are parsed by the
coordinator plugin at boot, where each plugin enforces deny_unknown_fields. The practical
consequence: a key that mcpg-config-check accepts can still be rejected by the plugin at
startup. Use the exact key names below — they are taken directly from each plugin's config
schema.
Redis (dev.mcpg.cluster.redis)
cluster:
  kind: redis
  url: "${env.MCPG_REDIS_URL}" # required: redis:// or rediss://
  key_prefix: "mcpg:cluster:" # namespace per deployment
  lease_ttl_ms: 30000 # default lock / leadership TTL
  peer_ttl_ms: 60000 # per-instance peer-presence TTL
| Key | Default | Notes |
|---|---|---|
| url | (required) | redis://… or rediss://…. Any other scheme is rejected. |
| key_prefix | mcpg:cluster: | Prepended to every owned key. Give each gateway deployment sharing one Redis a distinct prefix. |
| lease_ttl_ms | 30000 | Default TTL for acquire_lock / acquire_leadership when the caller passes none. |
| peer_ttl_ms | 60000 | TTL on the per-instance peer-presence key; a missed refresh expires the peer. |
| peer_refresh_interval_ms | peer_ttl_ms / 2 | How often the background refresher re-registers the peer. |
| lease_renew_before_expiry_percent | 80 | Renewal fires at this fraction of TTL elapsed. Clamped to 1..=99. |
| subscribe_pattern_buffer | 256 | Buffer for subscriber + peer-event streams. |
| node_id | synthesised | Stable node id. Defaults to service_name-$HOSTNAME, falling back to a random suffix. |
| service_name | mcpg | Logical service name; prefix for the synthesised node_id. |
There is no pool_size key on the Redis coordinator — connection pooling is internal.
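The Redis rules above (unknown keys rejected, scheme restricted, timings derived from the TTLs) can be mirrored in a small pre-flight check. This is a sketch, not the plugin's validator: the function name is hypothetical, it assumes `kind` travels with the block, and it assumes the URL has already been through `${env.…}` interpolation.

```python
from typing import Any, Dict
from urllib.parse import urlparse

# Key names copied from the Redis table; "kind" is assumed to be allowed too.
REDIS_KEYS = {
    "kind", "url", "key_prefix", "lease_ttl_ms", "peer_ttl_ms",
    "peer_refresh_interval_ms", "lease_renew_before_expiry_percent",
    "subscribe_pattern_buffer", "node_id", "service_name",
}


def check_redis_block(block: Dict[str, Any]) -> Dict[str, int]:
    """Reject unknown keys and bad URL schemes, then return derived timings."""
    unknown = sorted(set(block) - REDIS_KEYS)
    if unknown:
        raise ValueError(f"unknown key(s): {unknown}")
    if "url" not in block:
        raise ValueError("url is required")
    if urlparse(block["url"]).scheme not in ("redis", "rediss"):
        raise ValueError("url must use redis:// or rediss://")

    lease_ttl = block.get("lease_ttl_ms", 30000)
    peer_ttl = block.get("peer_ttl_ms", 60000)
    percent = block.get("lease_renew_before_expiry_percent", 80)
    percent = max(1, min(99, percent))  # documented clamp 1..=99
    return {
        # Documented default: half of the peer TTL.
        "peer_refresh_interval_ms": block.get(
            "peer_refresh_interval_ms", peer_ttl // 2
        ),
        # Renewal fires once this much of the lease TTL has elapsed.
        "renew_after_ms": lease_ttl * percent // 100,
    }


timings = check_redis_block({"kind": "redis", "url": "redis://localhost:6379"})
```

With the documented defaults this yields a 30000 ms peer refresh interval and a lease renewal 24000 ms after acquisition; a block containing `pool_size` raises instead.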
NATS JetStream (dev.mcpg.cluster.nats)
NATS uses a list of servers and a structured jetstream block — not a single url /
bucket.
cluster:
  kind: nats
  servers:
    - "${env.MCPG_NATS_URL}" # nats:// | tls:// | nats+tls:// | ws:// | wss://
  node:
    id: "${env.HOSTNAME}" # required; unique per replica
  jetstream:
    replicas: 3 # 1 for dev, 3 for a real NATS cluster
    storage: file # file (durable) | memory
| Key | Default | Notes |
|---|---|---|
| servers | (required) | One or more NATS URLs; the client load-balances + reconnects across them. |
| node.id | (required) | Unique per replica; pin to the pod / hostname. |
| node.heartbeat_interval_sec | 10 | Heartbeat publish cadence. |
| node.peer_expiry_sec | 30 | Peer reclassified Unreachable after this gap. Must be > heartbeat interval. |
| jetstream.replicas | 1 | KV/stream replication factor. Set 3 on a 3+ node NATS cluster. |
| jetstream.storage | file | file survives NATS restarts; memory is faster but loses lease state. |
| jetstream.leases_bucket | mcpg-leases | KV bucket for leases + locks. |
| jetstream.fencing_bucket | mcpg-fencing | KV bucket for fencing-token counters. |
| jetstream.notifications_stream | mcpg-notifications | Stream for pub/sub fan-out. |
| jetstream.state_bucket | mcpg-state | KV bucket for capability (session/task/…) state. |
| jetstream.domain | none | JetStream domain (leaf-node segmentation). |
| lease.default_ttl_sec | 30 | Default lease TTL (seconds). |
| lease.renew_before_expiry_percent | 50 | Renewal point as a fraction of TTL. Range 1..=99. |
| auth | none | { method: token \| user_password \| credentials_file, … }. |
| tls | none | { ca_cert, verify_peer } for tls:// / nats+tls:// URLs. |
| connection.connect_timeout_ms | 5000 | Connect deadline. |
| connection.operation_timeout_ms | 10000 | Per-operation deadline. |
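A few of the NATS constraints in the table are easy to check before deploying. The following is a minimal sketch, not the plugin's validator: `check_nats_block` is a hypothetical helper, it covers only the servers, node.id, and heartbeat/expiry rules, and it assumes the URLs have already been interpolated.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse

# Schemes listed in the example config for cluster.servers.
NATS_SCHEMES = {"nats", "tls", "nats+tls", "ws", "wss"}


def check_nats_block(block: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    servers = block.get("servers") or []
    if not servers:
        problems.append("servers: at least one URL is required")
    for url in servers:
        if urlparse(url).scheme not in NATS_SCHEMES:
            problems.append(f"servers: unsupported scheme in {url!r}")

    node = block.get("node") or {}
    if not node.get("id"):
        problems.append("node.id: required")
    # Documented defaults: 10 s heartbeat, 30 s peer expiry.
    heartbeat = node.get("heartbeat_interval_sec", 10)
    expiry = node.get("peer_expiry_sec", 30)
    if expiry <= heartbeat:
        problems.append(
            "node.peer_expiry_sec must exceed node.heartbeat_interval_sec"
        )
    return problems


good = {"servers": ["tls://nats.svc:4222"], "node": {"id": "pod-0"}}
bad = {
    "servers": ["http://nats.svc:4222"],
    "node": {"id": "pod-0", "heartbeat_interval_sec": 30, "peer_expiry_sec": 30},
}
good_problems = check_nats_block(good)
bad_problems = check_nats_block(bad)
```

Returning a list rather than raising on the first error lets a CI job report every problem in one pass.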
NATS auth is tagged on method:
cluster:
  kind: nats
  servers: ["tls://nats.svc:4222"]
  node:
    id: "${env.HOSTNAME}"
  auth:
    method: credentials_file
    path: "/etc/mcpg/nats.creds"
  tls:
    ca_cert: "/etc/mcpg/certs/nats-ca.pem"
    verify_peer: true
Consul (dev.mcpg.cluster.consul)
Consul uses address (the HTTP API base URL), not url.
cluster:
  kind: consul
  address: "http://consul.svc:8500" # required: http:// or https://
  service_name: "mcpg" # required
  kv_prefix: "mcpg/prod/" # distinct per deployment
| Key | Default | Notes |
|---|---|---|
| address | (required) | Consul HTTP API base URL. http:// or https://. |
| service_name | (required) | Name this gateway registers / looks up peers under. |
| kv_prefix | mcpg/ | KV path prefix for plugin state. Set distinct per deployment on a shared Consul. |
| token | none | Consul ACL token; sent as X-Consul-Token. |
| datacenter | agent-local | Cross-DC ?dc= parameter. |
| node_id | service_name-$HOSTNAME | Stable id for self-publish dedup. |
| subscribe_wait_ms | 30000 | Long-poll wait for the subscribe path. Range 1..=600000 (Consul's 10-minute max). |
| lease_renew_before_expiry_percent | 30 | Renewal point as a fraction of TTL. Clamped 1..=99. |
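To see how address, kv_prefix, token, and datacenter fit together, here is a sketch of what a single KV read against Consul's HTTP API could look like. It only builds the URL and headers and sends nothing. The helper name and the `leases/leader` key are hypothetical, and the plugin's actual key layout is not documented here.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode


def consul_kv_request(
    address: str,
    kv_prefix: str,
    key: str,
    token: Optional[str] = None,
    datacenter: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Build the URL and headers for reading one key under kv_prefix."""
    # /v1/kv/<key> is Consul's KV HTTP endpoint.
    url = f"{address.rstrip('/')}/v1/kv/{kv_prefix}{key}"
    if datacenter:
        # Cross-datacenter reads add the dc query parameter.
        url += "?" + urlencode({"dc": datacenter})
    # The ACL token travels in the X-Consul-Token header.
    headers = {"X-Consul-Token": token} if token else {}
    return url, headers


url, headers = consul_kv_request(
    "http://consul.svc:8500",
    "mcpg/prod/",
    "leases/leader",          # hypothetical key name
    token="example-token",    # placeholder, not a real token
    datacenter="dc2",
)
```

Because the prefix is simply concatenated into the path, two deployments with distinct kv_prefix values never touch each other's keys on a shared Consul.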
etcd (dev.mcpg.cluster.etcd)
etcd uses a list of endpoints, and key_prefix must end in /.
cluster:
  kind: etcd
  endpoints:
    - "http://etcd-0:2379"
    - "http://etcd-1:2379"
  key_prefix: "/mcpg/prod/" # MUST end with '/'
| Key | Default | Notes |
|---|---|---|
| endpoints | (required) | One or more etcd endpoints; the client load-balances + retries across them. |
| key_prefix | /mcpg/ | Key prefix; must end with /. Set distinct per deployment on a shared etcd. |
| event_ttl_ms | 60 | TTL (seconds) for transient pub/sub events. |
| lease_renew_before_expiry_percent | 30 | Renewal point as a fraction of TTL. Clamped 1..=99. |
| node_id | synthesised | Stable node id. Defaults to a {key_prefix}node-{hostname} value. |
| auth | none | { username, password } for Auth-enabled clusters. |
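One plausible reason for the trailing-slash rule is that prefix matching is purely textual, so a prefix without the slash also matches sibling namespaces. The sketch below illustrates that with plain string matching and shows the documented node_id default. The helper names and the example keys are hypothetical.

```python
from typing import List


def check_key_prefix(prefix: str) -> str:
    """Enforce the documented rule that key_prefix ends with '/'."""
    if not prefix.endswith("/"):
        raise ValueError("key_prefix must end with '/'")
    return prefix


def default_node_id(key_prefix: str, hostname: str) -> str:
    """Documented default shape: {key_prefix}node-{hostname}."""
    return f"{check_key_prefix(key_prefix)}node-{hostname}"


def keys_under(prefix: str, keys: List[str]) -> List[str]:
    """Plain textual prefix match, as a prefix range scan would behave."""
    return [k for k in keys if k.startswith(prefix)]


# Two deployments sharing one etcd (hypothetical keys).
keys = ["/mcpg/prod/leases/leader", "/mcpg/production/leases/leader"]

without_slash = keys_under("/mcpg/prod", keys)   # matches both deployments
with_slash = keys_under("/mcpg/prod/", keys)     # matches only "prod"
node_id = default_node_id("/mcpg/prod/", "gateway-0")
```

Without the slash, a scan for the `prod` deployment also picks up keys belonging to `production`; with it, the match is confined to one deployment.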
Per-capability overrides
By default every capability inherits the cluster coordinator's primitives. That is what you want for HA, and it is what makes single-node work with zero config. You can override an individual capability's store or bus, but the only alternatives to cluster are local kinds (memory, and file for stores):
mcp:
  configurations:
    sessions:
      store:
        kind: cluster # default: delegate to the coordinator (same as omitting)
    delivery:
      bus:
        kind: memory # pin to in-process; single-replica only
| Override kind | store: | bus: | Meaning |
|---|---|---|---|
| cluster | yes | yes | Delegate to the coordinator. Identical to omitting the override. |
| memory | yes | yes | In-process. Single-replica only. |
| file | yes | no | File-backed (dir: required). Single-node persistent. |
redis and nats are not valid per-capability override kinds. To put capability state on
Redis or NATS, set cluster.kind: redis | nats and let the capability inherit (via kind: cluster or by omitting the override). Pinning delivery.bus or cancellation.bus to memory
in a multi-replica deployment silently breaks cross-replica delivery — leave them inherited.
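The override matrix is small enough to encode directly. The following sketch rejects the combinations the table rules out; it is illustrative only, and `check_override` is a hypothetical helper rather than part of MCPG.

```python
from typing import Any, Dict

# Allowed override kinds per slot, taken from the table above.
ALLOWED_KINDS = {
    "store": {"cluster", "memory", "file"},
    "bus": {"cluster", "memory"},
}


def check_override(slot: str, override: Dict[str, Any]) -> None:
    """Raise ValueError for an override the table above does not allow."""
    if slot not in ALLOWED_KINDS:
        raise ValueError(f"unknown slot {slot!r}; expected 'store' or 'bus'")
    # Omitting kind is the same as kind: cluster.
    kind = override.get("kind", "cluster")
    if kind not in ALLOWED_KINDS[slot]:
        raise ValueError(
            f"{slot}.kind={kind!r} is not a valid override; "
            "set cluster.kind instead and let the capability inherit"
        )
    if kind == "file" and not override.get("dir"):
        raise ValueError("store.kind=file requires dir")


check_override("store", {"kind": "cluster"})
check_override("store", {"kind": "file", "dir": "/var/lib/mcpg/sessions"})
check_override("bus", {"kind": "memory"})

rejected = []
for slot, override in [
    ("store", {"kind": "redis"}),   # backend kinds are not overrides
    ("bus", {"kind": "file"}),      # file is store-only
    ("store", {"kind": "file"}),    # dir missing
]:
    try:
        check_override(slot, override)
    except ValueError:
        rejected.append((slot, override["kind"]))
```

A check like this cannot tell whether a memory bus is safe, since that depends on the replica count, so the multi-replica warning above still needs a human eye.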
Validating
mcpg-config-check ./config.yaml
# ✓ ./config.yaml: valid (1 bindings, audit on, observability on, cluster: nats)
Remember the two-layer model: mcpg-config-check confirms the AppConfig shape; the
coordinator plugin validates the kind-specific keys (URL scheme, required fields, ranges) at
boot. Use the key names above to avoid a boot-time rejection that the config check can't see.
What's next
- Deployment topologies: single-node vs HA shapes
- Kubernetes operator: Helm auto-wires cluster.kind
- Configuration reference: every config key