Back to Blogs

AI and MCP

What Is an MCP Gateway? Architecture, Use Cases, and How to Choose One (2026)

written by
Dhayalan Subramanian
Associate Director - Product Growth at DigitalAPI

Updated on: 

June 3, 2026

Blog Hero Image
TL;DR

An MCP gateway is a reverse proxy that sits between your AI agents and one or more MCP servers
, giving you a single secure entry point with OAuth 2.1, tool-level RBAC, session affinity, SSE streaming, and centralized audit logs. You need one the moment you cross two MCP servers, multiple agents, or any regulated workload, because the Model Context Protocol spec itself enforces no governance, no rate limits, and no audit trail.

Open-source options like Microsoft MCP Gateway, Docker, and Lasso handle routing. Commercial gateways like DigitalAPI, Kong, Tyk, and Gravitee add policy, OpenAPI-to-MCP conversion, multi-tenant isolation, and production-grade latency. Skip the gateway for single-tenant prototypes. Everything past that, the gateway becomes the control plane that lets MCP exist in production.

What is an MCP gateway?

An MCP gateway is a reverse proxy that sits between AI clients (agents, IDEs, copilots) and one or more Model Context Protocol servers. It centralizes authentication, routing, session management, rate limiting, observability, and policy enforcement. AI agents access tools through a single governed endpoint instead of connecting to each MCP server directly.

The MCP gateway pattern emerged after Anthropic published the Model Context Protocol in November 2024. The protocol gave large language models a standard way to call external tools, query resources, and execute prompts. It did not give enterprises a way to govern that access.

That gap became obvious within months. By mid-2025, Microsoft, Docker, AWS, Kong, Tyk, Gravitee, TrueFoundry, and a wave of startups all shipped MCP gateway products. Each one solves the same shape of problem: an MCP server alone runs anywhere, but production deployments need centralized control.

The simplest mental model: an MCP gateway is to MCP servers what an API gateway is to REST microservices. It sits in front of the backend, enforces policy, and gives clients one URL.

Definition block

An MCP gateway is a reverse proxy and management layer that mediates traffic between AI clients and Model Context Protocol servers. It enforces authentication, authorization, routing, rate limiting, session affinity, transport translation, and audit logging.

For a deeper view of the protocol itself, see What is Model Context Protocol? and MCP architecture explained.

Do you need an MCP gateway?

You need an MCP gateway the moment you run more than one MCP server, support more than one AI client, or operate in a regulated environment. Single-server prototypes can skip it. Everything past prototype, the gateway becomes the control plane that makes MCP usable in production.

The MCP spec, by design, is minimal. It describes how a client and server exchange JSON-RPC messages over stdio or HTTP/SSE. It stops there.

Here are the five things the MCP spec does not do

  • Authorization across tools: MCP defines authentication at the server level. It has no concept of per-tool or per-resource policy. An agent that can connect can call every tool.
  • Rate limiting: The spec has no quotas, no back-pressure, no token-budget enforcement. A misbehaving agent can run a tool in a loop until the backend falls over.
  • Centralized audit logs: Each MCP server logs locally. There is no protocol guarantee that audit events flow to one SIEM.
  • Multi-server aggregation: An AI client sees one server at a time. There is no canonical way to combine tools/list across five servers into one response.
  • Multi-tenant isolation: The spec has no notion of tenants. SaaS providers wanting per-customer scoping must build it themselve
    An MCP gateway exists to fill each of those five gaps. If your deployment hits any one of them, the gateway stops being optional.

You need one if:

  • You operate two or more MCP servers: the aggregation gap shows up immediately.
  • You connect two or more AI clients (Claude Desktop, Cursor, your own agent): policy has to live in one place.
  • You have any compliance requirement (SOC 2, HIPAA, PCI DSS, PSD2): audit trails are not optional.
  • You need to attribute tool calls to a user, not just an agent: on-behalf-of identity needs a gateway.
  • You expose MCP tools to external partners or customers: isolation requires a control plane.

You can skip it if:

  • You run a single MCP server for a single agent in development: the gateway adds more setup than value.
  • You operate entirely on a local CLI with no external traffic: stdio direct is fine.
  • Your latency budget is under 50 ms end-to-end: every hop counts.

What does an MCP gateway actually do?

An MCP gateway provides eight production-grade capabilities: protocol translation, routing, authentication and authorization, rate limiting, caching, aggregation, observability, and lifecycle governance. Together, these turn MCP from a developer protocol into operational infrastructure.

1. Protocol translation and transport adaptation

MCP clients speak stdio, HTTP, or Streamable HTTP. MCP servers may speak any of those. The gateway bridges between transports so a client using stdio can still reach a remote HTTP-SSE server. Server-Sent Events streaming forces the gateway to handle keepalive, reconnection, and back-pressure on behalf of the AI client.

2. Routing and server discovery

When an AI client calls tools/list, the gateway queries every registered MCP server, deduplicates tool names, and returns one composite list. When the client then calls tools/call, the gateway routes the invocation to the right server based on tool name, agent identity, or tenant. Microsoft's open-source MCP Gateway calls this "session-aware stateful routing".

3. Authentication and authorization

The gateway verifies client identity using OAuth 2.1, JWT, mTLS, or API keys. Once identified, the gateway applies tool-level access policy. A finance agent can call query_invoice but not delete_record. A support agent can call lookup_customer but not export_customers.

4. Rate limiting and quota enforcement

The gateway enforces requests per second per agent, per tenant, per tool, or per LLM token budget. Without this, one runaway agent saturates the backend. With this, you can sell tiered MCP access to customers.

5. Caching

tools/list and resources/list are idempotent reads. Caching them at the gateway eliminates 60 to 80 percent of upstream calls in typical agent loops. The gateway must invalidate aggressively when tools change, but the read amplification is real.

6. Aggregation and composition

The gateway composes responses across servers. An agent calling tools/list sees one merged list across analytics, storage, and CRM servers, not three separate calls.

7. Observability

Every tool call generates a span. The gateway emits structured logs, distributed traces (OpenTelemetry), and metrics. You see which agent, on whose behalf, called which tool, with which arguments, with what result, with how much latency.

8. Lifecycle governance

The gateway controls which MCP servers are registered, which versions are active, which are deprecated, and which require security review before exposure. Without this, MCP server sprawl becomes the AI equivalent of shadow IT.

How does an MCP gateway compare to other gateway types?

An MCP gateway differs from a traditional API gateway in three concrete ways: it parses JSON-RPC instead of REST, it maintains session state across multiple requests, and it streams SSE responses. An AI gateway sits at the LLM layer, where an MCP gateway sits at the tool layer.

The capability matrix:

Capability API Gateway AI Gateway MCP Gateway
Routes REST requests Yes Yes Partial
Routes JSON-RPC traffic No No Yes
Routes between agents and LLMs No Yes No
Routes between agents and tools No Partial Yes
Server-Sent Events (SSE) streaming Partial Yes Yes
Session affinity (sticky sessions) Optional No Required
Tool-level RBAC No Partial Yes
tools/list aggregation No No Yes
Prompt injection defense No Yes Yes
OAuth 2.1 support for AI clients Partial Yes Yes

The three gateway types stack, they do not replace each other. A mature AI infrastructure runs an API gateway at the edge, an AI gateway in front of model providers, and an MCP gateway in front of tool servers. Cloud providers are starting to converge these. AWS added MCP proxy support to Amazon API Gateway in December 2025. Azure API Management exposes REST APIs as MCP servers through its built-in AI gateway.

For a side-by-side teardown, see MCP vs API gateway.

What about MCP gateway versus MCP server?

An MCP server exposes tools. An MCP gateway sits in front of one or more MCP servers and manages access to them. You can run an MCP server without a gateway. You cannot run a production MCP fleet without one.

How do you choose between open source and commercial MCP gateways?

Open-source MCP gateways (Microsoft MCP Gateway, Docker MCP Gateway, Lasso, Obot, IBM ContextForge) cover routing, basic auth, and session handling, while commercial gateways (DigitalAPI, Kong, Tyk, Gravitee, TrueFoundry, Portkey, Runlayer, and OpenAPI-first API Management platforms) add multi-tenant policy, OpenAPI conversion, audit-grade logging, and production support.

Choose open source if your security team can own operations. Choose commercial if you need vendor accountability for compliance.

The decision matrix:

Pick Open Source If... Pick Commercial If...
You have a platform engineering team that owns infrastructure and operations You need vendor SLAs for uptime, reliability, and security incident response
Your compliance posture is managed internally and approved through internal audit processes You need SOC 2, HIPAA, PCI DSS, or similar compliance attestations on the gateway platform itself
MCP is one project among many and not a dedicated business-critical initiative MCP traffic is mission-critical and directly impacts revenue or customer operations
You want maximum control over policy logic, deployment patterns, and customization You want OpenAPI-to-MCP conversion, governance, and operational tooling available out of the box
Cost is the primary constraint and engineering time is available Time-to-production is the primary constraint and faster delivery is required

A quick survey of the 2026 field:

Open source:

  • Microsoft MCP Gateway: Kubernetes-native, session-aware routing, lifecycle management
  • Docker MCP Gateway: Container-isolated MCP servers, simple compose deployment
  • Lasso: Security-focused, prompt-injection detection, tool description scanning
  • Obot: Lightweight, multi-tenant
  • IBM ContextForge: Protocol flexibility, enterprise patterns
  • MCPX (Lunar.dev): Open-source observability and routing

Commercial:

  • DigitalAPI: Hybrid or fully self-hosted, OpenAPI-to-MCP conversion in one click, audit-grade governance
  • Kong MCP Gateway: Kubernetes-native, extends the broader Kong API platform
  • Tyk: Self-host friendly, strong regulated-industry positioning
  • Gravitee: API platform integration, caching focus
  • TrueFoundry: AI infrastructure platform
  • Portkey: AI gateway with MCP support
  • Runlayer: Cloud-hosted with built-in threat detection
  • OpenAPI-first platforms (like the one detailed at the end of this guide): one-click conversion of existing REST API specs into MCP servers, with the gateway managing policy, tenancy, and audit

The choice is not binary. Plenty of teams start with Docker or Microsoft MCP Gateway in development, then move commercial when production traffic justifies the spend.

The choice is not binary. Plenty of teams start with Docker or Microsoft MCP Gateway in development, then move commercial when production traffic justifies the spend.

For a curated comparison, see Best MCP gateways for 2026 (publishing soon).

What are the most common MCP gateway use cases?

Four MCP gateway use cases dominate in 2026: internal AI assistants accessing employee tools (HR, finance, ITSM); multi-tenant SaaS exposing MCP to customers; regulated industries (banking, healthcare, insurance) demanding audit trails; and developer platforms unifying scattered MCP servers into one catalog.

1. Internal AI assistants accessing employee tools

A 5,000-person enterprise deploys an internal AI assistant. The assistant connects to the HR system to look up vacation balances, the finance database to check expense reports, and the ITSM platform to file tickets. The MCP gateway authenticates the employee through SSO (Okta, Microsoft Entra ID), routes the agent's calls to the right backend, and logs every action with the employee's identity attached.

Without the gateway, you would embed identity logic into every MCP server. With the gateway, policy sits in one place.

2. Multi-tenant SaaS exposing MCP to customers

A B2B SaaS company offers AI agent features to customers. Each customer's agent needs access to that customer's data, isolated from every other customer. The gateway uses tenant-scoped OAuth tokens, dynamically routes to per-tenant tool servers, enforces usage quotas based on plan tier, and provides per-tenant analytics for billing.

This pattern only works with a gateway. Direct MCP server access cannot enforce tenant isolation at scale.

3. Regulated industries

A bank deploys MCP for an internal risk-analytics agent. The compliance team requires every tool call to be logged, every credential rotated quarterly, and every external partner integration scoped to specific endpoints. The gateway provides audit-grade logs the compliance team needs, plus tool-level scopes that satisfy partner contracts.

For a deeper view on regulated-industry patterns, see Developer portals for regulated industries.

4. Developer platforms unifying MCP servers

An engineering platform team wants one MCP catalog across the company. Marketing teams own their MCP servers, sales teams own theirs, and the platform team curates a single front door. The gateway provides server discovery, registry, version control, and a deprecation workflow.

This is the MCP equivalent of an internal developer portal.

How fast is an MCP gateway, and what does it cost to run?

A well-engineered MCP gateway adds 5 to 30 milliseconds of latency at p99 under production loads of several thousand requests per second. Cost depends on deployment model: open-source self-hosted is compute-only; commercial gateways price per request, per tool, or per agent.

When an agent calls a tool through an MCP gateway, the total latency stack breaks down as follows:

Layer Typical Contribution
LLM inference (the agent's decision-making step) 200 to 2,000 ms
MCP gateway routing and policy enforcement 5 to 30 ms
Network latency between the gateway and MCP server 1 to 20 ms
Tool execution (backend processing and business logic) 5 to 500 ms
Response streaming back to the AI agent 1 to 20 ms

The gateway is rarely the bottleneck. Inference dominates by an order of magnitude. The reason latency still matters: a gateway that adds 100 ms instead of 30 ms accumulates across multi-step agent loops. A 10-step task with a 100 ms gateway costs 1 second of overhead. The same task with a 30 ms gateway costs 300 ms.

Cost models in 2026:

  • Open source self-hosted: Compute, storage, and ops time. A medium-scale deployment runs $500 to $3,000 per month in cloud infrastructure.
  • Commercial per-request: Typical range $0.0005 to $0.005 per gateway call. Adds up at LLM-agent scale.
  • Commercial per-server or per-agent: Subscription pricing tied to MCP servers connected or agents authenticated. Predictable, scales with deployment size, not call volume.
  • Cloud-platform-native (AWS, Azure): Usage-based, bundled with broader API Management platform cost.

For teams running production MCP traffic, the gateway should be benchmarked against your actual workload, not vendor marketing numbers. Demand p99 and p999 figures at your expected request volume, not p50 at vendor-test volume.

What are MCP gateway security best practices?

Seven MCP gateway security practices cover most of the attack surface: enforce OAuth 2.1 with short-lived tokens; apply tool-level RBAC; use on-behalf-of authentication so the gateway never holds super-user permissions; require human-in-the-loop for high-risk actions; scan tool descriptions for prompt injection; log every call with agent identity attached; and treat new MCP servers as untrusted code.

1. OAuth 2.1, not static API keys

The MCP spec settled on OAuth 2.1 for authentication. Static API keys are hard to rotate and impossible to scope to individual users. Adopt OAuth 2.1 with PKCE.

2. Tool-level RBAC, not server-level

Restricting an agent to all tools on server X is too coarse. Restrict per tool, per parameter. The OWASP LLM Top 10 (2025) lists excessive agency as a top risk for AI deployments.

3. On-behalf-of authentication

The gateway should pass through the original user's identity to the MCP server, not impersonate with a super-user token. The backend's existing authorization logic still applies, and audit trails attribute actions to the real user.

4. Human-in-the-loop for high-risk actions

Sending external email, transferring funds, modifying production data: these should require explicit approval. The gateway is the right place to insert the checkpoint, because no agent should be able to bypass it.

5. Prompt injection scanning

Tool descriptions and resource content are LLM-readable strings. Adversaries can hide instructions in them. The gateway should scan for injection patterns at registration time and again at runtime.

6. Audit logs with identity propagation

Every tool call should log the agent ID, the user ID, the tool name, the arguments, the result, and the latency. Send these to your SIEM. Without identity propagation, you have logs that prove something happened but not who caused it.

7. Treat new MCP servers as third-party code

Verify signatures, pin versions, sandbox new servers, and gate them through security review before exposing them to agents. MCP server sprawl creates the same supply-chain risk as npm package sprawl.

For a deeper security view, see How MCP secures your data and How to secure MCP endpoints.

Introducing DigitalAPI's MCP Gateway

DigitalAPI's MCP Gateway is a governed runtime and control layer between MCP clients, agents, and enterprise tools. It lets internal teams register private MCP servers and approved OpenAPI/REST APIs, expose them through governed MCP endpoints, enforce user, agent, and tool-level policy, broker credentials, route to private backends, support stateful sessions, and audit every call. One governed root of trust for approved MCP traffic, deployable in hybrid or fully self-hosted mode.

Deployment: hybrid or fully self-hosted

Most enterprise buyers will not route private tool traffic or control-plane metadata through a vendor SaaS during early MCP adoption. DigitalAPI's Enterprise MCP Gateway ships in two modes that reflect that reality:

  • Hybrid: The managed control plane handles admin console, registry metadata, policy management, and configuration distribution. The customer-side data plane handles MCP request routing, policy enforcement at runtime, credential resolution, private connector handling, session routing, and audit event generation. Private payloads stay inside the customer environment.
  • Fully self-hosted: The entire product (control plane, data plane, registry, policy store, credential broker, audit) runs inside customer-controlled Kubernetes. No required outbound vendor dependency for normal runtime operation. Customer-managed PostgreSQL and Valkey, customer-managed secret manager, customer-owned audit export.

Both modes use Helm-first packaging, signed container images, SBOMs, OpenTelemetry, and customer-managed PostgreSQL plus Valkey as the V1 default stack. No required ClickHouse, NATS, or Kafka dependency in the default install.

What the gateway runtime does

  • Registers private MCP servers and approved OpenAPI sources: Every backend is owner-tagged, environment-scoped, risk-labeled, and policy-attached before agents see it. Default deny on everything not approved.
  • Converts selected OpenAPI 3.x operations into MCP tools: Import a spec, select the operations to expose, map auth and risk tier, get a governed MCP tool without writing a custom server. The API-to-MCP adapter enforces host allowlists, schema validation, timeouts, and size limits.
  • Filters tool discovery by policy: Unauthorized tools are hidden during tools/list, not just denied at execution time. Agents see only what they are allowed to call.
  • Enforces deterministic policy with Cedar: Principal, action, resource, and context map cleanly to users, agents, sessions, tools, environments, credential modes, and delegation context. Policies version, simulate, and revoke immediately.
  • Brokers credentials without exposing them: Service-account, user-delegated OAuth, agent-scoped, and workload-identity-mapped modes. Secrets never appear in logs, traces, audit events, or metrics. Vault-compatible credential broker with adapters for cloud and customer secret stores.
  • Routes privately: Direct private endpoint routing or outbound-only connector mode with mTLS. Private MCP servers and internal APIs stay inside customer networks.
  • Maintains stateful MCP sessions: Session affinity, externalized session metadata, graceful drain, basic reconnect where supported, and immediate revocation propagation that terminates affected sessions on policy or credential change.
  • Audits every call: Tool call, policy decision, credential mode, session lifecycle event, admin change, and revocation. Audit exports to customer SIEM with no payload leakage and full actor chain.

Who DigitalAPI's Enterprise MCP Gateway is built for

  • Security admins: Default-deny access, server, tool, and agent-level RBAC plus ABAC, clear deny reasons, policy simulation, emergency disable across servers, tools, agents, credentials, connectors, and client surfaces.
  • Platform engineers: Helm-based install, signed images, OpenTelemetry traces and metrics, connector status, documented backup, restore, upgrade, and graceful-drain runbooks. Hybrid and fully self-hosted deployment paths.
  • MCP server owners: Simple registration, manifest validation, OpenAPI import workflow, policy attachment, credential binding, and usage visibility, all behind an approval workflow.
  • Agent developers: Stable gateway endpoint, policy-filtered discovery, agent registration, client configuration examples, and machine-readable deny reasons that explain exactly what was blocked and why.

See how DigitalAPI's  MCP Gateway runs in your infrastructure (hybrid or fully self-hosted) → Book a demo

FAQs

1. What is an MCP gateway in simple terms?

An MCP gateway is a reverse proxy between AI agents and Model Context Protocol servers. It centralizes security, routing, observability, and policy. Think API gateway for AI agent tool calls.

2. Do I need an MCP gateway if I run only one MCP server?

No. A single-server, single-client setup runs fine without one. Add a gateway when you cross two servers, two clients, or any compliance requirement.

3. What is the difference between an MCP gateway and an MCP server?

An MCP server exposes a set of tools, resources, and prompts to AI clients. An MCP gateway sits in front of one or more MCP servers, mediating access, applying policy, and logging activity. The gateway also aggregates tools/list across servers so the agent sees one merged catalog.

4. Is an MCP gateway the same as an API gateway?

No. An MCP gateway parses JSON-RPC, maintains session state, streams SSE responses, and aggregates tools across servers. Most API gateways do none of these. Cloud providers (AWS, Azure) have added MCP proxy support on top of their API gateways to close the gap.

5. How much latency does an MCP gateway add?

A production-grade MCP gateway adds 5 to 30 ms at p99 under typical loads. The latency tax matters because it accumulates across multi-step agent loops. LLM inference still dominates the end-to-end budget.

6. What is the best open-source MCP gateway in 2026?

It depends on the workload. Microsoft MCP Gateway is strong for Kubernetes deployments. Docker MCP Gateway is the cleanest for container isolation. Lasso leads on security-focused tool inspection. Obot, IBM ContextForge, and MCPX cover broader patterns.

7. Do MCP gateways support OAuth 2.1?

Yes. OAuth 2.1 is the standard authentication for MCP clients in 2026. Most commercial gateways (Kong, Tyk, Gravitee) and several open-source options (Microsoft MCP Gateway, Lasso) implement it natively.

8. Can I expose my existing REST APIs through an MCP gateway?

Yes. The standard pattern: convert each REST API into an MCP server using your OpenAPI spec, then route through the gateway. Some commercial gateways automate that conversion in one click, so existing APIs become MCP-accessible without rewriting code.

Liked the post? Share on:

Book a Demo

Talk to Us

You’ve spent years battling your API problem. Give us 60 minutes to show you the solution.

Get API lifecycle management, API monetisation, and API marketplace infrastructure on one powerful AI-driven platform.