The Horizon gateway is the request layer in front of deployed MCP servers. It is the part of Horizon that receives client traffic, identifies the deployment, applies the server’s access settings, preserves MCP session routing, and sends the request to the currently promoted deployment artifact.
The gateway runs before your server code. Server compute starts after the gateway has accepted and routed a request. For the Python execution contract, see Compute model.

Request flow

Every request follows the same broad path.
1. A client calls the deployment URL
The client sends an MCP request to a Horizon server URL. Hosted servers usually expose an endpoint ending in /mcp.

2. Horizon identifies the deployment
Horizon uses the hostname and path to identify which deployment should receive the request. Deployment slugs are stable for a branch or live server, so repeated requests route to the same deployment until promotion or rollback changes what is live.

3. Horizon checks access
Horizon applies the server’s configured authentication and authorization mode before your server receives the request. If access is denied, the request stops at the gateway.

4. Horizon preserves MCP session routing
For initialized MCP sessions, Horizon keeps enough routing state to send follow-up requests in the same session to the right backend path.

5. Horizon routes to the live artifact
Horizon forwards the request to the deployment artifact currently selected for that server. Your Python MCP or FastMCP server handles the tool, resource, or prompt call.
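
The ordering of these steps can be sketched as a toy model in Python. This is not Horizon's implementation: the `Gateway` class, its fields, the token-based access check, and the returned strings are all hypothetical, chosen only to show that deployment lookup, access checks, and session routing happen before anything is forwarded to server code.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Toy model of the request flow; every name here is hypothetical."""

    live_artifacts: dict[str, str]  # deployment slug -> currently live artifact
    allowed_tokens: set[str]        # stand-in for the configured access mode
    sessions: dict[str, str] = field(default_factory=dict)  # session id -> slug

    def open_session(self, slug: str) -> str:
        # Issue a session ID and remember which deployment it belongs to.
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = slug
        return session_id

    def handle(self, slug: str, token: str | None, session_id: str | None = None) -> str:
        # Step 2: identify the deployment (the URL is reduced to a slug here).
        if slug not in self.live_artifacts:
            return "rejected: unknown deployment"
        # Step 3: check access before any server code runs.
        if token not in self.allowed_tokens:
            return "rejected: access denied"
        # Step 4: a follow-up request must belong to a session for this slug.
        if session_id is not None and self.sessions.get(session_id) != slug:
            return "rejected: unknown session"
        # Step 5: forward to whichever artifact is currently live.
        return f"forwarded to {self.live_artifacts[slug]}"
```

In this sketch a denied request returns before the forwarding step, which mirrors the statement that a denied request stops at the gateway.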

What the gateway controls

Endpoint routing

Maps incoming deployment URLs to the currently live deployment artifact.

Access enforcement

Applies server access settings before requests reach server code.

MCP sessions

Issues and preserves MCP session IDs for follow-up requests.

Request observability

Records request-level metadata used for logs, analytics, and debugging.

Limits

Enforces request, session, rate, and payload limits at the gateway.

Deployment routing

Deployments are addressed by deployment slugs. The gateway uses that slug to decide which deployed artifact should receive a request. For slug stability semantics and how promotion changes what is live, see Deployments. Changing source code does not change gateway routing by itself. A new build must succeed, and the resulting artifact must be promoted, before the gateway routes traffic to the new version.
If a client is still seeing old behavior, check the deployment page to confirm which artifact is currently live before debugging server code.
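
The rule that only a successful, promoted build changes routing can be illustrated with a small in-memory model. The `DeploymentRouter` class and its method names are hypothetical and are not part of any Horizon API; the sketch only assumes what the text above states, namely that a build must succeed and then be promoted before it receives traffic.

```python
class DeploymentRouter:
    """Hypothetical model: a slug routes to whichever artifact was last promoted."""

    def __init__(self):
        self._builds = {}  # (slug, artifact_id) -> True if the build succeeded
        self._live = {}    # slug -> artifact_id currently receiving traffic

    def record_build(self, slug, artifact_id, succeeded):
        # A finished build is only recorded; it does not affect routing.
        self._builds[(slug, artifact_id)] = succeeded

    def promote(self, slug, artifact_id):
        # Only an artifact from a successful build can become live.
        if not self._builds.get((slug, artifact_id), False):
            raise ValueError(f"{artifact_id} has no successful build for {slug}")
        self._live[slug] = artifact_id

    def route(self, slug):
        # Returns None when nothing has been promoted for this slug.
        return self._live.get(slug)
```

With this model, recording a new successful build leaves `route()` unchanged until `promote()` is called, which is why checking the live artifact comes before debugging server code.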

Access checks

The gateway enforces the server’s configured access mode before forwarding the request, so authentication and authorization decisions are made before your server code runs. Your server can still implement its own application-level checks, but gateway access settings determine whether the request reaches your server at all. When the gateway authenticates a caller, it attaches the verified caller identity to the request so your server code and downstream access-aware features can make their own authorization decisions against the same identity. For access mode options and how caller identity is verified, see Authentication and Authorization.
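
An application-level check inside server code might look like the following sketch. How Horizon exposes the verified caller identity to your server is not described on this page, so the `identity` dictionary, its `roles` key, and the `TOOL_ROLES` policy table are all assumptions made for illustration.

```python
# Hypothetical per-tool policy: which roles may call which tool.
TOOL_ROLES = {
    "read_record": {"admin", "viewer"},
    "delete_record": {"admin"},
}


def is_allowed(identity: dict, tool_name: str) -> bool:
    """Return True if the caller holds at least one role the tool requires.

    `identity` stands in for the verified caller identity; its shape here
    (a dict with a "roles" list) is an assumption, not a documented format.
    """
    required = TOOL_ROLES.get(tool_name)
    if required is None:
        # Deny tools that have no explicit policy.
        return False
    return bool(required & set(identity.get("roles", [])))
```

This check runs only for requests the gateway has already admitted; it narrows access further and does not replace the gateway's access mode.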

MCP sessions

When a client initializes an MCP session, Horizon returns an mcp-session-id. Clients should send that header on all follow-up requests in the same session. Session routing state is retained for hours. This is routing state for MCP traffic, not durable application storage. Store durable application state outside the local filesystem. For session and other product limits, see Limits.
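
On the client side, the session header can be tracked with a few lines of state. The sketch below is transport-agnostic and makes no network calls; the `McpSessionState` class is a hypothetical helper, and only the `mcp-session-id` header name comes from the text above.

```python
class McpSessionState:
    """Remembers the session ID from a response and replays it on later requests."""

    HEADER = "mcp-session-id"

    def __init__(self):
        self.session_id = None

    def record_response(self, response_headers: dict) -> None:
        # HTTP header names are case-insensitive, so compare in lowercase.
        for name, value in response_headers.items():
            if name.lower() == self.HEADER:
                self.session_id = value

    def request_headers(self) -> dict:
        # Headers for the next POST in this session.
        headers = {"Content-Type": "application/json"}
        if self.session_id is not None:
            headers[self.HEADER] = self.session_id
        return headers
```

A client would call `record_response()` with the headers of the initialization response, then merge `request_headers()` into every later request in that session.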

Protocol behavior

Horizon hosted servers use Streamable HTTP as the MCP transport. Clients send MCP requests as HTTP POST requests to the deployment URL.
| Operation | Supported | Gateway behavior |
| --- | --- | --- |
| POST /mcp | Yes | Forwarded to your server. |
| Long-lived GET /mcp SSE stream | No | Returns method-not-allowed without invoking your server. |
| DELETE /mcp session teardown | No | Returns method-not-allowed without invoking your server. |
Horizon does not support long-lived GET /mcp server-sent event streams for hosted servers. Use the standard POST-based request flow for MCP calls, and return a job ID quickly for long-running work instead of holding the request open.

Gateway vs compute

The gateway is responsible for getting the request to the right deployed server. Server compute is responsible for running your Python MCP or FastMCP code.
| Concern | Gateway | Server compute |
| --- | --- | --- |
| Identify deployment | Yes | No |
| Enforce server access mode | Yes | No |
| Preserve MCP session routing | Issues and routes by mcp-session-id | Receives routed requests and returns MCP headers |
| Run Python server code | No | Yes |
| Capture stdout and stderr | No | Yes |
| Enforce request timeout | Returns an error to the client if the deadline is exceeded | Terminates the Python handler if the deadline is exceeded |

Common gateway outcomes

Authentication failed: The caller did not provide credentials accepted by the server’s access mode, or the credentials were expired or malformed.
Authorization denied: The caller is authenticated, but Horizon access settings do not allow that caller to use the server or requested capability.
Deployment not found: The URL does not map to a known deployment, the deployment was removed, or a custom domain is not pointing at the expected server.
Unsupported transport operation: The client attempted a transport operation Horizon does not support for hosted servers, such as a long-lived GET /mcp stream or DELETE /mcp.
Error from your server: Gateway routing succeeded. Check your server logs and the Compute model for errors from your Python server.

Compute model

Learn how Horizon runs deployed Python servers.

Build system

Learn how deployable server artifacts are created.

Authentication

Learn how callers prove identity.

Authorization

Learn how Horizon decides what authenticated callers can do.

Limits

Review request, session, and compute limits.