When you deploy a hosted server, Horizon packages your Python MCP or FastMCP server and makes it available at a deployment URL. The compute model describes what happens when MCP clients call that URL: how Horizon starts your server, handles requests, records logs and metrics, and preserves enough session routing state for clients to continue a conversation.
The compute model is separate from build behavior. For entrypoint detection, Python version selection, dependency installation, and server inspection, see Build system. For the request layer in front of deployed servers, see Gateway.

Execution contract

Horizon runs hosted servers as Python HTTP MCP servers. Your server entrypoint is started with FastMCP, bound to an internal HTTP port, and exposed through the Horizon MCP endpoint for that deployment.

Python server

Horizon runs the Python entrypoint produced by the build system.

HTTP MCP endpoint

MCP clients connect to the deployment URL, usually ending in /mcp.

Request window

Hosted server requests have a -second timeout.

Memory allocation

Hosted servers currently run with MB of memory.

Ephemeral filesystem

Local files are not durable across compute instances or redeploys.

Where compute fits

The compute layer is one part of the hosted server lifecycle. This separation matters when you are debugging: locate which layer a failure came from, then look at the failures typical of that layer.

| Layer | Responsibility | Typical failures |
| --- | --- | --- |
| Build | Creates deployable artifacts. | Dependency installation, entrypoint loading, or server inspection. |
| Gateway | Receives MCP traffic. | Routing, access, or unsupported transport methods. |
| Server compute | Runs the Python code that handles supported MCP calls. | Python startup, request handlers, memory use, timeouts, or application logs. |

From artifact to serving

1. A build artifact is selected

A successful build produces a deployable artifact. Horizon serves traffic from the artifact selected by the deployment, such as the current live deployment or a preview deployment.

2. Server compute starts on demand

Horizon starts server compute when traffic needs to be served. The first request to a fresh instance can take longer because Python, dependencies, and your server module need to load.

3. Your server listens for HTTP MCP traffic

Horizon starts your FastMCP server over HTTP. MCP clients continue to call the Horizon deployment URL; the gateway forwards supported requests to the running server.

4. Your code handles the MCP call

Tool, resource, and prompt handlers run in your Python process. Code that executes at import time may run during startup, before an individual MCP request handler is called.

5. Horizon records compute data

Horizon captures request outcomes, server logs, session activity, duration, memory usage, and cold-start metrics so you can debug deployed behavior.

What runs in compute

The Python version, installed dependencies, source files, and entrypoint come from the build artifact. Horizon does not reinstall dependencies when a request arrives. To change the Python version, dependencies, entrypoint, or packaged source, update the repository or server settings and create a new build. The deployed server changes only after a successful build artifact is promoted.

Environment variables

Deployment environment variables are available to the running server. Treat environment variables as configuration for startup and request handling. Changing environment variables requires a new deployment before the running server sees the new values. Avoid printing secrets to stdout or stderr; those streams become server logs.
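
A minimal sketch of reading configuration from environment variables while keeping secret values out of stdout and stderr. The variable names `EXAMPLE_API_BASE` and `EXAMPLE_API_TOKEN`, the default URL, and both helper functions are hypothetical and chosen only for illustration; they are not Horizon settings.

```python
import os


def load_settings(env=None):
    """Read configuration once, from the process environment by default."""
    env = os.environ if env is None else env
    return {
        # Hypothetical variable names, used only for this example.
        "api_base": env.get("EXAMPLE_API_BASE", "https://api.example.invalid"),
        "api_token": env.get("EXAMPLE_API_TOKEN", ""),
    }


def describe_settings(settings):
    """Build a log-safe summary that never contains the token itself."""
    token_state = "<set>" if settings["api_token"] else "<unset>"
    return f"api_base={settings['api_base']} api_token={token_state}"


if __name__ == "__main__":
    # Printing the summary is safe; printing `settings` directly would not be.
    print(describe_settings(load_settings()))
```

Logging only whether a secret is set, rather than any part of its value, keeps the startup log useful for debugging without exposing credentials.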

Request window and long-running work

Hosted server requests time out after seconds. This limit applies to the request from the MCP client through Horizon to your deployed server. Use request handlers for work that can complete within that window. For longer or retryable work, return a job ID quickly and continue the work asynchronously instead of keeping the MCP request open.
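
The job ID pattern described above might look like the following sketch. It is an illustration under stated assumptions, not a Horizon API: `start_job`, `get_job`, and the in-memory `_jobs` dictionary are hypothetical, and the background thread stands in for an external queue or worker. Because compute instances are ephemeral and requests have no instance affinity, a real deployment would keep both the job state and the worker outside the server process.

```python
import threading
import uuid

# Stand-in for an external job store; in-process memory is NOT durable.
_jobs = {}
_lock = threading.Lock()


def _run_job(job_id, n):
    """Simulated long-running work: sum of squares below n."""
    result = sum(i * i for i in range(n))
    with _lock:
        _jobs[job_id] = {"status": "done", "result": result}


def start_job(n):
    """Handler body for a 'start' tool: record the job and return quickly."""
    job_id = uuid.uuid4().hex
    with _lock:
        _jobs[job_id] = {"status": "running", "result": None}
    # Hypothetical stand-in for enqueueing work on an external worker.
    threading.Thread(target=_run_job, args=(job_id, n), daemon=True).start()
    return {"job_id": job_id}


def get_job(job_id):
    """Handler body for a 'status' tool: a short lookup, no waiting."""
    with _lock:
        return dict(_jobs.get(job_id, {"status": "unknown", "result": None}))
```

The client calls the start tool once, then checks status in separate short requests, so no single MCP request has to stay open for the duration of the work.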

Choose the right execution path

| Use this | For | Avoid using it for |
| --- | --- | --- |
| MCP request handler | Interactive tool, resource, or prompt work that can finish within seconds | Long-running jobs, polling loops, or work that needs retries after the client disconnects |
| Asynchronous job with a returned handle | Longer, retryable, or asynchronous work that should continue outside the MCP request | Work that must return an immediate MCP response body |

Horizon does not support long-lived GET /mcp streams for hosted servers. Use the standard HTTP request flow for MCP calls, and move long-running work out of the request path.

Sessions and local state

Horizon manages MCP session routing for deployed servers. When a client initializes an MCP session, Horizon returns an MCP session ID and uses it to route later requests in that session. Session routing state is retained for hours. Clients should send the mcp-session-id header on follow-up requests when their MCP client supports it.
Session routing helps Horizon keep related MCP requests together. It is not a durable application database or a guarantee that every request in a session reaches the same Python process. Store durable application state outside the local filesystem.
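
On the client side, carrying the session ID forward can be sketched as below. Most MCP client libraries handle this automatically, so this is only an illustration for hand-written HTTP clients. The helper names are hypothetical; the `mcp-session-id` header name comes from the text above, and the `Accept` value assumes the MCP Streamable HTTP transport convention of listing both JSON and event-stream responses.

```python
def extract_session_id(response_headers):
    """Find the session ID in response headers (names are case-insensitive)."""
    for name, value in response_headers.items():
        if name.lower() == "mcp-session-id":
            return value
    return None


def follow_up_headers(session_id=None):
    """Build headers for a later request in the same MCP session."""
    headers = {
        "Content-Type": "application/json",
        # Assumption: Accept value per the MCP Streamable HTTP transport.
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers["mcp-session-id"] = session_id
    return headers
```

A client would call `extract_session_id` on the response to its initialize request and pass the result to `follow_up_headers` for each later call.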

Instances and reuse

Horizon starts server compute as needed to serve traffic. A request may be handled by a fresh Python process or by one that is already running.

| Behavior | What it means |
| --- | --- |
| Fresh instance | Python, dependencies, and your server module need to load before the request handler runs. |
| Warm instance | The Python process is already loaded, so module-level clients or caches may still exist. |
| No affinity guarantee | A later request may be handled by a different instance, even for the same deployed server. |
| Ephemeral state | Memory and local files can disappear at any time and should not store durable application state. |

You can cache reusable clients in memory when that is safe, but in-memory state is opportunistic. Treat it as a performance optimization, not a source of truth.

Filesystem

The deployed artifact contains your source code and installed dependencies. You can use temporary local files while handling a request, but the local filesystem is ephemeral and should not be used for durable application state. Do not rely on files written during one request being available to another request. Do not rely on files written before a redeploy being available after the redeploy.
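
A sketch of request-scoped scratch space using the standard library. The `count_lines` function is a hypothetical handler body; the point is that the temporary directory is created and removed within a single call, so nothing depends on files surviving between requests.

```python
import pathlib
import tempfile


def count_lines(text):
    """Write input to a scratch file, process it, and clean up on exit."""
    with tempfile.TemporaryDirectory() as scratch:
        path = pathlib.Path(scratch) / "input.txt"
        path.write_text(text, encoding="utf-8")
        with path.open(encoding="utf-8") as handle:
            return sum(1 for _ in handle)
    # The directory and its contents are deleted when the block exits.
```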

Cold starts and startup work

Horizon can start server compute on demand. The first request to a new instance may take longer than later requests because Python, dependencies, and your server module need to load. Keep import-time work small:
  • avoid network calls during module import
  • lazy-load large clients or models when possible
  • move expensive setup into request handlers or asynchronous work
  • cache reusable clients in module-level variables when that is safe
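
The last three points can be combined into a lazily created, module-level client. This is a minimal sketch: `ExpensiveClient` is a hypothetical stand-in for an SDK client or model that is slow to construct, and `shout` stands in for a tool handler.

```python
import functools


class ExpensiveClient:
    """Hypothetical client whose construction is slow or does network I/O."""

    instances_created = 0

    def __init__(self):
        ExpensiveClient.instances_created += 1

    def query(self, text):
        return text.upper()


@functools.lru_cache(maxsize=1)
def get_client():
    """Create the client on first use, then reuse it on a warm instance."""
    return ExpensiveClient()


def shout(text):
    """Handler body: nothing expensive runs at import time."""
    return get_client().query(text)
```

Importing this module constructs nothing. The first call to `shout` pays the setup cost, and later calls in the same process reuse the cached client.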

Defaults and limits

| Setting | Default or limit |
| --- | --- |
| Server language | Python MCP or FastMCP |
| MCP transport | HTTP |
| Request timeout | seconds |
| Memory allocation | MB |
| MCP session routing TTL | hours |
| Local filesystem | Ephemeral |
| Server logs | Stdout and stderr |

For a broader list of product limits, see Limits.

Common compute failures

  • Request timeouts: The handler took longer than seconds, or startup work consumed too much of the request window. Move long work out of the request path, reduce import-time work, or split the operation into smaller calls.
  • Slow first requests: A fresh compute instance may need to start Python, import dependencies, and load your server module. Keep module imports lightweight and defer expensive setup until it is needed.
  • Missing local files: Local files are temporary. Use local files only as scratch space, and store durable state outside the local filesystem.
  • High memory use: Reduce per-request memory use, avoid loading large objects at import time, stream or page large results when possible, and check memory metrics in the server overview.
  • Changes not taking effect: Build inputs and deployment environment variables take effect after a new successful build and deployment. Check which deployment artifact is live.
  • Secrets in logs: Stdout and stderr are captured as server logs. Avoid printing secrets, tokens, credentials, or full environment dumps.
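
Paging large results, mentioned above as a way to reduce per-request memory use, can be sketched with a simple cursor. The `page_results` function, its parameter names, and the default limit are hypothetical; a real tool would usually page at the data source rather than slice a list already held in memory.

```python
def page_results(items, cursor=0, limit=50):
    """Return one page of items plus the cursor for the next page, if any."""
    if cursor < 0 or limit <= 0:
        raise ValueError("cursor must be >= 0 and limit must be > 0")
    end = cursor + limit
    return {
        "items": items[cursor:end],
        # None signals that the caller has reached the last page.
        "next_cursor": end if end < len(items) else None,
    }
```

The client passes `next_cursor` back on its next call until it receives `None`, so each response stays small regardless of the total result size.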

Gateway

Understand the request layer in front of deployed servers.

Build system

Learn how Horizon creates the artifact that compute runs.

Deployments

Learn how successful build artifacts are promoted and rolled back.

Environment variables

Configure values available to builds and deployed servers.

Limits

Review compute and product limits.