The compute model is separate from build behavior. For entrypoint detection,
Python version selection, dependency installation, and server inspection, see
Build system. For the request layer in front of
deployed servers, see Gateway.
Execution contract
Horizon runs hosted servers as Python HTTP MCP servers. Your server entrypoint is started with FastMCP, bound to an internal HTTP port, and exposed through the Horizon MCP endpoint for that deployment.
Python server
Horizon runs the Python entrypoint produced by the build system.
HTTP MCP endpoint
MCP clients connect to the deployment URL, usually ending in
/mcp.
Request window
Hosted server requests have a -second timeout.
Memory allocation
Hosted servers currently run with MB of memory.
Ephemeral filesystem
Local files are not durable across compute instances or redeploys.
Where compute fits
The compute layer is one part of the hosted server lifecycle. This separation matters when you are debugging: locate which layer a failure came from, then look at the failures typical of that layer.

| Layer | Responsibility | Typical failures |
|---|---|---|
| Build | Creates deployable artifacts. | Dependency installation, entrypoint loading, or server inspection. |
| Gateway | Receives MCP traffic. | Routing, access, or unsupported transport methods. |
| Server compute | Runs the Python code that handles supported MCP calls. | Python startup, request handlers, memory use, timeouts, or application logs. |
From artifact to serving
A build artifact is selected
A successful build produces a deployable artifact. Horizon serves traffic
from the artifact selected by the deployment, such as the current live
deployment or a preview deployment.
Server compute starts on demand
Horizon starts server compute when traffic needs to be served. The first
request to a fresh instance can take longer because Python, dependencies,
and your server module need to load.
Your server listens for HTTP MCP traffic
Horizon starts your FastMCP server over HTTP. MCP clients continue to call
the Horizon deployment URL; the gateway forwards supported requests to the
running server.
Your code handles the MCP call
Tool, resource, and prompt handlers run in your Python process. Code that
executes at import time may run during startup, before an individual MCP
request handler is called.
What runs in compute
The Python version, installed dependencies, source files, and entrypoint come from the build artifact. Horizon does not reinstall dependencies when a request arrives. To change the Python version, dependencies, entrypoint, or packaged source, update the repository or server settings and create a new build. The deployed server changes only after a successful build artifact is promoted.
Environment variables
Deployment environment variables are available to the running server. Treat environment variables as configuration for startup and request handling. Changing environment variables requires a new deployment before the running server sees the new values. Avoid printing secrets to stdout or stderr; those streams become server logs.
Request window and long-running work
Hosted server requests time out after seconds. This limit applies to the request from the MCP client through Horizon to your deployed server. Use request handlers for work that can complete within that window. For longer or retryable work, return a job ID quickly and continue the work asynchronously instead of keeping the MCP request open.
Choose the right execution path
| Use this | For | Avoid using it for |
|---|---|---|
| MCP request handler | Interactive tool, resource, or prompt work that can finish within seconds | Long-running jobs, polling loops, or work that needs retries after the client disconnects |
| Asynchronous job with a returned handle | Longer, retryable, or asynchronous work that should continue outside the MCP request | Work that must return an immediate MCP response body |
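The job-handle path in the table can be sketched in plain Python. This is a minimal illustration, not Horizon's or FastMCP's API: `start_job`, `get_job`, and the in-memory `_jobs` table are hypothetical names, and the background thread is an assumption made to keep the example self-contained. Because instance memory is ephemeral and a later request may reach a different instance, a real server would more likely hand the work to an external queue or worker and keep job state in an external store.

```python
import threading
import uuid

# Hypothetical in-memory job table, for illustration only. Instance memory is
# ephemeral, so a real server would keep job state in an external store.
_jobs: dict[str, dict] = {}
_jobs_lock = threading.Lock()


def _run_job(job_id: str, n: int) -> None:
    """Stand-in for long work; runs outside the request that started it."""
    result = sum(i * i for i in range(n))
    with _jobs_lock:
        _jobs[job_id] = {"status": "done", "result": result}


def start_job(n: int) -> str:
    """Handler body: record the job, start the work, return a handle at once."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[job_id] = {"status": "running", "result": None}
    threading.Thread(target=_run_job, args=(job_id, n), daemon=True).start()
    return job_id


def get_job(job_id: str) -> dict:
    """Handler body: report the current state of a job without blocking."""
    with _jobs_lock:
        return dict(_jobs.get(job_id, {"status": "unknown", "result": None}))
```

In a deployed server, the bodies of `start_job` and `get_job` would sit inside two separate tool handlers, so the client polls with short calls instead of holding one request open.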
Sessions and local state
Horizon manages MCP session routing for deployed servers. When a client initializes an MCP session, Horizon returns an MCP session ID and uses it to route later requests in that session. Session routing state is retained for hours. Clients should send the mcp-session-id header on follow-up requests when their MCP client supports it.
Session routing helps Horizon keep related MCP requests together. It is not a
durable application database or a guarantee that every request in a session
reaches the same Python process. Store durable application state outside the
local filesystem.
Instances and reuse
Horizon starts server compute as needed to serve traffic. A request may be handled by a fresh Python process or by one that is already running.

| Behavior | What it means |
|---|---|
| Fresh instance | Python, dependencies, and your server module need to load before the request handler runs. |
| Warm instance | The Python process is already loaded, so module-level clients or caches may still exist. |
| No affinity guarantee | A later request may be handled by a different instance, even for the same deployed server. |
| Ephemeral state | Memory and local files can disappear at any time and should not store durable application state. |
Filesystem
The deployed artifact contains your source code and installed dependencies. You can use temporary local files while handling a request, but the local filesystem is ephemeral and should not be used for durable application state. Do not rely on files written during one request being available to another request. Do not rely on files written before a redeploy being available after the redeploy.
Cold starts and startup work
Horizon can start server compute on demand. The first request to a new instance may take longer than later requests because Python, dependencies, and your server module need to load. Keep import-time work small:
- avoid network calls during module import
- lazy-load large clients or models when possible
- move expensive setup into request handlers or asynchronous work
- cache reusable clients in module-level variables when that is safe
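One way to combine lazy loading with module-level caching is a cached factory function. The sketch below uses only the standard library; `SlowClient`, `get_client`, and `handle_request` are hypothetical names standing in for whatever expensive client your server needs.

```python
import functools
import time


class SlowClient:
    """Hypothetical stand-in for a client that is costly to construct."""

    def __init__(self) -> None:
        self.created_at = time.monotonic()

    def query(self, text: str) -> str:
        return text.upper()


@functools.lru_cache(maxsize=1)
def get_client() -> SlowClient:
    # Runs on the first call only; later calls in the same warm process
    # receive the cached instance. A fresh instance builds it again.
    return SlowClient()


def handle_request(text: str) -> str:
    # The client is created here, on first use, not at module import.
    return get_client().query(text)
```

Importing this module does no expensive work, so a cold start pays only for the import itself. The cache lives in process memory, so it helps on warm instances and should never be treated as durable state.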
Defaults and limits
| Setting | Default or limit |
|---|---|
| Server language | Python MCP or FastMCP |
| MCP transport | HTTP |
| Request timeout | seconds |
| Memory allocation | MB |
| MCP session routing TTL | hours |
| Local filesystem | Ephemeral |
| Server logs | Stdout and stderr |
Common compute failures
Requests time out
The handler took longer than seconds, or startup work consumed too much
of the request window. Move long work out of the request path, reduce
import-time work, or split the operation into smaller calls.
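A handler can also enforce its own time budget so that it returns a clear message before the request window closes. The sketch below assumes the work is awaitable; `HANDLER_BUDGET_SECONDS`, `slow_work`, and `handle_with_budget` are hypothetical names, and the budget value is a placeholder that you would set below the platform's request timeout.

```python
import asyncio

# Hypothetical budget in seconds; choose a value below the platform's
# request timeout. The small value here only keeps the example fast.
HANDLER_BUDGET_SECONDS = 0.05


async def slow_work(delay: float) -> str:
    """Stand-in for awaitable work such as a network call."""
    await asyncio.sleep(delay)
    return "finished"


async def handle_with_budget(delay: float) -> str:
    """Return a clear message instead of running into the platform limit."""
    try:
        return await asyncio.wait_for(slow_work(delay), timeout=HANDLER_BUDGET_SECONDS)
    except asyncio.TimeoutError:
        return "timed out: try a smaller input or start a background job"
```

`asyncio.wait_for` cancels the awaited task at the deadline, but it cannot interrupt blocking synchronous code, so CPU-bound work needs a different approach.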
The first request is slow
A fresh compute instance may need to start Python, import dependencies, and
load your server module. Keep module imports lightweight and defer expensive
setup until it is needed.
A file disappeared
Local files are temporary. Use local files only as scratch space, and store
durable state outside the local filesystem.
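A simple way to keep scratch files scoped to a single request is a temporary directory that is removed when the handler finishes. The sketch below uses Python's standard `tempfile` module; `word_count_via_scratch` is a hypothetical handler body.

```python
import tempfile
from pathlib import Path


def word_count_via_scratch(text: str) -> int:
    """Use a per-request temporary directory that is removed on exit."""
    with tempfile.TemporaryDirectory() as scratch:
        path = Path(scratch) / "input.txt"
        path.write_text(text, encoding="utf-8")
        # Everything needed from the file is read before the directory goes away.
        return len(path.read_text(encoding="utf-8").split())
```

Because nothing outlives the `with` block, the handler cannot come to depend on a file that a later request or a redeploy will not find.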
The server runs out of memory
Reduce per-request memory use, avoid loading large objects at import time,
stream or page large results when possible, and check memory metrics in the
server overview.
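Paging large results might look like the following cursor-based sketch. The `page` function, its `cursor` and `limit` parameters, and the response shape are illustrative assumptions, not a Horizon or MCP requirement.

```python
from typing import Optional, Sequence


def page(items: Sequence[str], cursor: Optional[str] = None, limit: int = 100) -> dict:
    """Return one slice of items plus a cursor for the next slice, if any."""
    start = int(cursor) if cursor else 0
    end = start + limit
    next_cursor = str(end) if end < len(items) else None
    return {"items": list(items[start:end]), "next_cursor": next_cursor}
```

A caller keeps passing `next_cursor` back until it is `None`, so each response holds at most `limit` items. The full saving comes only if the underlying data source is also read one slice at a time.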
A new setting did not take effect
Build inputs and deployment environment variables take effect after a new
successful build and deployment. Check which deployment artifact is live.
Logs contain sensitive values
Stdout and stderr are captured as server logs. Avoid printing secrets,
tokens, credentials, or full environment dumps.
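If you need to log configuration for debugging, one option is to mask values before they reach stdout. The sketch below is a heuristic under stated assumptions: `redacted_env` and the `SENSITIVE_MARKERS` list are hypothetical, and matching on variable names will miss secrets stored under innocuous names.

```python
import os

# Hypothetical name fragments that mark a variable as sensitive.
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def redacted_env(env=None) -> dict:
    """Copy the environment with sensitive-looking values masked."""
    source = os.environ if env is None else env
    return {
        name: "***" if any(m in name.upper() for m in SENSITIVE_MARKERS) else value
        for name, value in source.items()
    }
```

Logging only an explicit allowlist of variable names is the more conservative choice, since nothing outside the list can leak.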
Related docs
Gateway
Understand the request layer in front of deployed servers.
Build system
Learn how Horizon creates the artifact that compute runs.
Deployments
Learn how successful build artifacts are promoted and rolled back.
Environment variables
Configure values available to builds and deployed servers.
Limits
Review compute and product limits.