MCP Server Production Troubleshooting: Common Errors and Fixes
Fix common MCP server production errors -- connection refused, timeouts, JSON-RPC parse failures, memory leaks, and transport disconnections.
The most common MCP server production errors are connection refused (wrong path or process not running), JSON-RPC parse errors (stdout corruption from print statements), tool execution timeouts (missing async handling), and transport disconnections (unhandled exceptions crashing the server). Each has a specific cause and a specific fix, and this guide covers them all.
When an MCP server works in development but fails in production -- or works with the Inspector but breaks in Claude Desktop or Cursor -- the problem almost always falls into one of the categories below. This guide is organized as a reference you can jump into by error type.
For the foundational testing and debugging workflow, see the parent guide: Testing and Debugging MCP Servers.
Error Reference Table
Use this table to jump to the relevant section:
| Error | Likely Cause | Section |
|---|---|---|
| Connection refused | Server not running, wrong path, port conflict | Connection Errors |
| ENOENT / spawn error | Binary not found, wrong command | Connection Errors |
| JSON parse error | stdout pollution, malformed response | JSON-RPC Parse Errors |
| Method not found | Client/server version mismatch | JSON-RPC Parse Errors |
| Tool execution timeout | Long-running operation without progress | Timeout Errors |
| Request timeout | Server unresponsive, deadlock | Timeout Errors |
| Out of memory | Unbounded data loading, memory leak | Memory Issues |
| Transport disconnected | Server crash, unhandled exception | Transport Disconnections |
| Permission denied | File/network access restrictions | Permission Errors |
| Rate limit exceeded | Too many requests to wrapped API | External Service Errors |
Connection Errors
Connection Refused
Symptoms: The MCP client reports "connection refused" or "failed to connect" when trying to start a server.
For stdio servers, "connection refused" usually means the server process failed to start:
- Wrong executable path. The command specified in the client config does not exist or is not executable.
{
  "mcpServers": {
    "my-server": {
      "command": "python3",
      "args": ["/path/to/server.py"]
    }
  }
}
Verify the path exists and is executable:
# Check if the file exists
ls -la /path/to/server.py
# Check if python3 is available at the expected location
which python3
# Try running the server directly
python3 /path/to/server.py
- Virtual environment not activated. If your server depends on packages installed in a virtual environment, the MCP client needs to use that environment's Python:
{
  "mcpServers": {
    "my-server": {
      "command": "/home/user/projects/my-server/.venv/bin/python",
      "args": ["/home/user/projects/my-server/server.py"]
    }
  }
}
- Missing dependencies. The server starts but crashes immediately because an import fails. Check stderr output or run the server manually to see the traceback.
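The checks in the list above can be automated. The following is a minimal sketch, assuming the client config uses the mcpServers layout shown earlier; check_server_config is a hypothetical helper name, and treating every absolute-path argument as a file that must exist is a heuristic, not a rule of any MCP client.

```python
import json
import os
import shutil


def check_server_config(config_text: str) -> list[str]:
    """Return human-readable problems found in an MCP client config."""
    problems = []
    servers = json.loads(config_text).get("mcpServers", {})
    for name, spec in servers.items():
        command = spec.get("command", "")
        # shutil.which resolves bare names through PATH and also
        # validates explicit paths (file exists and is executable).
        if shutil.which(command) is None:
            problems.append(f"{name}: command not found or not executable: {command!r}")
        for arg in spec.get("args", []):
            # Heuristic: absolute-path arguments should point at real files.
            if os.path.isabs(arg) and not os.path.exists(arg):
                problems.append(f"{name}: argument path does not exist: {arg}")
    return problems
```

Run it with the same PATH the client sees where possible, because desktop clients are often launched with a narrower PATH than an interactive shell.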
For HTTP/SSE servers, connection refused means the server is not listening on the expected host and port:
# Check if anything is listening on the expected port
lsof -i :3000
# Check if the server is bound to localhost vs 0.0.0.0
netstat -tlnp | grep 3000
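A common mistake is binding to 127.0.0.1 when the client connects from a different host or from another container. In that case bind to 0.0.0.0, and restrict access with a firewall or authentication, because the server then accepts connections on every network interface.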
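When lsof or netstat is not available (for example inside a slim container), the same check can be approximated from Python. This is a small sketch using only the standard library; port_is_open is a hypothetical helper name.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Running it from the client's environment rather than the server's shows whether the bind address is the problem: a port that is open from the server host but closed from the client host usually points at a 127.0.0.1 bind or a firewall rule.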
ENOENT / Spawn Errors
Symptoms: Error messages containing "ENOENT", "spawn failed", or "command not found."
This means the MCP client cannot find the executable. Common causes:
| Cause | Fix |
|---|---|
| npx not in PATH | Use full path: /usr/local/bin/npx |
| Node not installed | Install Node.js and verify with node --version |
| Python not in PATH | Use full path: /usr/bin/python3 |
| uvx not in PATH | Use the full path reported by which uvx (commonly /home/user/.local/bin/uvx or /home/user/.cargo/bin/uvx) |
| Wrong working directory | Set cwd in server config |
JSON-RPC Parse Errors
stdout Pollution
Symptoms: "JSON parse error", "unexpected token", or the server connects but every tool call fails.
This is the single most common MCP server bug. Any output written to stdout that is not a valid JSON-RPC message will break the protocol. The stdio transport uses stdout exclusively for JSON-RPC communication.
Common sources of stdout pollution:
# WRONG: print() writes to stdout, corrupting the JSON-RPC stream
print("Server starting...")
print(f"Processing request for tool: {name}")
# CORRECT: Use stderr for all logging
import sys
print("Server starting...", file=sys.stderr)
# BEST: Use the logging module configured for stderr
import logging
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logger = logging.getLogger("my-mcp-server")
logger.info("Server starting...")
In TypeScript:
// WRONG: console.log writes to stdout
console.log("Processing request");
// CORRECT: console.error writes to stderr
console.error("Processing request");
How to detect stdout pollution:
Run the server manually and check if anything appears on stdout before a client connects:
# Run the server and separate stdout from stderr
python3 server.py 2>/tmp/stderr.log
# If you see ANY output in the terminal (stdout), that's the problem
# All output should go to /tmp/stderr.log (stderr)
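If you capture stdout to a file instead of watching the terminal, a short script can flag the offending lines. The sketch below assumes the stdio transport's newline-delimited framing with one JSON object per line; find_stdout_pollution is a hypothetical helper name.

```python
import json


def find_stdout_pollution(stdout_text: str) -> list[str]:
    """Return the lines of captured stdout that are not JSON-RPC 2.0 messages."""
    polluted = []
    for line in stdout_text.splitlines():
        if not line.strip():
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            polluted.append(line)
            continue
        # Every JSON-RPC 2.0 message is an object carrying "jsonrpc": "2.0".
        if not isinstance(message, dict) or message.get("jsonrpc") != "2.0":
            polluted.append(line)
    return polluted
```

Any line it returns came from something other than the protocol layer, typically a print call or a library that logs to stdout by default.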
Malformed JSON-RPC Responses
Symptoms: Parse errors from the client side, even though the server is not printing to stdout.
Check that your tool handlers return properly structured results. A common mistake is returning raw Python objects instead of serializable data:
from datetime import datetime

# WRONG: datetime is not JSON-serializable
async def handle_tool(name, arguments):
    return {"timestamp": datetime.now()}

# CORRECT: Convert to string
async def handle_tool(name, arguments):
    return {"timestamp": datetime.now().isoformat()}
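Converting each field by hand is easy to forget. One alternative is a single fallback serializer passed to json.dumps through its default parameter. This is a sketch, not part of any MCP SDK; to_jsonable and dump_result are hypothetical names, and the set of handled types is only an example.

```python
import json
from datetime import date, datetime
from decimal import Decimal
from pathlib import Path


def to_jsonable(value):
    """Fallback for json.dumps: convert common non-serializable types."""
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, (Decimal, Path)):
        return str(value)
    # Fail loudly for anything unexpected instead of emitting bad JSON.
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def dump_result(payload) -> str:
    """Serialize a tool result, converting known types along the way."""
    return json.dumps(payload, default=to_jsonable)
```

Raising TypeError for unknown types keeps the failure inside your handler, where it can be caught and logged, rather than surfacing as a parse error on the client.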
Method Not Found (-32601)
Symptoms: The client sends a request and gets back a "method not found" error.
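This usually means the client and server disagree on the protocol version, or the client is calling a method for a capability the server does not implement (for example, resources/list on a server that only exposes tools). Check that both sides use compatible MCP SDK versions, since method names changed between early drafts and the stable specification.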
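It helps to know what this error looks like on the wire so you can search for it in logs. The sketch below is a toy dispatcher, not how any MCP SDK is implemented; dispatch and the handlers mapping are hypothetical, while the -32601 code comes from JSON-RPC 2.0.

```python
METHOD_NOT_FOUND = -32601  # error code defined by JSON-RPC 2.0


def dispatch(request: dict, handlers: dict) -> dict:
    """Route a JSON-RPC request to a handler, or build a -32601 error."""
    method = request["method"]
    handler = handlers.get(method)
    if handler is None:
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {"code": METHOD_NOT_FOUND, "message": f"Method not found: {method}"},
        }
    return {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        "result": handler(request.get("params", {})),
    }
```

The method name in the error message is the first thing to compare against the method names your server version registers.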
Timeout Errors
Tool Execution Timeouts
Symptoms: A tool call starts but never returns, or the client reports a timeout after 30-60 seconds.
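MCP clients impose timeouts on tool calls. If your tool performs a long-running operation, it must either complete within the timeout or send progress notifications, which some clients use to reset the timeout clock. That behavior is client-specific and clients may still enforce a maximum total time, so keep the overall runtime bounded as well.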
Common timeout causes and fixes:
| Cause | Fix |
|---|---|
| HTTP request to slow API | Set request timeout, add retries with backoff |
| Large file processing | Stream results, send progress notifications |
| Database query on large dataset | Add query timeout, limit result set |
| Infinite loop in tool logic | Add iteration limits, deadline checks |
| Blocking I/O in async server | Use async I/O libraries (aiohttp, aiofiles) |
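Several fixes in the table come down to enforcing a deadline inside the handler, so the tool returns an error instead of hanging until the client gives up. A minimal sketch with asyncio.wait_for follows; run_with_deadline and the dict-shaped result are hypothetical stand-ins for your SDK's result type.

```python
import asyncio


async def run_with_deadline(coro, seconds: float) -> dict:
    """Await coro, returning an error result instead of running past the deadline."""
    try:
        value = await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the inner coroutine before raising.
        return {"is_error": True, "content": f"Timed out after {seconds} seconds"}
    return {"is_error": False, "content": value}
```

Pick a deadline shorter than the client's timeout so the model receives your error message rather than a generic client-side timeout.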
Sending progress notifications to keep the connection alive during long operations:
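async def handle_long_tool(arguments, progress_token=None):
    # server.send_progress and ToolResult are placeholders; substitute the
    # progress-notification call and result type your MCP SDK provides.
    total_steps = 100
    for i in range(total_steps):
        # Do a chunk of work
        await process_chunk(i)
        # Report progress to prevent timeout (a token of 0 is still valid)
        if progress_token is not None:
            await server.send_progress(
                progress_token=progress_token,
                progress=i + 1,
                total=total_steps
            )
    return ToolResult(content="Processing complete")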
Request Timeouts (Server Unresponsive)
Symptoms: The client cannot get any response from the server, not just from tool calls.
This indicates the server's event loop is blocked:
import aiohttp
import requests

# WRONG: Blocking call in an async server stops all processing
async def handle_tool(name, arguments):
    result = requests.get("https://slow-api.com/data")  # Blocks the event loop
    return result.text

# CORRECT: Use async HTTP client
async def handle_tool(name, arguments):
    async with aiohttp.ClientSession() as session:
        async with session.get("https://slow-api.com/data") as resp:
            return await resp.text()
If you must call synchronous code from an async server, run it in a thread pool:
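import asyncio

async def handle_tool(name, arguments):
    # Inside a coroutine, get_running_loop() is the supported way to get the loop
    loop = asyncio.get_running_loop()
    # None selects the loop's default thread pool executor
    result = await loop.run_in_executor(None, sync_heavy_function, arguments)
    return result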
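To confirm that offloading actually keeps the server responsive, you can run a heartbeat task next to the blocking call and count how often it fires. The sketch below uses asyncio.to_thread (Python 3.9+), a shorthand for running a function in the default thread pool; blocking_work and demo are hypothetical names, and time.sleep stands in for any synchronous call.

```python
import asyncio
import time


def blocking_work(seconds: float) -> str:
    time.sleep(seconds)  # stands in for any synchronous, blocking call
    return "finished"


async def demo() -> tuple[str, int]:
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    # The event loop keeps scheduling heartbeat() while the thread sleeps.
    result = await asyncio.to_thread(blocking_work, 0.2)
    beat.cancel()
    return result, ticks
```

If blocking_work were called directly inside demo, the tick count would stay at zero for the whole call, which is the same starvation a client experiences as a request timeout.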
Memory Issues
Memory Leaks
Symptoms: Server memory usage grows over time until the process is killed by the OS or crashes with an out-of-memory error.
Common memory leak patterns in MCP servers:
Unbounded caches:
# WRONG: Cache grows forever
cache = {}

async def handle_tool(name, arguments):
    key = str(arguments)
    if key not in cache:
        cache[key] = await expensive_computation(arguments)
    return cache[key]

# CORRECT: Use an LRU cache with a size limit
# (lru_cache needs hashable arguments and a synchronous function; applied to
# an async function it would cache the coroutine object, not its result)
from functools import lru_cache

@lru_cache(maxsize=1000)
def cached_computation(arg_tuple):
    return expensive_computation_sync(arg_tuple)
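Because functools.lru_cache does not work with coroutines, an async handler needs a different bounded cache. The following is one possible sketch built on collections.OrderedDict; AsyncLRUCache is a hypothetical class, and it does not deduplicate concurrent requests for the same key, so two simultaneous misses will both compute the value.

```python
import asyncio
from collections import OrderedDict


class AsyncLRUCache:
    """Size-bounded cache for results of async computations."""

    def __init__(self, maxsize: int = 1000):
        self.maxsize = maxsize
        self._data = OrderedDict()

    async def get_or_compute(self, key, compute):
        """Return the cached value for key, awaiting compute() on a miss."""
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]
        value = await compute()
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict least recently used
        return value

    def __len__(self):
        return len(self._data)
```

For dict arguments, a key such as json.dumps(arguments, sort_keys=True) is more stable than str(arguments), since it does not depend on key insertion order.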
Accumulating session state:
# WRONG: Session data never cleaned up
sessions = {}

async def handle_connection(session_id):
    sessions[session_id] = {"history": [], "data": {}}
    # ... session is never removed from the dict

# CORRECT: Clean up on disconnect
async def handle_disconnect(session_id):
    sessions.pop(session_id, None)
Large file reads held in memory:
import os

# WRONG: Reads entire file into memory
async def handle_read_file(path):
    with open(path, "r") as f:
        content = f.read()  # Could be gigabytes
    return content

# CORRECT: Limit read size
async def handle_read_file(path):
    max_size = 10 * 1024 * 1024  # 10 MB
    file_size = os.path.getsize(path)
    if file_size > max_size:
        return f"File too large: {file_size} bytes (limit: {max_size})"
    with open(path, "r") as f:
        return f.read()
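Checking the size first leaves a small window in which the file can grow between the check and the read. A variant that caps the read itself avoids that; this is a sketch, with read_text_capped as a hypothetical helper and UTF-8 decoding as an assumption about the file contents.

```python
def read_text_capped(path: str, max_bytes: int = 10 * 1024 * 1024) -> str:
    """Read at most max_bytes from path, raising if the file is larger."""
    with open(path, "rb") as f:
        # Reading one extra byte reveals an oversized file without loading it all.
        data = f.read(max_bytes + 1)
    if len(data) > max_bytes:
        raise ValueError(f"File exceeds limit of {max_bytes} bytes: {path}")
    return data.decode("utf-8", errors="replace")
```

Memory use is bounded by max_bytes plus one regardless of how large the file is or becomes.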
Monitoring Memory
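Add memory tracking to your server's health endpoint or logging. The resource module is available on Unix only, and ru_maxrss reports the peak resident set size rather than current usage:
import logging
import resource
import sys

logger = logging.getLogger("mcp.health")

def log_memory_usage():
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_maxrss is in kilobytes on Linux and in bytes on macOS
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    logger.info(f"Peak memory RSS: {usage.ru_maxrss / divisor:.1f} MB")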
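A rising RSS figure tells you that memory is growing but not where. The standard library's tracemalloc module can compare two snapshots and rank source lines by growth. The sketch below wraps that comparison; top_allocation_growth is a hypothetical helper name.

```python
import tracemalloc


def top_allocation_growth(before, after, limit: int = 5) -> list[str]:
    """Describe the source lines whose allocated memory changed the most."""
    # compare_to sorts by the absolute size difference, largest first.
    stats = after.compare_to(before, "lineno")
    return [str(stat) for stat in stats[:limit]]
```

A typical use is to call tracemalloc.start() at startup, take one snapshot with tracemalloc.take_snapshot() after warm-up and another after a period of traffic, then log the result to stderr. Tracing adds overhead, so it is usually enabled only while investigating a leak.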
Transport Disconnections
Unhandled Exceptions
Symptoms: The server disconnects mid-session, tools stop working, and the client reports the server is no longer available.
If any exception escapes your tool handler without being caught, the MCP server process may crash, terminating the stdio pipe or SSE connection.
import logging

# WRONG: Unhandled exceptions crash the server
async def handle_tool(name, arguments):
    result = 1 / arguments["divisor"]  # ZeroDivisionError crashes server
    return str(result)

# CORRECT: Catch exceptions and return error results
# (ToolResult is a placeholder for your SDK's tool result type)
async def handle_tool(name, arguments):
    try:
        result = 1 / arguments["divisor"]
        return ToolResult(content=str(result))
    except ZeroDivisionError:
        return ToolResult(
            content="Error: division by zero",
            is_error=True
        )
    except Exception as e:
        logging.error(f"Tool execution failed: {e}", exc_info=True)
        return ToolResult(
            content=f"Internal error: {type(e).__name__}",
            is_error=True
        )
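Repeating the same try/except in every handler is error-prone, so the pattern can be factored into a decorator. This is a sketch under assumptions: safe_tool is a hypothetical name, and the dict-shaped result stands in for whatever result type your SDK expects.

```python
import functools
import logging

logger = logging.getLogger("mcp.tools")


def safe_tool(handler):
    """Wrap an async tool handler so exceptions become error results."""
    @functools.wraps(handler)
    async def wrapper(*args, **kwargs):
        try:
            return {"is_error": False, "content": await handler(*args, **kwargs)}
        except Exception as exc:
            # Full traceback goes to the log; the client sees only the type name.
            logger.error("Tool %s failed", handler.__name__, exc_info=True)
            return {"is_error": True, "content": f"Internal error: {type(exc).__name__}"}
    return wrapper
```

Catching Exception rather than BaseException is deliberate: asyncio.CancelledError still propagates, so client-initiated cancellation keeps working.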
SSE Connection Drops
For SSE-based remote servers, connection drops can happen due to:
| Cause | Fix |
|---|---|
| Proxy timeout (nginx, Cloudflare) | Send SSE keepalive comments every 15-30s |
| Load balancer idle timeout | Configure sticky sessions, increase timeout |
| Client network change | Implement automatic reconnection with backoff |
| Server restart during deployment | Use graceful shutdown, drain connections |
SSE keepalive to prevent proxy timeouts:
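import asyncio

async def sse_keepalive(response):
    """Send periodic SSE comments to prevent proxy timeouts."""
    # Assumes a response object whose async write() accepts bytes,
    # such as aiohttp's StreamResponse; adapt to your framework.
    while True:
        await asyncio.sleep(15)
        await response.write(b": keepalive\n\n")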
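On the client side, the "automatic reconnection with backoff" fix from the table can be sketched as follows. Both functions are hypothetical helpers: connect is any coroutine function that raises OSError on failure, and the base, cap, and jitter values are illustrative defaults rather than recommendations from any MCP client.

```python
import asyncio
import random


def backoff_delays(base=1.0, cap=30.0, attempts=6, jitter=0.0, rng=random.random):
    """Exponential delays (base, 2*base, 4*base, ...) capped at cap seconds."""
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * 2 ** attempt)
        # Optional jitter spreads out reconnects from many clients.
        delay += delay * jitter * rng()
        delays.append(delay)
    return delays


async def reconnect(connect, delays):
    """Try connect() once per delay, sleeping after each failed attempt."""
    for delay in delays:
        try:
            return await connect()
        except OSError:
            await asyncio.sleep(delay)
    raise ConnectionError("gave up after all reconnection attempts")
```

A non-zero jitter matters most after a server restart, when every client would otherwise retry on the same schedule.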
Permission Errors
File System Access
Symptoms: Tools that read or write files return "permission denied" errors.
When running as a systemd service, Docker container, or different user, the MCP server may not have the same file permissions as your development user:
# Check what user the server is running as
ps aux | grep mcp-server
# Check file permissions
ls -la /path/to/target/file
# Fix: run the server as the correct user, or adjust file permissions
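The same checks can be run from inside the server process, which is the only place where the effective user and permissions are guaranteed to match what the tool sees. A small sketch, with describe_access as a hypothetical helper whose output you would log to stderr at startup:

```python
import getpass
import os


def describe_access(path: str) -> dict:
    """Report who the process runs as and what it can do with path."""
    try:
        user = getpass.getuser()
    except Exception:
        # Containers may run under a UID with no passwd entry.
        user = "unknown"
    return {
        "user": user,
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }
```

If exists is False for a path that is present on the host, the likely cause in Docker is a missing volume mount rather than a permission bit.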
Network Access
Symptoms: Tools that call external APIs fail with connection errors.
Firewall rules, container network policies, or security groups may block outbound requests from the server:
# Test connectivity from the server's environment
curl -v https://api.example.com/health
# Check iptables rules (Linux)
iptables -L -n
# Check Docker network
docker network inspect bridge
Debugging with stderr
Since stdout is reserved for JSON-RPC in stdio servers, all debugging output must go through stderr. Here is how to set up structured logging:
import json
import logging
import sys

class JSONFormatter(logging.Formatter):
    def format(self, record):
        log_entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            log_entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(log_entry)

# Configure root logger to stderr with JSON formatting
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(JSONFormatter())
logging.root.addHandler(handler)
logging.root.setLevel(logging.INFO)
For Claude Desktop, server stderr is captured in the log files at:
- macOS: ~/Library/Logs/Claude/
- Windows: %APPDATA%\Claude\logs\
For Cursor, check the Output panel and select the MCP server from the dropdown.
Quick Diagnosis Flowchart
When an MCP server is not working, follow this sequence:
- Can the client start the server process? If no, check the command path, executable permissions, and dependencies.
- Does the server start without errors? Run it manually and check stderr for import errors or configuration problems.
- Does stdout stay clean? Run the server and verify nothing appears on stdout before a client connects.
- Does the Inspector work? If the server works in the Inspector but not in your client, the issue is client configuration.
- Do tool calls succeed? If connection works but tools fail, check tool handler error handling and external service connectivity.
- Does it work initially but fail over time? Suspect memory leaks, connection pool exhaustion, or accumulating state.
What to Read Next
- Testing and Debugging MCP Servers -- the parent guide with the complete debugging workflow and Inspector usage
- Building an MCP Server with Python -- Python server implementation patterns that avoid common production errors
- Building an MCP Server with Node.js -- TypeScript server patterns with proper error handling
- Deploying Remote MCP Servers -- production deployment strategies that minimize runtime errors
- Browse MCP Servers -- find production-ready servers in the directory