Composability in MCP: Building Hierarchical AI Systems
How MCP's composable architecture enables agents that are both clients and servers, creating powerful hierarchical AI systems.
Composability is MCP's architectural property that allows any component to simultaneously act as both a client and a server, enabling hierarchical AI systems where specialized agents orchestrate other agents through the same standardized protocol. This is one of the most powerful and underutilized features of the Model Context Protocol.
While most MCP deployments use a simple flat architecture -- one host connecting to several independent servers -- composability enables advanced patterns for multi-agent systems, complex enterprise workflows, and orchestration platforms where AI agents collaborate to accomplish tasks no single agent could handle alone.
Understanding Composability
The Flat Model (Most Common)
The standard MCP deployment is flat: one host, multiple independent servers:
┌─────────────────────────────────────┐
│ HOST │
│ ┌───────┐ ┌───────┐ ┌───────┐ │
│ │Client │ │Client │ │Client │ │
│ │ A │ │ B │ │ C │ │
│ └───┬───┘ └───┬───┘ └───┬───┘ │
└──────┼─────────┼─────────┼─────────┘
│ │ │
┌────▼───┐ ┌──▼────┐ ┌──▼────┐
│GitHub │ │File │ │DB │
│Server │ │Server │ │Server │
└────────┘ └───────┘ └───────┘
Each server is independent. The host's AI model decides which tools to use and in what order. This works well for most scenarios.
The Composable Model
In a composable architecture, servers can also be clients to other servers, creating a hierarchy:
┌─────────────────────────────────┐
│ HOST │
│ ┌───────┐ │
│ │Client │ │
│ └───┬───┘ │
└─────────────┼───────────────────┘
│
┌────────▼─────────┐
│ Orchestrator │ ← Acts as SERVER to host
│ Server │ ← Acts as CLIENT to sub-servers
│ │
│ ┌─────┐ ┌─────┐ │
│ │Cl. 1│ │Cl. 2│ │
│ └──┬──┘ └──┬──┘ │
└─────┼───────┼────┘
│ │
┌─────▼──┐ ┌──▼─────┐
│GitHub │ │Docker │
│Server │ │Server │
└────────┘ └────────┘
The orchestrator is simultaneously:
- An MCP server that exposes high-level tools (like deploy_application) to the host
- An MCP client to the GitHub and Docker servers, using their tools to execute the deployment
Why This Matters
Composability enables patterns that flat architectures cannot achieve:
- Abstraction: Complex multi-tool workflows are abstracted into single high-level tools
- Specialization: Each server handles one domain, orchestrated by higher-level agents
- Reuse: The same GitHub server works in any composition -- unchanged
- Encapsulation: The orchestrator hides internal complexity from the host
- Independent evolution: Each layer can be updated independently
Composability Patterns
Pattern 1: Orchestrator Server
The most common composable pattern. An orchestrator server exposes high-level workflow tools while internally delegating to specialized servers.
Host → Orchestrator → [GitHub, Docker, Monitoring, Slack]
User says: "Deploy the latest changes to staging"
Orchestrator's "deploy" tool:
1. Calls GitHub server: check CI status on main branch
2. Calls GitHub server: merge PR to staging branch
3. Calls Docker server: build and push new image
4. Calls Docker server: update staging deployment
5. Calls Monitoring server: check health metrics
6. Calls Slack server: notify #engineering channel
Implementation:
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
// Create the orchestrator server
const server = new McpServer({
name: "deployment-orchestrator",
version: "1.0.0",
});
// Connect to sub-servers as a client
const githubClient = new Client({ name: "orchestrator", version: "1.0.0" });
const dockerClient = new Client({ name: "orchestrator", version: "1.0.0" });
const slackClient = new Client({ name: "orchestrator", version: "1.0.0" });
async function initSubClients() {
await githubClient.connect(new StdioClientTransport({
command: "npx",
args: ["-y", "@modelcontextprotocol/server-github"],
env: { GITHUB_PERSONAL_ACCESS_TOKEN: process.env.GITHUB_TOKEN! },
}));
await dockerClient.connect(new StdioClientTransport({
command: "node",
args: ["./docker-server.js"],
}));
await slackClient.connect(new StdioClientTransport({
command: "npx",
args: ["-y", "@modelcontextprotocol/server-slack"],
env: { SLACK_BOT_TOKEN: process.env.SLACK_TOKEN! },
}));
}
// Expose a high-level deployment tool
server.tool(
"deploy_to_staging",
"Deploy the latest changes to the staging environment. This will check CI, merge to staging, build a Docker image, deploy it, and notify the team.",
{
repo: z.string().describe("Repository in owner/repo format"),
branch: z.string().default("main").describe("Branch to deploy from"),
notifyChannel: z.string().default("#engineering").describe("Slack channel to notify"),
},
async ({ repo, branch, notifyChannel }) => {
const steps: string[] = [];
// Step 1: Check CI status
const [owner, repoName] = repo.split("/");
const ciResult = await githubClient.callTool({
  name: "get_branch_status",
  arguments: { owner, repo: repoName, branch },
});
steps.push(`CI Check: ${ciResult.content[0].text}`);
// Step 2: Merge to staging
const mergeResult = await githubClient.callTool({
  name: "merge_branches",
  arguments: { owner, repo: repoName, base: "staging", head: branch },
});
steps.push(`Merge: ${mergeResult.content[0].text}`);
// Step 3: Build Docker image
const buildResult = await dockerClient.callTool({
  name: "build_image",
  arguments: { tag: `${repo}:staging-latest`, context: "." },
});
steps.push(`Build: ${buildResult.content[0].text}`);
// Step 4: Deploy
const deployResult = await dockerClient.callTool({
  name: "update_service",
  arguments: { service: `${repoName}-staging`, image: `${repo}:staging-latest` },
});
steps.push(`Deploy: ${deployResult.content[0].text}`);
// Step 5: Notify
await slackClient.callTool({
  name: "send_message",
  arguments: {
    channel: notifyChannel,
    text: `Deployed ${repo} (${branch}) to staging. Steps:\n${steps.join("\n")}`,
  },
});
steps.push(`Notification sent to ${notifyChannel}`);
return {
content: [{
type: "text",
text: `Deployment complete!\n\n${steps.map((s, i) => `${i + 1}. ${s}`).join("\n")}`,
}],
};
}
);
// Start everything
await initSubClients();
const transport = new StdioServerTransport();
await server.connect(transport);
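Once the orchestrator runs as a stdio server, the host configures it like any other single server -- the hierarchy behind it is invisible. A hypothetical Claude Desktop entry (the script path and env variable names are placeholders for your own setup) might look like:

```json
{
  "mcpServers": {
    "deployment-orchestrator": {
      "command": "node",
      "args": ["./deployment-orchestrator.js"],
      "env": {
        "GITHUB_TOKEN": "<your-github-token>",
        "SLACK_TOKEN": "<your-slack-token>"
      }
    }
  }
}
```

From the host's perspective there is exactly one server exposing one deploy_to_staging tool; the GitHub, Docker, and Slack connections are an internal detail of the orchestrator process.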
Pattern 2: Delegation Chain
A chain of specialized agents, each adding a layer of capability:
Host → Research Agent → [Web Search, Document Reader, Knowledge Base]
└──► Analysis Agent → [Database, Calculator, Visualizer]
import os

from mcp.server.fastmcp import FastMCP
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
mcp = FastMCP("research-agent")
# This agent can search the web and then delegate analysis
@mcp.tool()
async def research_and_analyze(topic: str, depth: str = "standard") -> str:
"""Research a topic and provide analysis with data.
Searches the web for information, then delegates to the
analysis agent for data-driven insights.
Args:
topic: The topic to research
depth: Research depth (quick, standard, deep)
"""
# Use web search sub-server
search_params = StdioServerParameters(
command="npx",
args=["-y", "@modelcontextprotocol/server-brave-search"],
env={"BRAVE_API_KEY": os.environ["BRAVE_API_KEY"]},
)
async with stdio_client(search_params) as (read, write):
async with ClientSession(read, write) as search_session:
await search_session.initialize()
# Search for the topic
search_result = await search_session.call_tool(
"brave_web_search",
{"query": topic, "count": 10}
)
# Delegate to analysis agent for data processing
analysis_params = StdioServerParameters(
command="python",
args=["analysis_agent.py"],
)
async with stdio_client(analysis_params) as (read, write):
async with ClientSession(read, write) as analysis_session:
await analysis_session.initialize()
analysis_result = await analysis_session.call_tool(
"analyze_data",
{
"raw_data": search_result.content[0].text,
"analysis_type": "trend_analysis",
}
)
return f"""Research Results for: {topic}
## Web Research
{search_result.content[0].text}
## Analysis
{analysis_result.content[0].text}"""
Pattern 3: Fan-Out / Fan-In
An orchestrator that dispatches tasks to multiple agents in parallel and aggregates results:
┌──────────────────┐
│ Orchestrator │
└──────┬───────────┘
│
┌─────────────┼─────────────┐
│ │ │
┌────▼───┐ ┌────▼───┐ ┌────▼───┐
│ Code │ │Security│ │ Perf │
│ Review │ │ Audit │ │ Check │
│ Agent │ │ Agent │ │ Agent │
└────────┘ └────────┘ └────────┘
│ │ │
└─────────────┼─────────────┘
│
┌──────▼───────────┐
│ Aggregated │
│ Report │
└──────────────────┘
server.tool(
"comprehensive_code_review",
"Run a comprehensive code review with parallel specialized checks",
{
prNumber: z.number().describe("Pull request number"),
repo: z.string().describe("Repository in owner/repo format"),
},
async ({ prNumber, repo }) => {
// Get the PR diff first
const [owner, repoName] = repo.split("/");
const diff = await githubClient.callTool({
  name: "get_pull_request_diff",
  arguments: { owner, repo: repoName, pull_number: prNumber },
});
// Fan out to specialized review agents in parallel
const [codeReview, securityAudit, perfCheck] = await Promise.all([
  codeReviewClient.callTool({
    name: "review_code",
    arguments: { diff: diff.content[0].text, focus: "correctness" },
  }),
  securityClient.callTool({
    name: "audit_code",
    arguments: { diff: diff.content[0].text, severity: "all" },
  }),
  perfClient.callTool({
    name: "check_performance",
    arguments: { diff: diff.content[0].text, benchmarks: true },
  }),
]);
// Fan in: aggregate results
return {
content: [{
type: "text",
text: `## Comprehensive Review of PR #${prNumber}
### Code Quality
${codeReview.content[0].text}
### Security Audit
${securityAudit.content[0].text}
### Performance Check
${perfCheck.content[0].text}
### Summary
Review complete. See individual sections above for details.`,
}],
};
}
);
Pattern 4: Pipeline
Sequential processing where each agent transforms and passes data to the next:
Host → Extraction Agent → Transformation Agent → Loading Agent → Validation Agent
Data Pipeline:
1. Extract: Pull data from source systems
2. Transform: Clean, normalize, enrich
3. Load: Write to target systems
4. Validate: Verify data integrity
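The essence of the pipeline pattern is function composition over async stages. The sketch below models each stage as an async function; in a real orchestrator each stage would be a callTool invocation against the corresponding agent server, and the stage names and record shapes here are purely illustrative:

```typescript
// A pipeline stage takes one input and produces one output asynchronously.
type Stage<I, O> = (input: I) => Promise<O>;

// Compose two stages left-to-right: the output of the first feeds the second.
// Longer pipelines are built by chaining: pipeline(pipeline(a, b), c).
function pipeline<A, B, C>(first: Stage<A, B>, second: Stage<B, C>): Stage<A, C> {
  return async (input) => second(await first(input));
}

// Stand-in for the Extraction Agent: pull raw rows from a source system.
const extract: Stage<string, string[]> = async (source) =>
  [`${source}:row1`, `${source}:row2`];

// Stand-in for the Transformation Agent: normalize the extracted rows.
const transform: Stage<string[], string[]> = async (rows) =>
  rows.map((row) => row.toUpperCase());

// The composed pipeline behaves like a single high-level tool.
const runPipeline = pipeline(extract, transform);
```

The design choice worth noting: because each stage only sees its input and output types, stages can be swapped or re-ordered without touching their neighbors -- the same property that lets MCP servers be rearranged in a composition.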
Pattern 5: Supervisor
A supervisor agent that monitors other agents and intervenes when needed:
┌────────────────┐
│ Supervisor │
│ Agent │
└───┬──────┬────┘
│ │
monitors │ │ monitors
│ │
┌──────▼┐ ┌──▼─────┐
│Worker │ │Worker │
│Agent 1│ │Agent 2 │
└───────┘ └────────┘
Supervisor can:
- Monitor worker progress
- Redistribute tasks if a worker fails
- Aggregate results when workers complete
- Escalate issues that workers cannot handle
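The redistribution behavior can be sketched in a few lines. Here workers are modeled as plain async functions; a real supervisor would invoke worker agents over MCP client connections, so the function signatures are assumptions for illustration:

```typescript
// A worker accepts a task description and returns a result.
type Worker = (task: string) => Promise<string>;

// Run the task on the primary worker; if it fails, redistribute the
// task to a backup worker instead of surfacing the failure upward.
async function supervise(
  task: string,
  primary: Worker,
  backup: Worker
): Promise<string> {
  try {
    return await primary(task);
  } catch {
    // Redistribution: the supervisor reassigns the failed task.
    return await backup(task);
  }
}

// A worker that always fails, and one that always succeeds.
const flakyWorker: Worker = async () => {
  throw new Error("worker 1 unavailable");
};
const steadyWorker: Worker = async (task) => `done: ${task}`;
```

A production supervisor would add retry budgets and escalation (raising an error to the host once all workers are exhausted), but the core shape is this try/redistribute loop.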
Sampling: AI Reasoning in the Hierarchy
What Sampling Enables
The sampling capability allows an MCP server to request LLM completions through its client connection. This is powerful for composability because middle-tier servers can leverage AI reasoning without hosting their own model:
┌──────────┐ ┌─────────────┐ ┌──────────┐
│ Host │ │Orchestrator │ │Tool │
│ + LLM │◄────│Server │◄────│Server │
│ │ │(no LLM) │ │ │
└──────────┘ └─────────────┘ └──────────┘
Tool Server calls tool → result
Orchestrator needs to interpret result using AI
Orchestrator sends sampling request → travels up to Host
Host generates LLM completion → sends back down
Orchestrator uses the completion in its workflow
Sampling Flow
// Server sends sampling request to client
{
"jsonrpc": "2.0",
"id": 10,
"method": "sampling/createMessage",
"params": {
"messages": [
{
"role": "user",
"content": {
"type": "text",
"text": "Analyze this error log and determine if it indicates a critical issue:\n\n[error log contents]"
}
}
],
"maxTokens": 500,
"systemPrompt": "You are a log analysis expert. Classify the severity of errors."
}
}
// Client (via host) returns the LLM completion
{
"jsonrpc": "2.0",
"id": 10,
"result": {
"role": "assistant",
"content": {
"type": "text",
"text": "CRITICAL: This error indicates a database connection pool exhaustion..."
},
"model": "claude-sonnet-4-20250514"
}
}
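A small helper makes the request shape concrete. This builds the raw JSON-RPC payload shown above; the official SDKs wrap this for you, so treat the function as an illustration of the wire format rather than an SDK API:

```typescript
// The shape of a sampling/createMessage request, per the MCP sampling flow
// shown above. Only the fields used in the example are modeled here.
interface SamplingRequest {
  jsonrpc: "2.0";
  id: number;
  method: "sampling/createMessage";
  params: {
    messages: { role: "user"; content: { type: "text"; text: string } }[];
    maxTokens: number;
    systemPrompt?: string;
  };
}

// Build a single-turn sampling request asking the host's model for help.
function buildSamplingRequest(
  id: number,
  prompt: string,
  maxTokens: number,
  systemPrompt?: string
): SamplingRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "sampling/createMessage",
    params: {
      messages: [{ role: "user", content: { type: "text", text: prompt } }],
      maxTokens,
      systemPrompt,
    },
  };
}
```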
Human-in-the-Loop
The host can require user consent before completing sampling requests. This ensures that the user remains in control of what the AI generates, even when the request originates from a server deep in the hierarchy:
Server → Client: sampling/createMessage(...)
Client → Host: "Server 'monitoring' wants AI to analyze error logs. Allow?"
Host → User: [Consent dialog]
User → Host: "Yes, allow"
Host → Model: Generate completion
Model → Host: Completion
Host → Client → Server: Result
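The consent gate itself is a simple control-flow wrapper. The sketch below simulates it with callback stand-ins (askUser and callModel are hypothetical, not MCP SDK functions): the host refuses to forward the sampling request unless the user approves.

```typescript
// Gate a sampling request on explicit user consent before it reaches the
// model. askUser stands in for a consent dialog; callModel stands in for
// the host's LLM invocation.
async function handleSamplingRequest(
  serverName: string,
  prompt: string,
  askUser: (question: string) => Promise<boolean>,
  callModel: (prompt: string) => Promise<string>
): Promise<string> {
  const allowed = await askUser(
    `Server '${serverName}' wants AI to process: "${prompt}". Allow?`
  );
  if (!allowed) {
    // Denial propagates back down the hierarchy as an error.
    throw new Error("Sampling request denied by user");
  }
  return callModel(prompt);
}
```

The key property: the server deep in the hierarchy never talks to the model directly, so a denial at the host cleanly blocks the request no matter how many layers it traversed.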
Real-World Composability Examples
Example 1: Enterprise Data Pipeline
Host (Claude Desktop)
└── Data Pipeline Orchestrator
├── Salesforce MCP Server (extract CRM data)
├── PostgreSQL MCP Server (extract operational data)
├── Data Transform Server (clean, normalize, join)
├── BigQuery MCP Server (load into warehouse)
└── Slack MCP Server (send pipeline completion report)
Example 2: Automated Code Review Platform
Host (Custom CI/CD Integration)
└── Code Review Orchestrator
├── GitHub Server (fetch PR details, post comments)
├── Static Analysis Server (run linters, type checkers)
├── Security Scanner Server (vulnerability detection)
├── Test Runner Server (execute test suite)
└── Documentation Server (check doc coverage)
Example 3: Customer Support Agent
Host (Support Platform)
└── Support Orchestrator
├── CRM Server (customer history, account details)
├── Knowledge Base Server (search help articles)
├── Ticket System Server (create, update, close tickets)
├── Product Server (check feature flags, known issues)
└── Escalation Server (route to human agents when needed)
Design Principles for Composable Systems
1. Acyclic Dependencies
Never create circular dependencies between servers:
// BAD: Circular dependency
Server A → Server B → Server A (infinite loop!)
// GOOD: Acyclic hierarchy
Host → Orchestrator → Server A
→ Server B
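An orchestrator can verify this property mechanically. The sketch below runs a depth-first search over a composition graph (server names are hypothetical) and reports whether any server can reach itself through its downstream connections:

```typescript
// Detect cycles in a composition graph. `edges` maps each server to the
// servers it connects to as a client.
function hasCycle(edges: Record<string, string[]>): boolean {
  const visiting = new Set<string>(); // nodes on the current DFS path
  const done = new Set<string>();     // nodes fully explored

  function visit(node: string): boolean {
    if (done.has(node)) return false;
    if (visiting.has(node)) return true; // back-edge: node reaches itself
    visiting.add(node);
    for (const next of edges[node] ?? []) {
      if (visit(next)) return true;
    }
    visiting.delete(node);
    done.add(node);
    return false;
  }

  return Object.keys(edges).some((node) => visit(node));
}
```

Running this check at composition time (for example, when loading an orchestrator's configuration) turns the "infinite loop" failure mode into an immediate startup error.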
2. Clear Responsibility Boundaries
Each server should have a single, well-defined domain:
// BAD: Server does everything
"super-server" → GitHub + Docker + Slack + Database + Files
// GOOD: Single responsibility
"github-server" → GitHub only
"docker-server" → Docker only
"slack-server" → Slack only
3. Idempotent Tool Design
Tools in composable systems should be idempotent when possible. If an orchestrator retries a failed step, the tool should produce the same result:
@mcp.tool()
async def create_or_update_issue(repo: str, title: str, body: str) -> str:
"""Create a new issue or update if one with the same title exists.
This is idempotent — calling it multiple times with the same title
will update the existing issue rather than creating duplicates.
"""
existing = await find_issue_by_title(repo, title)
if existing:
return await update_issue(repo, existing["number"], body)
else:
return await create_issue(repo, title, body)
4. Error Propagation
Errors should propagate up the hierarchy with enough context for the orchestrator to make recovery decisions:
server.tool("step_in_workflow", "...", schema, async (params) => {
  try {
    return await subClient.callTool({ name: "do_something", arguments: params });
  } catch (error: any) {
    return {
      content: [{
        type: "text",
        text: `Step failed: ${error.message}\n\nRecovery options:\n1. Retry (transient error)\n2. Skip this step\n3. Abort the workflow`,
      }],
      isError: true,
    };
  }
});
5. Depth Limiting
Prevent deeply nested compositions that become hard to debug:
const MAX_DEPTH = 5;

async function callWithDepthCheck(client, tool, args, currentDepth) {
  if (currentDepth >= MAX_DEPTH) {
    throw new Error(`Maximum composition depth (${MAX_DEPTH}) reached`);
  }
  return await client.callTool({
    name: tool,
    arguments: { ...args, _depth: currentDepth + 1 },
  });
}
When to Use Composability
Good Candidates for Composable Architecture
| Scenario | Why Composability Helps |
|---|---|
| Multi-step deployment pipelines | Abstracts complex workflows into single tools |
| Cross-domain data processing | Each domain has its own specialized server |
| Enterprise workflow automation | Encapsulates business logic in orchestration layers |
| Multi-agent AI systems | Agents collaborate through standard protocol |
| CI/CD and DevOps automation | Tools chain naturally (test, build, deploy, monitor) |
When to Stay Flat
| Scenario | Why Flat Is Simpler |
|---|---|
| Individual developer tools | No need for orchestration |
| Single-domain tools | One server covers everything needed |
| Simple tool collections | Host's AI model handles orchestration fine |
| Prototype or MVP | Add composability when complexity demands it |
Summary
MCP's composability is a powerful architectural property that enables hierarchical AI systems, multi-agent collaboration, and complex workflow orchestration. By allowing any component to act as both client and server, MCP creates a protocol that scales from simple single-server setups to sophisticated enterprise automation platforms.
Most users should start with flat architectures and adopt composability when their use cases demand it. When that time comes, MCP's composable design means they can add hierarchy without redesigning their existing servers.
Continue learning:
- MCP Architecture -- The foundation composability builds on
- MCP for AI Agents -- Agent workflows powered by composability
- Why MCP Matters -- The strategic value of a composable standard
- Browse MCP Servers -- Find servers to compose
Frequently Asked Questions
What is composability in MCP?
Composability in MCP means that any MCP component can simultaneously act as both a client and a server. An MCP server can also be an MCP client to other servers, creating hierarchical chains where one agent orchestrates others. This enables multi-agent systems where specialized agents collaborate through the same standardized protocol.
How can an MCP server also be a client?
An MCP server exposes tools to upstream clients while maintaining its own MCP client connections to downstream servers. For example, an orchestrator server might expose a 'deploy_application' tool to the user's AI assistant, and internally use MCP client connections to a GitHub server, a Docker server, and a monitoring server to execute the deployment.
What is a hierarchical MCP architecture?
A hierarchical MCP architecture is one where MCP servers are arranged in layers. The top-level host connects to orchestrator servers, which in turn connect to specialized tool servers. This creates a tree structure where each level delegates to the level below it, enabling complex multi-step workflows while maintaining clean separation of concerns.
How does MCP support multi-agent systems?
MCP supports multi-agent systems by allowing each agent to act as an MCP server (exposing its capabilities to other agents) and an MCP client (using tools from other agents). Agents can discover each other's capabilities through standard MCP tool discovery, delegate tasks to specialized agents, and aggregate results — all through the same protocol.
What are the benefits of composable MCP architectures?
Benefits include separation of concerns (each server handles one domain), reusability (a GitHub server works in any composition), independent scaling (scale only the servers that need it), easier testing (test each server in isolation), and flexible orchestration (rearrange the hierarchy without rewriting servers).
Can MCP composability create infinite loops?
Yes, if not designed carefully. If Server A calls Server B which calls Server A, an infinite loop occurs. This is prevented through good architectural design: hierarchies should be acyclic (no circular dependencies), and servers should have clear upstream/downstream relationships. Some implementations add depth limits or cycle detection.
What is the sampling capability and how does it relate to composability?
Sampling is an MCP capability that allows a server to request LLM completions through the client. This is powerful for composability because a server in the middle of a hierarchy can ask for AI reasoning without hosting its own model. The request travels up the chain to the host, which generates the completion and sends it back down.
Is composability required for using MCP?
No. Most MCP deployments use a simple flat architecture: one host connecting to several independent servers. Composability is an advanced pattern for complex use cases like multi-agent systems, enterprise workflows, and orchestration platforms. Simple setups work perfectly well without any composability.