Browser & Automation MCP Servers (Playwright, Puppeteer for Agents)
Browser automation MCP servers — Playwright, Puppeteer, Selenium integrations that let AI agents browse the web, test UIs, and extract data.
Browser automation MCP servers give AI agents the ability to interact with the web -- navigating websites, filling forms, clicking buttons, extracting data, and testing user interfaces. Built on battle-tested frameworks like Playwright and Puppeteer, these servers transform AI assistants from text-only tools into web-capable agents that can accomplish tasks across the internet.
This guide covers everything about browser automation MCP servers: how they work, which options are available, how to set them up, and the workflows they enable for web scraping, testing, and autonomous agent tasks.
How Browser Automation MCP Servers Work
Browser automation MCP servers operate by running a real web browser (Chromium, Firefox, or WebKit) and exposing browser control operations as MCP tools. When an AI assistant needs to interact with a web page, it calls these tools to:
- Navigate to a URL
- Read the page content (as an accessibility tree or structured text)
- Interact with elements (click, type, select)
- Capture screenshots or page state
- Extract structured data from the page
The key innovation is how pages are represented to the AI. Rather than sending raw HTML (which is verbose and hard to reason about), browser MCP servers typically convert the page into an accessibility tree -- a simplified representation that mirrors how screen readers see the page, with semantic labels for interactive elements.
Page: https://example.com/login
[1] heading "Sign In"
[2] textbox "Email address"
[3] textbox "Password" (password)
[4] checkbox "Remember me"
[5] button "Sign In"
[6] link "Forgot password?"
The AI then references elements by their index numbers: "Click element [5]" or "Type into element [2]".
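The conversion from tree to numbered lines can be sketched in a few lines of Python. The node shape below (`role`, `name`, `children`) is a simplified, hypothetical stand-in for whatever structure a given server actually uses:

```python
def render_tree(node, lines=None, counter=None):
    """Flatten a nested accessibility tree into numbered lines.

    Node shape is a hypothetical simplification: each node has a
    'role', a 'name', and an optional list of 'children'.
    """
    if lines is None:
        lines, counter = [], [0]
    # Skip purely structural roles; only interactive/semantic nodes get an index
    if node["role"] not in ("document", "generic"):
        counter[0] += 1
        lines.append(f'[{counter[0]}] {node["role"]} "{node["name"]}"')
    for child in node.get("children", []):
        render_tree(child, lines, counter)
    return lines

page = {
    "role": "document", "name": "Sign In",
    "children": [
        {"role": "heading", "name": "Sign In"},
        {"role": "textbox", "name": "Email address"},
        {"role": "button", "name": "Sign In"},
    ],
}
```

The AI can then act on stable indices (`click [3]`) instead of brittle CSS selectors.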
Playwright MCP Server
Playwright is the most popular browser automation framework for MCP, maintained by Microsoft. The Playwright MCP server provides comprehensive browser control.
Installation and Setup
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": ["-y", "@playwright/mcp-server"],
"env": {
"PLAYWRIGHT_HEADLESS": "true"
}
}
}
}
For headed mode (visible browser window for debugging):
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": ["-y", "@playwright/mcp-server", "--headed"]
}
}
}
Available Tools
| Tool | Description |
|---|---|
| navigate | Navigate to a URL |
| screenshot | Take a screenshot of the current page |
| click | Click an element by index or selector |
| type | Type text into an input element |
| fill | Fill a form field (clears existing content first) |
| select | Select an option from a dropdown |
| hover | Hover over an element |
| scroll | Scroll the page or a specific element |
| get_text | Get the text content of the page or element |
| get_accessibility_tree | Get the accessibility tree of the page |
| evaluate | Execute JavaScript in the page context |
| wait_for | Wait for an element or condition |
| go_back | Navigate back in browser history |
| go_forward | Navigate forward |
| new_tab | Open a new browser tab |
| close_tab | Close the current tab |
| list_tabs | List all open tabs |
Playwright Configuration Options
| Option | Description | Default |
|---|---|---|
| --headed | Show browser window | Headless |
| --browser chromium / firefox / webkit | Browser engine | Chromium |
| --viewport 1280x720 | Browser viewport size | 1280x720 |
| --user-data-dir /path | Persistent browser profile | Temporary |
| --no-sandbox | Disable sandbox (Linux containers) | Sandbox enabled |
| --allowed-domains example.com,*.test.com | Restrict navigation | All domains |
Example: Web Research Workflow
User: "Research the latest MCP server releases on GitHub"
Claude's workflow:
1. navigate("https://github.com/modelcontextprotocol/servers")
2. get_accessibility_tree() — understand page structure
3. click(releases_link) — navigate to releases
4. get_text() — extract release information
5. navigate("https://github.com/topics/mcp-server")
6. get_text() — find popular MCP server repositories
7. Compile findings into a structured summary
Example: Form Automation
User: "Fill out the contact form on our staging site
with test data"
Claude's workflow:
1. navigate("https://staging.example.com/contact")
2. get_accessibility_tree() — identify form fields
3. fill([2], "Test User") — name field
4. fill([3], "test@example.com") — email field
5. fill([4], "This is a test submission") — message field
6. screenshot() — capture filled form for review
7. (Await user confirmation before submitting)
8. click([5]) — submit button
Puppeteer MCP Server
Puppeteer, maintained by Google, provides Chrome/Chromium-specific browser automation:
Setup
{
"mcpServers": {
"puppeteer": {
"command": "npx",
"args": ["-y", "mcp-server-puppeteer"],
"env": {
"PUPPETEER_HEADLESS": "true"
}
}
}
}
Key Tools
| Tool | Description |
|---|---|
| puppeteer_navigate | Navigate to a URL |
| puppeteer_screenshot | Take a screenshot |
| puppeteer_click | Click an element (CSS selector) |
| puppeteer_fill | Fill a form field |
| puppeteer_evaluate | Execute JavaScript |
| puppeteer_select | Select dropdown option |
| puppeteer_hover | Hover over element |
When to Choose Puppeteer Over Playwright
| Consideration | Playwright | Puppeteer |
|---|---|---|
| Browser support | Chromium, Firefox, WebKit | Chrome/Chromium only |
| Multi-browser testing | Excellent | Not applicable |
| Resource usage | Higher (multi-engine) | Lower (single engine) |
| Chrome DevTools Protocol | Via bridge | Native support |
| Mobile emulation | Full emulation profiles | Basic emulation |
| API complexity | Higher (more features) | Simpler API |
| Best for | Cross-browser testing, complex workflows | Chrome-specific tasks, lighter usage |
Specialized Browser MCP Servers
Beyond the main frameworks, several specialized browser MCP servers exist for specific use cases:
Browserbase MCP Server
Browserbase provides cloud-hosted browser instances for scalable web automation:
{
"mcpServers": {
"browserbase": {
"command": "npx",
"args": ["-y", "mcp-server-browserbase"],
"env": {
"BROWSERBASE_API_KEY": "your_api_key",
"BROWSERBASE_PROJECT_ID": "your_project_id"
}
}
}
}
Advantages:
- No local browser installation required
- Scalable to many concurrent sessions
- Built-in proxy and anti-detection features
- Session recording and replay
Firecrawl MCP Server
Firecrawl specializes in web scraping and content extraction:
{
"mcpServers": {
"firecrawl": {
"command": "npx",
"args": ["-y", "mcp-server-firecrawl"],
"env": {
"FIRECRAWL_API_KEY": "your_api_key"
}
}
}
}
Key Features:
- Crawl entire websites with depth control
- Convert pages to clean Markdown
- Extract structured data with AI
- Handle JavaScript-rendered content
- Respect robots.txt and rate limits
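Respecting robots.txt is straightforward with Python's standard library. A minimal sketch — the rules string here is illustrative, and a real crawler would fetch the file from the target site's root:

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a crawl target against robots.txt rules before fetching.

    In a real crawler, robots_txt would be downloaded from
    https://<site>/robots.txt rather than passed in as a string.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Illustrative policy: everything allowed except /private/
rules = """
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""
```

Calling `allowed(rules, "my-bot", url)` before each navigation keeps the scraper within the site's stated policy.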
Fetch/HTTP MCP Server
For simpler HTTP requests without full browser rendering:
{
"mcpServers": {
"fetch": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-fetch"]
}
}
}
Use Cases:
- Fetching API responses
- Reading static web pages
- Downloading files
- Checking URL availability
Browser Automation Comparison
| Feature | Playwright MCP | Puppeteer MCP | Browserbase | Firecrawl | Fetch |
|---|---|---|---|---|---|
| Type | Full browser | Full browser | Cloud browser | Web scraper | HTTP client |
| JavaScript | Full rendering | Full rendering | Full rendering | Full rendering | No rendering |
| Interaction | Full (click, type, etc.) | Full | Full | Limited | None |
| Screenshots | Yes | Yes | Yes | No | No |
| Multi-browser | Yes | Chrome only | Chrome | N/A | N/A |
| Concurrency | Local instances | Local instances | Cloud-scaled | API-scaled | Lightweight |
| Best For | Testing, complex automation | Chrome-specific tasks | Scale, cloud | Content extraction | Simple fetches |
Use Case: AI-Powered Web Testing
Browser MCP servers enable powerful testing workflows when combined with AI capabilities.
Exploratory Testing
The AI autonomously explores a web application, looking for bugs:
User: "Explore our e-commerce staging site and report any issues"
Claude's workflow:
1. navigate("https://staging.example.com")
2. get_accessibility_tree() — understand the homepage
3. Test navigation: click through main menu items
4. Test search: fill search box, verify results
5. Test product pages: click products, check images, prices
6. Test cart: add items, verify quantities, check totals
7. Test forms: fill contact form, validate error messages
8. screenshot() at each step — document findings
9. Compile a test report with issues found
Accessibility Testing
User: "Test our website for accessibility issues"
Claude's workflow:
1. navigate("https://example.com")
2. get_accessibility_tree() — analyze semantic structure
3. Inject the axe-core library into the page, then evaluate("axe.run()") — run accessibility audit
4. Check for missing alt text, improper heading hierarchy,
color contrast issues, keyboard navigation
5. Navigate through pages and test interactive elements
6. Generate an accessibility report with WCAG compliance status
Visual Regression Testing
User: "Compare the production and staging versions of our homepage"
Claude's workflow:
1. navigate("https://example.com") → screenshot("production.png")
2. navigate("https://staging.example.com") → screenshot("staging.png")
3. Compare screenshots and identify visual differences
4. Report layout shifts, missing elements, or style changes
Use Case: Web Data Extraction
Structured Data Extraction
User: "Extract all product listings from this category page"
Claude's workflow:
1. navigate(url)
2. get_accessibility_tree() — understand page structure
3. evaluate("document.querySelectorAll('.product-card')") — find products
4. For each product: extract name, price, rating, availability
5. Handle pagination: click "Next" and repeat
6. Return structured data as JSON/CSV
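Step 4 — turning raw markup into records — can be sketched with the standard-library HTML parser. The `.product-card`, `.name`, and `.price` class names are hypothetical and would need to match the target site's actual markup:

```python
from html.parser import HTMLParser

class ProductParser(HTMLParser):
    """Collect name/price pairs from a simplified .product-card layout."""

    def __init__(self):
        super().__init__()
        self.products = []   # one dict per product card
        self._field = None   # field the next text node belongs to

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "")
        if "product-card" in classes:
            self.products.append({})       # start a new record
        elif classes in ("name", "price"):
            self._field = classes          # next data goes into this field

    def handle_data(self, data):
        if self._field and self.products:
            self.products[-1][self._field] = data.strip()
            self._field = None

html = """
<div class="product-card"><span class="name">Widget</span>
<span class="price">$9.99</span></div>
<div class="product-card"><span class="name">Gadget</span>
<span class="price">$24.00</span></div>
"""
parser = ProductParser()
parser.feed(html)
```

In practice the HTML would come from `get_text` or `evaluate` calls against the live page; the parsing step is the same either way.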
Competitive Intelligence
User: "Check competitor pricing for these 5 products"
Claude's workflow:
1. For each competitor website:
a. navigate(product_url)
b. get_text() — extract pricing information
c. screenshot() — capture for reference
2. Compile comparison table with prices across competitors
3. Highlight significant price differences
Use Case: AI Agent Web Workflows
Browser MCP servers are essential building blocks for AI agents that need to interact with the web.
Multi-Step Web Workflows
User: "Book a meeting room for tomorrow 2-3pm through our
internal booking system"
Claude's workflow:
1. navigate("https://rooms.company.com")
2. Authenticate (using pre-saved session)
3. Select tomorrow's date
4. Find available rooms for 2-3pm
5. Select the best option
6. Fill booking details
7. screenshot() — confirm before submitting
8. Submit the booking (with user approval)
Form Filling and Submission
AI agents can handle repetitive form-filling tasks:
- Expense report submissions
- IT ticket creation across multiple portals
- Data entry into web-based systems
- Survey completion for testing purposes
Security Best Practices
Domain Restrictions
Restrict which domains the browser can access:
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"-y", "@playwright/mcp-server",
"--allowed-domains", "example.com,*.example.com,staging.example.com"
]
}
}
}
Authentication Safety
- Never pass passwords through AI conversations
- Use pre-authenticated browser profiles with saved sessions
- Implement session timeout and re-authentication flows
- Store cookies and auth tokens securely
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"-y", "@playwright/mcp-server",
"--user-data-dir", "/path/to/authenticated-profile"
]
}
}
}
Content Safety
- Filter responses for sensitive data (credit card numbers, SSNs)
- Block navigation to known malicious domains
- Disable file downloads by default
- Monitor and log all URLs accessed
Resource Isolation
- Run browser instances in containers or sandboxes
- Limit CPU and memory allocation per instance
- Close idle browser sessions automatically
- Use separate browser profiles for different security contexts
Performance Optimization
Headless Mode
Always use headless mode in production for better performance:
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": ["-y", "@playwright/mcp-server", "--headless"]
}
}
}
Headless mode can substantially reduce resource usage compared to headed mode (figures around 30-50% are commonly cited), since nothing is rendered to a visible window.
Page Load Optimization
- Block unnecessary resources (images, fonts, analytics) when only extracting text
- Use wait_for with specific conditions rather than fixed timeouts
- Reuse browser instances across multiple page navigations
- Close tabs when done to free memory
Caching Strategies
- Cache frequently accessed pages to reduce browser invocations
- Store extracted data to avoid re-scraping unchanged pages
- Use conditional navigation (check if data changed before full scrape)
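A content-hash cache like the one described above might look like this. Hashes are kept in memory for the sketch; a real implementation would persist them to disk or a database:

```python
import hashlib

class ScrapeCache:
    """Skip re-processing pages whose content hash is unchanged."""

    def __init__(self):
        self._hashes = {}  # url -> sha256 hex digest of last-seen content

    def changed(self, url: str, content: str) -> bool:
        digest = hashlib.sha256(content.encode()).hexdigest()
        if self._hashes.get(url) == digest:
            return False          # unchanged: skip extraction entirely
        self._hashes[url] = digest
        return True

cache = ScrapeCache()
```

The agent fetches the page cheaply (or just its text), and only runs the expensive extraction and analysis steps when `changed()` returns True.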
Troubleshooting
Common Issues
| Issue | Cause | Solution |
|---|---|---|
| Browser fails to launch | Missing dependencies | Install Playwright browsers: npx playwright install |
| Page loads but elements not found | Content loaded dynamically | Use wait_for tool to wait for elements |
| Screenshots are blank | Headless rendering issue | Try --headed mode for debugging |
| Timeout errors | Slow page load or network | Increase timeout settings |
| Memory errors | Too many browser instances | Close unused tabs and pages |
| Element not clickable | Overlapping elements | Use evaluate to scroll element into view |
Debugging Tips
- Switch to headed mode to see what the browser is doing
- Take screenshots at each step to verify page state
- Use get_accessibility_tree to understand element indices
- Check browser console logs with evaluate("console.log")
- Verify the page has fully loaded before interacting
Advanced Patterns
Multi-Tab Workflows
Browser MCP servers support multiple tabs for complex workflows:
User: "Compare the pricing pages of these three competitors"
Claude's workflow:
1. new_tab() → Tab 1
2. navigate("https://competitor1.com/pricing")
3. get_text() → extract pricing data
4. new_tab() → Tab 2
5. navigate("https://competitor2.com/pricing")
6. get_text() → extract pricing data
7. new_tab() → Tab 3
8. navigate("https://competitor3.com/pricing")
9. get_text() → extract pricing data
10. Compile comparison table from all three tabs
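Step 10 — compiling the per-tab results — might look like this, assuming each tab's extraction produced a plan-to-price dict (the shape is an assumption for illustration):

```python
def comparison_table(data: dict) -> str:
    """Render per-competitor pricing dicts (one per tab) as a Markdown
    table, with plans as rows and competitors as columns."""
    competitors = list(data)
    # Union of all plan names across competitors, in a stable order
    plans = sorted({p for prices in data.values() for p in prices})
    rows = ["| Plan | " + " | ".join(competitors) + " |",
            "|---|" + "---|" * len(competitors)]
    for plan in plans:
        cells = [data[c].get(plan, "n/a") for c in competitors]
        rows.append(f"| {plan} | " + " | ".join(cells) + " |")
    return "\n".join(rows)
```

Missing plans render as "n/a" so gaps in a competitor's lineup are visible rather than silently dropped.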
Persistent Browser Sessions
For workflows requiring authentication or state:
{
"mcpServers": {
"playwright-authenticated": {
"command": "npx",
"args": [
"-y", "@playwright/mcp-server",
"--user-data-dir", "/path/to/profile",
"--headed"
]
}
}
}
Using a persistent user data directory means:
- Login sessions persist between MCP server restarts
- Cookies and local storage are preserved
- Browser extensions remain installed
- Form autofill data is available
Intercepting Network Requests
Advanced browser MCP servers can intercept and modify network requests:
Use Cases:
- Block analytics scripts for cleaner page content
- Mock API responses for testing
- Capture API calls to understand app behavior
- Modify request headers for authentication
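The blocking decision itself is just a predicate over each request's URL and resource type. A sketch — the blocked hosts are illustrative; with Playwright this predicate would run inside a page.route('**/*', ...) handler, calling route.abort() or route.continue_() based on the result:

```python
# Resource types and third-party hosts to drop (illustrative lists)
BLOCKED_TYPES = {"image", "font", "media"}
BLOCKED_HOSTS = ("google-analytics.com", "doubleclick.net")

def should_abort(url: str, resource_type: str) -> bool:
    """Decide whether a request should be aborted before it is sent."""
    if resource_type in BLOCKED_TYPES:
        return True
    return any(host in url for host in BLOCKED_HOSTS)
```

Keeping the policy as a pure function makes it easy to unit-test independently of any running browser.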
Integration with Other MCP Servers
Browser automation servers become most powerful when combined with other MCP servers:
Browser + Filesystem: Screenshot Documentation
User: "Take screenshots of all pages in our app for documentation"
Claude's workflow:
1. (Filesystem) Read sitemap.json or route configuration
2. For each page:
a. (Browser) navigate(page_url)
b. (Browser) wait_for(content_loaded)
c. (Browser) screenshot(page_name.png)
3. (Filesystem) Write screenshots to docs/screenshots/
4. (Filesystem) Generate an index.md linking all screenshots
Browser + Database: Dynamic Content Testing
User: "Verify that the user dashboard displays correct data
for test accounts"
Claude's workflow:
1. (Database) Query test account data
2. (Browser) Navigate to login page
3. (Browser) Log in as test user
4. (Browser) Navigate to dashboard
5. (Browser) Extract displayed values
6. Compare displayed values against database values
7. Report any discrepancies
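Step 6 — the comparison — is plain data checking once both sides are extracted. A sketch, assuming each side has been reduced to a field-to-value dict:

```python
def find_discrepancies(expected: dict, displayed: dict) -> list:
    """Compare values read from the database against values scraped
    from the dashboard, reporting fields that differ or are missing."""
    issues = []
    for field, want in expected.items():
        got = displayed.get(field)
        if got is None:
            issues.append(f"{field}: missing from page")
        elif str(got) != str(want):
            # Compare as strings: scraped values arrive as text
            issues.append(f"{field}: expected {want!r}, page shows {got!r}")
    return issues
```

An empty result means the dashboard matches the database for every checked field.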
Browser + GitHub: Visual Regression in CI
Automated visual regression workflow:
1. (GitHub) get_pull_request_files(pr) — identify changed components
2. (Browser) Navigate to production URL
3. (Browser) screenshot() — baseline screenshot
4. (Browser) Navigate to staging/preview URL
5. (Browser) screenshot() — new version screenshot
6. Compare screenshots, identify visual changes
7. (GitHub) create_pull_request_review() — report findings
Responsible Web Automation
Ethical Considerations
When using browser automation MCP servers for web interaction:
- Respect robots.txt: Check and follow site-specific automation policies
- Rate limiting: Do not overwhelm target websites with rapid requests
- Terms of service: Ensure your use case complies with the target site's ToS
- Data privacy: Be careful with personal data encountered during browsing
- Attribution: When extracting content, respect copyright and licensing
Legal Considerations
| Use Case | Generally Acceptable | Requires Caution |
|---|---|---|
| Testing your own sites | Yes | N/A |
| Public data extraction | Usually | Check ToS |
| Price comparison | Depends on jurisdiction | Check competitor ToS |
| Academic research | Generally | IRB approval may be needed |
| Login to services you own | Yes | Use dedicated test accounts |
| Automated form submission | For your own services | Get permission for third-party |
Anti-Detection and Ethics
Some websites employ bot detection. Although browser MCP servers can sometimes bypass such measures, there are clear ethical lines:
- Do not bypass bot detection on sites where you are not authorized
- Do use your own authenticated sessions for services you have legitimate access to
- Do use browser automation for testing your own applications
- Do not use browser automation for unauthorized scraping at scale
Building Custom Browser MCP Servers
For specialized browser automation needs:
from mcp.server import Server
from mcp.types import Tool, TextContent
from playwright.async_api import async_playwright

app = Server("custom-browser")

@app.list_tools()
async def list_tools():
    return [
        Tool(
            name="check_website_status",
            description="Check if a website is up and responding correctly",
            inputSchema={
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "URL to check"},
                    "expected_text": {
                        "type": "string",
                        "description": "Text expected on the page"
                    }
                },
                "required": ["url"]
            }
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    if name != "check_website_status":
        raise ValueError(f"Unknown tool: {name}")
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        try:
            response = await page.goto(arguments["url"], timeout=30000)
            # goto can return None (e.g. same-document navigation)
            status = response.status if response else "unknown"
            title = await page.title()
            text_found = True
            if "expected_text" in arguments:
                content = await page.text_content("body") or ""
                text_found = arguments["expected_text"] in content
            return [TextContent(
                type="text",
                text=f"Status: {status}\n"
                     f"Title: {title}\n"
                     f"Expected text found: {text_found}"
            )]
        except Exception as e:
            return [TextContent(type="text", text=f"Error: {e}")]
        finally:
            await browser.close()
Monitoring and Health Check Workflows
Browser MCP servers enable powerful website monitoring use cases when combined with scheduling:
Uptime and Availability Monitoring
Scheduled every 15 minutes:
Agent workflow:
1. For each monitored URL:
a. navigate(url)
b. Check response status and page title
c. Verify expected content is present
d. Measure page load time
2. If any check fails:
a. screenshot() — capture the error state
b. Retry once after 30 seconds
c. If still failing, trigger alert via Slack/email
3. Log results for trend analysis
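The retry-then-alert policy in step 2 can be isolated from the browser work so it is easy to test. A sketch with the check and alert functions injected (in the real workflow, `check` would wrap the navigate-and-verify steps and `alert` the Slack/email hook):

```python
import time

def monitor(url, check, alert, retry_delay=30):
    """Run one health check with a single retry before alerting.

    `check(url)` returns True when the page looks healthy;
    `alert(message)` notifies the on-call channel. Both are
    injected so the policy itself is testable without a browser.
    """
    if check(url):
        return "ok"
    time.sleep(retry_delay)          # back off before the retry
    if check(url):
        return "recovered"           # transient failure, no alert
    alert(f"{url} failed twice; see attached screenshot")
    return "down"
```

Separating policy from mechanism this way also makes it trivial to tune the retry count or delay later.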
Visual Monitoring for Content Changes
User: "Monitor our competitor's pricing page for changes"
Claude's workflow:
1. navigate(competitor_pricing_url)
2. get_text() — extract current pricing data
3. Compare against the last saved snapshot
4. If changes detected:
a. screenshot() — capture new state
b. Generate a diff report highlighting what changed
c. Notify via Slack with the summary
5. Save current snapshot for future comparison
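Step 4b's diff report can be generated with difflib once both snapshots have been reduced to plain text:

```python
import difflib

def pricing_diff(old: str, new: str) -> str:
    """Produce a human-readable diff between the saved snapshot and
    the freshly extracted pricing text, suitable for a notification."""
    lines = difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="previous", tofile="current", lineterm="",
    )
    return "\n".join(lines)
```

An empty string means nothing changed, so the agent can skip the notification step entirely.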
SSL Certificate and Security Monitoring
User: "Check SSL certificates for all our domains"
Claude's workflow:
1. For each domain:
a. navigate("https://domain.com")
b. evaluate("location.protocol") — verify the page is served over HTTPS
c. Check for mixed content warnings
d. Verify no security headers are missing
2. Generate a security report:
- Certificate expiration dates
- Security header compliance (HSTS, CSP, etc.)
- Mixed content issues
- Redirect chain analysis
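The header-compliance part of that report reduces to set membership over the response headers. A sketch — the required set below is illustrative, not exhaustive:

```python
# Illustrative subset of recommended security headers
REQUIRED_HEADERS = {
    "strict-transport-security": "HSTS",
    "content-security-policy": "CSP",
    "x-content-type-options": "MIME sniffing protection",
}

def missing_security_headers(headers: dict) -> list:
    """Report which recommended security headers a response lacks.
    Header names are compared case-insensitively, as HTTP requires."""
    present = {name.lower() for name in headers}
    return [label for name, label in REQUIRED_HEADERS.items()
            if name not in present]
```

The headers dict would come from the response metadata exposed by the browser or an HTTP fetch; the audit logic is the same either way.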
Browser Profile Management
Managing Multiple Authenticated Sessions
For workflows requiring access to multiple authenticated services:
{
"mcpServers": {
"playwright-internal": {
"command": "npx",
"args": [
"-y", "@playwright/mcp-server",
"--user-data-dir", "/profiles/internal-tools"
]
},
"playwright-external": {
"command": "npx",
"args": [
"-y", "@playwright/mcp-server",
"--user-data-dir", "/profiles/external-sites"
]
}
}
}
This separation ensures that internal authentication tokens are never exposed to external websites, and vice versa.
Cookie and Session Management Best Practices
| Practice | Description | Reason |
|---|---|---|
| Use separate profiles per security context | Different user-data-dir paths | Prevents cookie leakage |
| Clear sessions periodically | Delete profile and re-authenticate | Reduces stale session risk |
| Never share profiles between users | Each user gets their own profile path | Maintains access isolation |
| Use headless for production | Only use headed for debugging | Reduces attack surface |
| Set session timeouts | Configure browser to expire sessions | Limits exposure window |
Scaling Browser Automation
Cloud-Based Scaling with Browserbase
For workflows requiring many concurrent browser sessions:
Architecture for scaled browser automation:
┌──────────────┐ ┌───────────────┐ ┌──────────────┐
│ AI Client │────▶│ Browserbase │────▶│ Cloud │
│ (Claude) │ │ MCP Server │ │ Browsers │
│ │ │ │ │ (100+ inst) │
└──────────────┘ └───────────────┘ └──────────────┘
Key advantages of cloud-based scaling:
- No local resource constraints
- Geographic distribution for location-specific testing
- Built-in proxy rotation for scraping use cases
- Session recording and replay for debugging
- Automatic browser updates and patch management
Resource Management Guidelines
| Deployment Size | Concurrent Browsers | RAM Required | CPU Required |
|---|---|---|---|
| Single developer | 1-2 | 512 MB | 1 core |
| Small team | 3-5 | 2 GB | 2 cores |
| CI/CD pipeline | 5-10 | 4 GB | 4 cores |
| Production monitoring | 10-50 | 16 GB | 8 cores |
| Large-scale scraping | 50+ | Cloud (Browserbase) | Cloud |
What to Read Next
- MCP for AI Agents -- Build autonomous workflows with browser capabilities
- MCP in Software Development -- AI-powered testing and development
- Developer Tools MCP Servers -- Code execution and CI/CD servers
- Browse Browser Automation Servers -- Find browser MCP servers in our directory
Frequently Asked Questions
What is the Playwright MCP server?
The Playwright MCP server is a Model Context Protocol server that gives AI applications the ability to control web browsers through Playwright's automation framework. It exposes tools for navigating to URLs, clicking elements, filling forms, taking screenshots, extracting page content, and running automated test sequences. This enables AI agents to interact with web applications just as a human user would.
How do browser MCP servers differ from web scraping?
Browser MCP servers provide full browser automation — they render JavaScript, handle dynamic content, interact with SPAs, fill forms, and simulate user actions. Traditional web scraping typically only fetches and parses static HTML. Browser servers use real browser engines (Chromium, Firefox, WebKit) and can handle authentication flows, cookie management, and complex user interactions that simple HTTP requests cannot.
Can AI agents browse the internet through MCP?
Yes. Browser automation MCP servers give AI agents the ability to navigate websites, read page content, click links, fill out forms, and extract information. The AI receives structured representations of web pages (accessibility trees or simplified DOM) rather than raw HTML, making it easier to understand and interact with web content. However, you should implement guardrails to prevent accessing inappropriate or unauthorized content.
Is the Playwright MCP server safe to use?
The Playwright MCP server is safe when configured properly. It runs a real browser that can be sandboxed (headless mode, restricted network access, isolated profile). Key safety measures include: running in headless mode to prevent desktop interference, restricting navigation to allowed domains, disabling file downloads, and requiring user confirmation for sensitive actions like form submissions.
What is the difference between Playwright MCP and Puppeteer MCP?
Playwright MCP supports multiple browser engines (Chromium, Firefox, WebKit) and is maintained by Microsoft. Puppeteer MCP is specifically for Chrome/Chromium and is maintained by Google. Playwright generally offers more features (auto-waiting, better mobile emulation, parallel execution), while Puppeteer has a simpler API and lighter resource usage. Both provide similar core functionality for MCP-based browser automation.
Can I use browser MCP servers for automated testing?
Yes. Browser automation MCP servers are excellent for AI-assisted testing workflows. The AI can navigate your application, interact with UI elements, verify expected behavior, and report issues. Common patterns include exploratory testing (AI discovers and tests flows autonomously), regression testing (AI runs predefined scenarios), and accessibility testing (AI evaluates page accessibility).
How do browser MCP servers handle authentication?
Browser MCP servers handle authentication through several methods: (1) the AI fills in login forms with provided credentials, (2) pre-authenticated browser profiles with saved cookies/sessions, (3) injected authentication tokens or cookies before navigation, and (4) OAuth flows where the AI navigates the authorization process. For security, use pre-authenticated profiles rather than passing passwords through the AI.
What are the resource requirements for browser MCP servers?
Browser automation servers are more resource-intensive than other MCP servers because they run actual browser instances. Expect each browser instance to use 100-300 MB of RAM. Headless mode uses less resources than headed mode. For servers that need multiple concurrent browser instances, allocate at least 512 MB RAM per instance. Consider closing browser tabs/pages when not in use to free resources.
Related Guides
MCP servers for developer tools — code execution sandboxes, Figma design-to-code, linting, testing frameworks, and CI/CD integrations.
How MCP enables powerful AI agents — tool selection, multi-step workflows, agent architectures, and real-world examples of autonomous AI systems.
How MCP transforms the software development lifecycle — code generation, review, testing, CI/CD, and deployment with AI-powered MCP workflows.