MCP Server
Connect WebPeel to Claude Desktop, Cursor, Windsurf, OpenClaw, and other MCP-compatible tools. Give your AI assistant powerful web scraping capabilities.
What is MCP?
The Model Context Protocol (MCP) is an open standard developed by Anthropic that lets AI assistants access external tools and data sources. With WebPeel's MCP server, your AI can:
- Fetch and extract content from any URL
- Search the web for current information
- Crawl entire websites to build knowledge bases
- Extract structured data with CSS selectors or AI prompts
- Monitor pages for changes
- Batch process multiple URLs
Quick Start
Connect to WebPeel via MCP in two ways:
- Hosted MCP endpoint — one URL, Streamable HTTP transport (recommended)
- Local MCP server — run via npx and connect over stdio
Hosted MCP endpoint (recommended)
Use the hosted endpoint at https://api.webpeel.dev/mcp. Any MCP client that supports Streamable HTTP can connect with just:
{
  "url": "https://api.webpeel.dev/mcp",
  "headers": {
    "Authorization": "Bearer <key>"
  }
}
Stateless: no session management required.
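Under the hood, a Streamable HTTP client POSTs standard MCP JSON-RPC 2.0 messages to that URL. The sketch below shows plausible handshake and tool-call payloads; the client name, protocol version string, and example arguments are illustrative assumptions, not values WebPeel requires.

```python
import json

# Sketch of the JSON-RPC 2.0 messages an MCP client POSTs to the hosted
# endpoint. Header and URL come from the config above; the message framing
# follows the standard MCP handshake. Values marked below are illustrative.
ENDPOINT = "https://api.webpeel.dev/mcp"
HEADERS = {
    "Authorization": "Bearer <key>",
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}

# First request: initialize (protocolVersion may differ by client/SDK)
initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},  # illustrative
    },
}

# Later request: invoke a WebPeel tool by name
call_fetch = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "webpeel_fetch",
        "arguments": {"url": "https://example.com", "format": "markdown"},
    },
}

body = json.dumps(call_fetch)
```

Because the endpoint is stateless, each request carries its own Authorization header and no session ID needs to be tracked between calls.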
Local MCP server
Run WebPeel's MCP server locally via npx — no installation required.
Start the Server
npx webpeel mcp
The local server runs on stdio and is automatically managed by your MCP client.
Configuration
Add WebPeel to your MCP client's configuration file:
Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%/Claude/claude_desktop_config.json (Windows):
{
  "mcpServers": {
    "webpeel": {
      "command": "npx",
      "args": ["webpeel", "mcp"]
    }
  }
}
Restart Claude Desktop to load the server.
Cursor
Edit ~/.cursor/mcp.json:
{
  "mcpServers": {
    "webpeel": {
      "command": "npx",
      "args": ["webpeel", "mcp"]
    }
  }
}
Restart Cursor to apply changes.
Cline
Open VS Code Settings (JSON) and add to your cline.mcpServers config, or edit ~/.vscode/cline_mcp_settings.json:
{
  "mcpServers": {
    "webpeel": {
      "command": "npx",
      "args": ["webpeel", "mcp"]
    }
  }
}
Restart Cline or reload the VS Code window to load the server.
Windsurf
Edit ~/.windsurf/mcp_server_config.json:
{
  "mcpServers": {
    "webpeel": {
      "command": "npx",
      "args": ["webpeel", "mcp"]
    }
  }
}
Restart Windsurf to load the server.
OpenClaw
Edit ~/.openclaw/mcp.json:
{
  "mcpServers": {
    "webpeel": {
      "command": "npx",
      "args": ["webpeel", "mcp"]
    }
  }
}
Restart OpenClaw or reload the MCP configuration.
Claude Code (CLI)
For Claude Code, use the one-liner:
claude mcp add webpeel -- npx -y webpeel mcp
Using an API Key
To use WebPeel's hosted API instead of local scraping, set the WEBPEEL_API_KEY environment variable:
{
  "mcpServers": {
    "webpeel": {
      "command": "npx",
      "args": ["webpeel", "mcp"],
      "env": {
        "WEBPEEL_API_KEY": "your-api-key-here"
      }
    }
  }
}
Smithery Installation
Install via Smithery registry (one-command setup):
npx @smithery/cli install @webpeel/mcp
Available Tools
WebPeel provides 11 core MCP tools (supported by the hosted MCP endpoint): webpeel_fetch, webpeel_search, webpeel_crawl, webpeel_map, webpeel_extract, webpeel_batch, webpeel_agent, webpeel_screenshot, webpeel_brand, webpeel_summarize, webpeel_answer.
The local MCP server may expose additional tools depending on your WebPeel version.
webpeel_fetch
Fetch a URL and return clean, AI-ready markdown content.
Parameters
- url (string, required) — The URL to fetch
- format (string) — Output format: markdown, text, or html (default: markdown)
- render (boolean) — Force browser rendering for JavaScript-heavy sites (default: false)
- stealth (boolean) — Use stealth mode to bypass bot detection (default: false)
- wait (number) — Milliseconds to wait for dynamic content (only with render=true)
- selector (string) — CSS selector to extract specific content (e.g., "article")
- exclude (array) — CSS selectors to exclude (e.g., [".sidebar", ".ads"])
- includeTags (array) — Only include these HTML tags/classes
- excludeTags (array) — Remove these HTML tags/classes
- images (boolean) — Extract image URLs (default: false)
- maxTokens (number) — Maximum token count (truncates if exceeded)
- screenshot (boolean) — Capture page screenshot (returns base64 PNG)
- screenshotFullPage (boolean) — Full-page screenshot (default: viewport only)
- location (string) — ISO country code for geo-targeting (e.g., "US", "DE")
- headers (object) — Custom HTTP headers
- actions (array) — Page actions to execute before extraction
- extract (object) — Structured data extraction options
Example
// Ask your AI assistant:
"Fetch https://example.com and extract the main article"
// With stealth mode:
"Fetch https://protected-site.com with stealth mode enabled"
// Extract specific content:
"Fetch https://news.com and only show me the article content, excluding nav and ads"
webpeel_search
Search the web (DuckDuckGo by default, or Brave Search with a BYOK API key) and return results with titles, URLs, and snippets.
Parameters
- query (string, required) — Search query
- count (number) — Number of results (1-10, default: 5)
- provider (string) — duckduckgo (default) or brave
- searchApiKey (string) — Brave Search API key (required when provider="brave")
Example
// Ask your AI assistant:
"Search for 'docker containers best practices'"
// Then fetch the top result:
"Fetch the first result and summarize it"
webpeel_crawl
Crawl a website starting from a URL, following links and extracting content. Perfect for documentation sites or building knowledge bases.
Parameters
- url (string, required) — Starting URL
- maxPages (number) — Max pages to crawl (1-100, default: 10)
- maxDepth (number) — Max crawl depth (1-5, default: 2)
- allowedDomains (array) — Only crawl these domains
- excludePatterns (array) — Exclude URLs matching these regex patterns
- respectRobotsTxt (boolean) — Respect robots.txt (default: true)
- rateLimitMs (number) — Rate limit between requests (default: 1000ms)
- sitemapFirst (boolean) — Discover URLs via sitemap first (default: false)
- render (boolean) — Use browser rendering for all pages
- stealth (boolean) — Use stealth mode for all pages
Example
// Ask your AI assistant:
"Crawl https://docs.example.com up to 50 pages and build a summary"
webpeel_map
Discover all URLs on a domain using sitemap.xml and link crawling. Returns URLs without fetching content.
Parameters
- url (string, required) — Starting URL or domain
- maxUrls (number) — Max URLs to discover (1-10000, default: 5000)
- includePatterns (array) — Only include URLs matching these patterns (regex)
- excludePatterns (array) — Exclude URLs matching these patterns (regex)
Example
// Ask your AI assistant:
"Map all URLs on https://example.com/docs/"
webpeel_extract
Extract structured data from a webpage using CSS selectors, JSON schema, or AI-powered extraction.
Parameters
- url (string, required) — URL to extract from
- selectors (object) — Map field names to CSS selectors (e.g., {"title": "h1", "price": ".price"})
- schema (object) — JSON schema describing expected output
- prompt (string) — Natural language prompt for AI extraction (requires llmApiKey)
- llmApiKey (string) — API key for LLM-powered extraction
- llmModel (string) — LLM model (default: gpt-4o-mini)
- llmBaseUrl (string) — LLM API base URL (default: OpenAI)
- render (boolean) — Use browser rendering
Example
// Ask your AI assistant:
"Extract the product title, price, and rating from https://shop.com/product/123"
// With AI extraction:
"Extract all the features mentioned on https://example.com/product as a bullet list"
webpeel_batch
Fetch multiple URLs in batch with concurrency control.
Parameters
- urls (array, required) — Array of URLs to fetch
- concurrency (number) — Max concurrent fetches (1-10, default: 3)
- render (boolean) — Use browser rendering for all URLs
- format (string) — Output format for all URLs
- selector (string) — CSS selector for content extraction
Example
// Ask your AI assistant:
"Fetch these 5 URLs and compare their main content: [url1, url2, url3, url4, url5]"
webpeel_agent
Run WebPeel’s research agent: search → fetch → analyze → answer. Supports streaming and structured output via JSON Schema.
Parameters
- prompt (string, required) — Research task prompt
- llmApiKey (string, required) — LLM API key (BYOK)
- llmModel (string) — Model name (optional)
- depth (string) — basic or thorough
- topic (string) — general, news, technical, academic
- maxSources (number) — Max sources to fetch (1-20)
- outputSchema (object) — JSON Schema for structured output
- stream (boolean) — Stream progress + chunks via SSE
Example
// Ask your AI assistant:
"Use webpeel_agent to research the current best practices for SSE in Node.js.\nDepth: thorough. Topic: technical. Return JSON with fields: summary, pitfalls, and examples."
Additional tools (local MCP server only, if enabled):
webpeel_brand
Extract branding and design system from a URL. Returns colors, fonts, and visual identity.
Parameters
- url (string, required) — URL to extract branding from
- render (boolean) — Use browser rendering
Example
// Ask your AI assistant:
"What colors and fonts does https://stripe.com use?"
webpeel_change_track
Track changes on a URL by generating a content fingerprint.
Parameters
- url (string, required) — URL to track
- render (boolean) — Use browser rendering
Example
// Ask your AI assistant:
"Generate a fingerprint for https://example.com so I can detect changes later"
webpeel_summarize
Generate an AI-powered summary of a webpage using an LLM.
Parameters
- url (string, required) — URL to summarize
- llmApiKey (string, required) — API key for LLM
- prompt (string) — Custom summary prompt (default: "Summarize this webpage in 2-3 sentences.")
- llmModel (string) — LLM model (default: gpt-4o-mini)
- llmBaseUrl (string) — LLM API base URL
- render (boolean) — Use browser rendering
Example
// Ask your AI assistant:
"Summarize https://longform-article.com in a few sentences"
Usage Tips
When to Use Render Mode
- JavaScript-heavy sites (SPAs, React apps)
- Content loaded dynamically after page load
- Sites with lazy-loading images or infinite scroll
When to Use Stealth Mode
- Sites protected by Cloudflare or similar bot detection
- Pages that block headless browsers
- Sites with aggressive rate limiting
Best Practices
- Start simple — Use basic fetch first, escalate to render/stealth only if needed
- Use selectors — Extract only what you need with selector and exclude
- Set maxTokens — Prevent large pages from overwhelming your context window
- Batch similar requests — Use webpeel_batch for multiple URLs from the same site
- Respect rate limits — Add delays when crawling large sites
Troubleshooting
Server not starting
Check your MCP client's logs. Common issues:
- npx not in PATH → Install Node.js v18+
- WebPeel package not found → Run npx webpeel mcp manually first
- JSON syntax error → Validate your config file
Tools not appearing
- Restart your MCP client
- Check the config file syntax
- Verify the server starts: run npx webpeel mcp manually
Fetch timing out
- Use render=false for simple HTML pages (faster)
- Increase wait time for slow-loading dynamic content
- Check if the site blocks automated access (try stealth=true)