
Research Agents

Autonomous web research that searches, fetches, and synthesizes information across multiple sources. Give the agent a prompt and let it do the work.

Two Modes

LLM Mode

Pass llmApiKey for full autonomous research. The agent searches, reads pages, and synthesizes a structured answer using your LLM.

Requires: prompt + llmApiKey

LLM-Free Mode

No LLM needed. Uses BM25 scoring to extract answers from fetched pages. Fast and free, ideal for factual lookups.

Requires: urls or search (no llmApiKey)
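BM25 ranks each fetched page against the query by weighting term frequency against term rarity across the fetched set. As a rough illustration of the idea only (a sketch, not WebPeel's actual implementation; k1 and b are the common default parameters):

```javascript
// Minimal BM25 scorer: higher score = page text matches the query better.
// Documents are pre-tokenized into word arrays for simplicity.
function bm25Score(queryTerms, doc, docs, k1 = 1.5, b = 0.75) {
  const avgLen = docs.reduce((sum, d) => sum + d.length, 0) / docs.length;
  let score = 0;
  for (const term of queryTerms) {
    const tf = doc.filter((w) => w === term).length;      // term frequency in this doc
    if (tf === 0) continue;
    const df = docs.filter((d) => d.includes(term)).length; // docs containing the term
    const idf = Math.log((docs.length - df + 0.5) / (df + 0.5) + 1);
    score += idf * (tf * (k1 + 1)) /
      (tf + k1 * (1 - b + (b * doc.length) / avgLen));
  }
  return score;
}

const docs = [
  'the free tier includes 100 requests per month'.split(' '),
  'webpeel supports javascript rendering'.split(' '),
];
const query = 'free tier limit'.split(' ');
const scores = docs.map((d) => bm25Score(query, d, docs));
// The page mentioning the free tier scores highest.
```

This is why LLM-free mode works best for factual lookups: the answer only needs to appear verbatim (or nearly so) on one of the fetched pages.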

Endpoints

POST /v1/agent (Auth required)

Run synchronous research (blocks until complete), or set "stream": true for SSE streaming.

POST /v1/agent/async (Auth required)

Submit an async agent job. Returns a job ID immediately; poll GET /v1/agent/:id for results. LLM mode only.

GET /v1/agent/:id (Auth required)

Poll async agent job status and retrieve results.

DELETE /v1/agent/:id (Auth required)

Cancel a running async agent job.

Request Parameters — POST /v1/agent

LLM Mode (with llmApiKey)

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Required | Research question or task. The agent will search and synthesize an answer. |
| llmApiKey | string | Required | Your OpenAI-compatible API key (BYOK). |
| llmApiBase | string | Optional | Custom LLM base URL for OpenAI-compatible endpoints. |
| llmModel | string | Optional | Model to use for synthesis (e.g. gpt-4o, gpt-4o-mini). |
| urls | string[] | Optional | Seed URLs for the agent to start from. |
| maxSources | number | Optional | Max pages to visit. Range: 1–20. Default: 5. |
| depth | string | Optional | Research depth: basic or thorough. Default: basic. |
| topic | string | Optional | Topic hint: general, news, technical, or academic. |
| outputSchema | object | Optional | JSON Schema for the output shape. Forces a structured response. |
| stream | boolean | Optional | Set true to enable SSE streaming of progress events. |
| webhook | string | Optional | Webhook URL for async mode events (/v1/agent/async only). |

LLM-Free Mode (without llmApiKey)

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| urls | string[] | Optional* | URLs to fetch and extract from. Required if no search. |
| search | string | Optional* | Search query. WebPeel will search and fetch result pages. Required if no urls. |
| prompt | string | Optional | Question to answer from the fetched pages (BM25 scoring). |
| schema | object | Optional | Field names to extract from each page using BM25. |
| maxResults | number | Optional | Max pages to process. Default: 5, max 20. |
| stream | boolean | Optional | Stream results as each URL completes. |

*At least one of urls or search must be provided.

Response

Synchronous (LLM Mode)

{
  "success": true,
  "answer": "WebPeel is a web scraping API that...",
  "sources": [
    { "url": "https://webpeel.dev", "title": "WebPeel", "excerpt": "..." }
  ],
  "pagesVisited": 4,
  "creditsUsed": 4
}

Synchronous (LLM-Free Mode)

{
  "success": true,
  "data": {
    "results": [
      {
        "url": "https://example.com",
        "title": "Example Domain",
        "extracted": { "price": "$29.99", "availability": "In Stock" },
        "confidence": 0.82,
        "content": "First 500 chars of page content..."
      }
    ],
    "totalSources": 3,
    "processingTimeMs": 2140
  },
  "metadata": { "requestId": "a1b2c3d4..." }
}

SSE Stream Events

data: {"type":"searching","message":"Searching for: WebPeel API"}
data: {"type":"fetching","url":"https://webpeel.dev","message":"Reading page..."}
data: {"type":"result","data":{...}}
data: {"type":"done","data":{"answer":"...","sources":[...],"pagesVisited":4}}
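Each event arrives as a data: line carrying a JSON payload. A minimal parser for these lines (an illustrative sketch assuming the framing shown above, not an official client) could be:

```javascript
// Parse one SSE line from the agent stream.
// Returns the event object, or null for blank/keep-alive/comment lines.
function parseAgentEvent(line) {
  if (!line.startsWith('data: ')) return null;
  return JSON.parse(line.slice('data: '.length));
}

const evt = parseAgentEvent(
  'data: {"type":"fetching","url":"https://webpeel.dev","message":"Reading page..."}'
);
// evt.type === 'fetching', evt.url === 'https://webpeel.dev'
```

Dispatch on the type field (searching, fetching, result, done) to drive progress UI; the done event carries the final answer and sources.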

Examples

# LLM mode: full autonomous research
curl -X POST https://api.webpeel.dev/v1/agent \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "What are the latest pricing changes for WebPeel?",
    "llmApiKey": "sk-...",
    "llmModel": "gpt-4o-mini",
    "depth": "thorough",
    "maxSources": 8
  }'

# No LLM key — uses BM25 extraction
curl -X POST https://api.webpeel.dev/v1/agent \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "search": "WebPeel API pricing 2024",
    "prompt": "What is the free tier limit?",
    "maxResults": 5
  }'

# Submit async job
curl -X POST https://api.webpeel.dev/v1/agent/async \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Summarize the top AI news from the past week",
    "llmApiKey": "sk-...",
    "depth": "thorough",
    "topic": "news",
    "maxSources": 15
  }'

# Poll for result
curl https://api.webpeel.dev/v1/agent/JOB_ID \
  -H "Authorization: Bearer YOUR_API_KEY"
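The poll request above can be wrapped in a small retry loop. A JavaScript sketch follows; the status field names ("completed", "failed") and the injectable fetchFn option are assumptions for illustration, not guaranteed by the API:

```javascript
// Poll an async agent job until it reaches a terminal state or we give up.
// fetchFn is injectable so the loop can be exercised without network access.
async function pollAgentJob(jobId, apiKey,
    { fetchFn = fetch, intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetchFn(`https://api.webpeel.dev/v1/agent/${jobId}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    const job = await res.json();
    if (job.status === 'completed' || job.status === 'failed') return job;
    await new Promise((resolve) => setTimeout(resolve, intervalMs)); // wait, then re-poll
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} polls`);
}
```

For long-running jobs, the webhook parameter on /v1/agent/async avoids polling entirely.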
const res = await fetch('https://api.webpeel.dev/v1/agent', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.WEBPEEL_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    prompt: 'What are the main features of WebPeel?',
    llmApiKey: process.env.OPENAI_API_KEY,
    llmModel: 'gpt-4o-mini',
    maxSources: 5,
    outputSchema: {
      type: 'object',
      properties: {
        features: { type: 'array', items: { type: 'string' } },
        pricing: { type: 'string' },
        freeTier: { type: 'string' },
      },
    },
  }),
});
const result = await res.json();
console.log(result.answer);  // or result.data for structured output

Limits