Deep Research
Turn any question into a comprehensive, cited research report. WebPeel's deep research agent automatically decomposes your question, searches multiple sources across multiple rounds, and synthesizes a verified report, all in one API call.
Deep research requires an LLM, configured via the llm field. Supported providers: openai, anthropic, google, ollama, cerebras, cloudflare. If your server has a default LLM configured, the llm field is optional.
How It Works
1. Query Decomposition: The LLM breaks your question into focused sub-queries to maximize search coverage.
2. Multi-Round Search: Each sub-query runs against DuckDuckGo (or Brave), deduplicating URLs across rounds. Up to 5 rounds, 30 sources total.
3. Source Fetching & Scoring: Top sources are fetched in parallel. Each source is scored by credibility tier (official ★★★, verified ★★☆, general ★☆☆) and relevance.
4. Gap Detection: After each round, the LLM identifies gaps in coverage and generates new sub-queries to fill them.
5. Report Synthesis: The LLM writes a structured report (Executive Summary, Key Findings, Detailed Analysis, Conclusion) with inline citations to every source used.
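The research loop above can be sketched in a few lines of Python. Everything here is illustrative: the stub functions stand in for real LLM and search calls, and names like decompose and find_gaps are not part of the WebPeel API.

```python
# Illustrative sketch of the deep-research loop: decompose, search,
# dedupe, detect gaps, stop early when coverage is complete.

def decompose(question):
    # The real agent asks an LLM for focused sub-queries
    return [f"{question} overview", f"{question} limitations"]

def search(query, seen):
    # The real agent queries DuckDuckGo/Brave; URLs are deduped across rounds
    results = [f"https://example.com/{abs(hash(query)) % 1000}"]
    return [url for url in results if url not in seen]

def find_gaps(question, sources):
    # The real agent asks an LLM what the sources still fail to cover
    return [] if len(sources) >= 3 else [f"{question} recent developments"]

def deep_research(question, max_rounds=3, max_sources=20):
    seen, sources = set(), []
    queries = decompose(question)
    for _ in range(max_rounds):
        for query in queries:
            for url in search(query, seen):
                if len(sources) >= max_sources:
                    break
                seen.add(url)
                sources.append(url)
        queries = find_gaps(question, sources)
        if not queries:  # no gaps left: stop before max_rounds
            break
    return sources  # the real agent now synthesizes the cited report

print(len(deep_research("quantum computing")))
```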
Endpoint
POST /v1/deep-research

Multi-step research agent. Returns a comprehensive cited report. Supports SSE streaming for real-time progress.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| question | string | Required | The research question (max 5000 characters). |
| llm | object | BYOK | LLM configuration object. Required if no server-side default is configured. See BYOK Setup below. |
| maxRounds | number | Optional | Research rounds (1β5). Default: 3. More rounds = deeper coverage, more time. |
| maxSources | number | Optional | Maximum sources to consider (5β30). Default: 20. |
| stream | boolean | Optional | Enable SSE streaming for real-time progress and incremental report text. Default: false. |
LLM Object (llm)
| Field | Type | Description |
|---|---|---|
| provider | string | LLM provider: openai, anthropic, google, ollama, cerebras, cloudflare. Default: openai. |
| apiKey | string | Your API key for the selected provider. Not required for ollama (local). |
| model | string | Model name override (e.g. gpt-4o-mini, claude-3-5-haiku-20241022, gemini-2.0-flash). |
| endpoint | string | Custom API endpoint URL. Useful for Ollama (http://localhost:11434) or OpenAI-compatible proxies. |
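As an illustration, an llm object pointing at a local Ollama instance might look like the fragment below (the model name is an example; any model pulled into your Ollama install works):

```json
{
  "llm": {
    "provider": "ollama",
    "model": "llama3.1",
    "endpoint": "http://localhost:11434"
  }
}
```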
Response (Non-Streaming)
{
"success": true,
"report": "## Executive Summary\n\nBased on 14 sources across 3 research rounds, quantum computing has achieved significant milestones in 2024β2025...\n\n## Key Findings\n\n1. **Error correction breakthroughs** β Google's Willow chip demonstrated below-threshold error correction for the first time [1][3]\n2. **Commercial availability** β IBM, Google, and IonQ now offer cloud access to 100+ qubit processors [2]\n3. **Timeline revised** β Most experts now project fault-tolerant quantum computers by 2030β2035 [5][7]\n\n## Detailed Analysis\n\n### Hardware Progress\nThe past 18 months saw rapid progress in qubit fidelity...\n\n## Conclusion\n\nQuantum computing is transitioning from research to early commercial use, but broad practical advantage remains 5β10 years away.\n\n**Confidence: HIGH**",
"citations": [
{
"index": 1,
"title": "Google Willow: Our New Quantum Chip",
"url": "https://blog.google/technology/research/google-willow-quantum-chip/",
"snippet": "Willow can perform a computation in under five minutes that would take today's fastest supercomputers 10 septillion years.",
"relevanceScore": 0.94
},
{
"index": 2,
"title": "IBM Quantum Network",
"url": "https://www.ibm.com/quantum/network",
"snippet": "Access IBM quantum systems with 127β1000+ qubits through the IBM Quantum Network.",
"relevanceScore": 0.87
}
],
"sourcesUsed": 14,
"roundsCompleted": 3,
"totalSearchQueries": 9,
"llmProvider": "openai",
"tokensUsed": { "input": 18420, "output": 2341 },
"elapsed": 34201
}
Response Fields
| Field | Type | Description |
|---|---|---|
| report | string | Full Markdown research report with inline citations. |
| citations | array | All cited sources with index, title, URL, snippet, and relevance score. |
| sourcesUsed | number | Total sources fetched and analyzed. |
| roundsCompleted | number | Research rounds completed (≤ maxRounds). |
| totalSearchQueries | number | Total search queries run across all rounds. |
| llmProvider | string | The LLM provider that was used. |
| tokensUsed | object | LLM tokens consumed, as { input: number, output: number }. |
| elapsed | number | Total time in milliseconds. |
SSE Streaming
Set "stream": true to receive real-time progress events via Server-Sent Events. The connection stays open while research runs; events stream as they occur.
Each event is a JSON object on a data: SSE line:
**progress** events: progress updates during research. Fields: type, message, round, data (optional extra info).
data: {"eventType":"progress","type":"decomposing","message":"Decomposing question into sub-queriesβ¦","round":0}
data: {"eventType":"progress","type":"searching","message":"Searching: quantum error correction 2024","round":0}
data: {"eventType":"progress","type":"fetching","message":"Fetching 8 sourcesβ¦","round":0}
data: {"eventType":"progress","type":"synthesizing","message":"Synthesizing report from 14 sourcesβ¦"}
**chunk** events: incremental report text as the LLM generates it. Concatenate all chunks to build the full report.
data: {"type":"chunk","text":"## Executive Summary\n\n"}
data: {"type":"chunk","text":"Based on 14 sources across 3 research rounds"}
**done** event: final event with metadata. The full report is assembled from all chunk events.
data: {"type":"done","citations":[...],"sourcesUsed":14,"roundsCompleted":3,"totalSearchQueries":9,"llmProvider":"openai","tokensUsed":{"input":18420,"output":2341},"elapsed":34201}
**error** event: emitted if research fails. Check message for details.
data: {"type":"error","message":"LLM API key rejected by provider"}
BYOK: Bring Your Own LLM Key
Deep research uses LLMs for query decomposition, gap detection, and report synthesis. You provide your own key; WebPeel never stores it.
OpenAI
"provider": "openai"
Default model: gpt-4o-mini
Best for: speed + quality balance
Anthropic
"provider": "anthropic"
Default model: claude-3-5-haiku-20241022
Best for: long-context synthesis
Google
"provider": "google"
Default model: gemini-2.0-flash
Best for: cost efficiency
Ollama (local)
"provider": "ollama"
No API key needed
Best for: privacy / self-hosted
Cerebras
"provider": "cerebras"
Default model: llama3.1-70b
Best for: ultra-fast inference
Code Examples
curl -X POST https://api.webpeel.dev/v1/deep-research \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"question": "What are the current state and limitations of quantum computing?",
"llm": {
"provider": "openai",
"apiKey": "sk-...",
"model": "gpt-4o-mini"
},
"maxRounds": 3,
"maxSources": 20
}'
# Stream progress and report text in real-time
curl -X POST https://api.webpeel.dev/v1/deep-research \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-H "Accept: text/event-stream" \
-d '{
"question": "What are the current state and limitations of quantum computing?",
"llm": { "provider": "openai", "apiKey": "sk-..." },
"stream": true
}'
# Example SSE output:
# data: {"eventType":"progress","type":"decomposing","message":"Decomposing question...","round":0}
# data: {"eventType":"progress","type":"searching","message":"Searching: quantum computing 2025","round":0}
# data: {"type":"chunk","text":"## Executive Summary\n\n"}
# data: {"type":"done","sourcesUsed":12,"roundsCompleted":3,...}
const response = await fetch('https://api.webpeel.dev/v1/deep-research', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.WEBPEEL_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
question: 'What are the current state and limitations of quantum computing?',
llm: {
provider: 'openai',
apiKey: process.env.OPENAI_API_KEY,
model: 'gpt-4o-mini',
},
maxRounds: 3,
maxSources: 20,
}),
});
const data = await response.json();
console.log(data.report); // Full markdown report
console.log(data.citations.length); // Number of cited sources
console.log(`${data.sourcesUsed} sources, ${data.roundsCompleted} rounds, ${data.elapsed}ms`);
import { createParser } from 'eventsource-parser';
const response = await fetch('https://api.webpeel.dev/v1/deep-research', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.WEBPEEL_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
question: 'What is the current state of quantum computing?',
llm: { provider: 'openai', apiKey: process.env.OPENAI_API_KEY },
stream: true,
}),
});
let report = '';
const parser = createParser((event) => {
if (event.type !== 'event') return;
const data = JSON.parse(event.data);
if (data.eventType === 'progress') {
process.stdout.write(`[${data.type}] ${data.message}\n`);
} else if (data.type === 'chunk') {
process.stdout.write(data.text);
report += data.text;
} else if (data.type === 'done') {
console.log(`\n\nDone: ${data.sourcesUsed} sources, ${data.elapsed}ms`);
console.log('Citations:', data.citations.length);
} else if (data.type === 'error') {
console.error('Error:', data.message);
}
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
parser.feed(decoder.decode(value));
}
import requests, os
response = requests.post(
'https://api.webpeel.dev/v1/deep-research',
headers={
'Authorization': f'Bearer {os.environ["WEBPEEL_API_KEY"]}',
'Content-Type': 'application/json',
},
json={
'question': 'What are the current state and limitations of quantum computing?',
'llm': {
'provider': 'openai',
'apiKey': os.environ['OPENAI_API_KEY'],
'model': 'gpt-4o-mini',
},
'maxRounds': 3,
'maxSources': 20,
},
    timeout=120,  # deep research can take 30–120 seconds
)
data = response.json()
print(data['report'])
print(f"\n{data['sourcesUsed']} sources, {data['roundsCompleted']} rounds, {data['elapsed']}ms")
print(f"Citations: {len(data['citations'])}")
for cite in data['citations']:
print(f" [{cite['index']}] {cite['title']} β {cite['url']}")
import requests, json, os
with requests.post(
'https://api.webpeel.dev/v1/deep-research',
headers={
'Authorization': f'Bearer {os.environ["WEBPEEL_API_KEY"]}',
'Content-Type': 'application/json',
},
json={
'question': 'What is the current state of quantum computing?',
'llm': {'provider': 'openai', 'apiKey': os.environ['OPENAI_API_KEY']},
'stream': True,
},
stream=True,
timeout=120,
) as resp:
report_parts = []
for line in resp.iter_lines():
if not line or not line.startswith(b'data: '):
continue
event = json.loads(line[6:])
if event.get('eventType') == 'progress':
print(f"[{event['type']}] {event['message']}")
elif event.get('type') == 'chunk':
print(event['text'], end='', flush=True)
report_parts.append(event['text'])
elif event.get('type') == 'done':
print(f"\n\nComplete: {event['sourcesUsed']} sources, {event['elapsed']}ms")
elif event.get('type') == 'error':
print(f"Error: {event['message']}")
report = ''.join(report_parts)
Rate Limits & Scopes
| Plan | Requests/hour | Max rounds | Max sources |
|---|---|---|---|
| Free | 5 | 2 | 10 |
| Starter | 20 | 3 | 20 |
| Pro | 60 | 5 | 30 |
| Enterprise | Unlimited | 5 | 30 |
Deep research requires a full or read scope API key. Create one at app.webpeel.dev/keys.
Error Codes
| Error Code | Status | Cause |
|---|---|---|
| authentication_required | 401 | Missing or invalid API key. |
| invalid_request | 400 | Missing question, question too long (>5000 chars), or invalid llm.provider. |
| llm_required | 400 | No llm config provided and no server default configured. |
| llm_auth_failed | 401 | The LLM provider rejected the API key. |
| rate_limit_exceeded | 429 | WebPeel rate limit hit. Retry after the indicated interval. |
| free_tier_limit | 429 | Free tier LLM quota exhausted. Provide your own API key to continue. |
| deep_research_failed | 500 | Internal error during research. Check hint for details. |
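When you hit rate_limit_exceeded, back off and retry rather than hammering the endpoint. A minimal sketch with requests follows; it assumes the 429 response carries a Retry-After header and falls back to exponential backoff when it is absent:

```python
import time

import requests


def post_with_retry(url, payload, api_key, max_attempts=3):
    """POST to a WebPeel endpoint, retrying on 429 responses."""
    for attempt in range(max_attempts):
        resp = requests.post(
            url,
            headers={'Authorization': f'Bearer {api_key}'},
            json=payload,
            timeout=180,
        )
        if resp.status_code != 429:
            return resp
        # Honor Retry-After if present; otherwise back off 1s, 2s, 4s, ...
        delay = int(resp.headers.get('Retry-After', 2 ** attempt))
        time.sleep(delay)
    return resp  # still 429 after max_attempts; let the caller decide
```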
Deep Research vs. /v1/research vs. /v1/ask
| | /v1/deep-research | /v1/research | /v1/ask |
|---|---|---|---|
| Output | Full cited research report (Markdown) | Concise synthesis paragraph | Single best-answer sentence |
| LLM required | Yes (BYOK) | No (self-hosted Ollama by default, BYOK optional) | No (BM25 scoring) |
| Search rounds | 1β5 (iterative, gap-filling) | 1 | 1 |
| Sources | Up to 30 | Up to 5 | Up to 5 |
| Time | 30β120 seconds | 5β15 seconds | 1β5 seconds |
| Best for | Research reports, deep dives | Quick research with AI summary | Quick facts, RAG pipelines |
See Also
- Research API: lightweight research with self-hosted LLM (no BYOK needed)
- Ask API: fast LLM-free web Q&A with BM25 source scoring
- Search API: raw search results from DuckDuckGo/Brave/Google
- Extract API: structured JSON extraction from any page
- Error Reference: all error codes and troubleshooting