Deep Research
Turn any question into a comprehensive, cited research report. WebPeel's deep research agent automatically decomposes your question, searches multiple sources across multiple rounds, and synthesizes a verified report, all in one API call.
Deep research requires an LLM, configured via the llm field. Supported providers: openai, anthropic, google, ollama, cerebras, cloudflare. If your server has a default LLM configured, the llm field is optional.
How It Works
1. Query Decomposition: The LLM breaks your question into focused sub-queries to maximize search coverage.
2. Multi-Round Search: Each sub-query runs against DuckDuckGo (or Brave), deduplicating URLs across rounds. Up to 5 rounds, 30 sources total.
3. Source Fetching & Scoring: Top sources are fetched in parallel. Each source is scored by credibility tier (official ★★★, verified ★★☆, general ★☆☆) and relevance.
4. Gap Detection: After each round, the LLM identifies gaps in coverage and generates new sub-queries to fill them.
5. Report Synthesis: The LLM writes a structured report (Executive Summary, Key Findings, Detailed Analysis, Conclusion) with inline citations to every source used.
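The research loop above can be sketched in a few lines of Python. Everything here is illustrative: the stub functions stand in for real LLM and search calls, and names like decompose and find_gaps are not part of the WebPeel API.

```python
# Illustrative sketch of the deep-research loop: decompose, search,
# dedupe, detect gaps, stop early when coverage is complete.

def decompose(question):
    # The real agent asks an LLM for focused sub-queries
    return [f"{question} overview", f"{question} limitations"]

def search(query, seen):
    # The real agent queries DuckDuckGo/Brave; URLs are deduped across rounds
    results = [f"https://example.com/{abs(hash(query)) % 1000}"]
    return [url for url in results if url not in seen]

def find_gaps(question, sources):
    # The real agent asks an LLM what the sources still fail to cover
    return [] if len(sources) >= 3 else [f"{question} recent developments"]

def deep_research(question, max_rounds=3, max_sources=20):
    seen, sources = set(), []
    queries = decompose(question)
    for _ in range(max_rounds):
        for query in queries:
            for url in search(query, seen):
                if len(sources) >= max_sources:
                    break
                seen.add(url)
                sources.append(url)
        queries = find_gaps(question, sources)
        if not queries:  # no gaps left: stop before max_rounds
            break
    return sources  # the real agent now synthesizes the cited report

print(len(deep_research("quantum computing")))
```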
Endpoint
POST /v1/deep-research

Multi-step research agent. Returns a comprehensive cited report. Supports SSE streaming for real-time progress.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| question | string | Required | The research question (max 5000 characters). |
| llm | object | BYOK | LLM configuration object. Required if no server-side default is configured. See BYOK Setup below. |
| maxRounds | number | Optional | Research rounds (1β5). Default: 3. More rounds = deeper coverage, more time. |
| maxSources | number | Optional | Maximum sources to consider (5β30). Default: 20. |
| stream | boolean | Optional | Enable SSE streaming for real-time progress and incremental report text. Default: false. |
LLM Object (llm)
| Field | Type | Description |
|---|---|---|
| provider | string | LLM provider: openai, anthropic, google, ollama, cerebras, cloudflare. Default: openai. |
| apiKey | string | Your API key for the selected provider. Not required for ollama (local). |
| model | string | Model name override (e.g. gpt-4o-mini, claude-3-5-haiku-20241022, gemini-2.0-flash). |
| endpoint | string | Custom API endpoint URL. Useful for Ollama (http://localhost:11434) or OpenAI-compatible proxies. |
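As an illustration, an llm object pointing at a local Ollama instance might look like the fragment below (the model name is an example; any model pulled into your Ollama install works):

```json
{
  "llm": {
    "provider": "ollama",
    "model": "llama3.1",
    "endpoint": "http://localhost:11434"
  }
}
```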
Response (Non-Streaming)
{
"success": true,
"report": "## Executive Summary\n\nBased on 14 sources across 3 research rounds, quantum computing has achieved significant milestones in 2024β2025...\n\n## Key Findings\n\n1. **Error correction breakthroughs** β Google's Willow chip demonstrated below-threshold error correction for the first time [1][3]\n2. **Commercial availability** β IBM, Google, and IonQ now offer cloud access to 100+ qubit processors [2]\n3. **Timeline revised** β Most experts now project fault-tolerant quantum computers by 2030β2035 [5][7]\n\n## Detailed Analysis\n\n### Hardware Progress\nThe past 18 months saw rapid progress in qubit fidelity...\n\n## Conclusion\n\nQuantum computing is transitioning from research to early commercial use, but broad practical advantage remains 5β10 years away.\n\n**Confidence: HIGH**",
"citations": [
{
"index": 1,
"title": "Google Willow: Our New Quantum Chip",
"url": "https://blog.google/technology/research/google-willow-quantum-chip/",
"snippet": "Willow can perform a computation in under five minutes that would take today's fastest supercomputers 10 septillion years.",
"relevanceScore": 0.94
},
{
"index": 2,
"title": "IBM Quantum Network",
"url": "https://www.ibm.com/quantum/network",
"snippet": "Access IBM quantum systems with 127β1000+ qubits through the IBM Quantum Network.",
"relevanceScore": 0.87
}
],
"sourcesUsed": 14,
"roundsCompleted": 3,
"totalSearchQueries": 9,
"llmProvider": "openai",
"tokensUsed": { "input": 18420, "output": 2341 },
"elapsed": 34201
}
Response Fields
| Field | Type | Description |
|---|---|---|
| report | string | Full Markdown research report with inline citations. |
| citations | array | All cited sources with index, title, URL, snippet, and relevance score. |
| sourcesUsed | number | Total sources fetched and analyzed. |
| roundsCompleted | number | Research rounds completed (≤ maxRounds). |
| totalSearchQueries | number | Total search queries run across all rounds. |
| llmProvider | string | The LLM provider that was used. |
| tokensUsed | object | LLM tokens consumed, as { input: number, output: number }. |
| elapsed | number | Total time in milliseconds. |
SSE Streaming
Set "stream": true to receive real-time progress events via Server-Sent Events. The connection stays open while research runs; events stream as they occur.
Each event is a JSON object on a data: SSE line:
**progress** events: progress updates during research. Fields: type, message, round, data (optional extra info).
data: {"eventType":"progress","type":"decomposing","message":"Decomposing question into sub-queriesβ¦","round":0}
data: {"eventType":"progress","type":"searching","message":"Searching: quantum error correction 2024","round":0}
data: {"eventType":"progress","type":"fetching","message":"Fetching 8 sourcesβ¦","round":0}
data: {"eventType":"progress","type":"synthesizing","message":"Synthesizing report from 14 sourcesβ¦"}
**chunk** events: incremental report text as the LLM generates it. Concatenate all chunks to build the full report.
data: {"type":"chunk","text":"## Executive Summary\n\n"}
data: {"type":"chunk","text":"Based on 14 sources across 3 research rounds"}
**done** event: final event with metadata. The full report is assembled from all chunk events.
data: {"type":"done","citations":[...],"sourcesUsed":14,"roundsCompleted":3,"totalSearchQueries":9,"llmProvider":"openai","tokensUsed":{"input":18420,"output":2341},"elapsed":34201}
**error** event: emitted if research fails. Check message for details.
data: {"type":"error","message":"LLM API key rejected by provider"}
BYOK: Bring Your Own LLM Key
Deep research uses LLMs for query decomposition, gap detection, and report synthesis. You provide your own key; WebPeel never stores it.
OpenAI
"provider": "openai"
Default model: gpt-4o-mini
Best for: speed + quality balance
Anthropic
"provider": "anthropic"
Default model: claude-3-5-haiku-20241022
Best for: long-context synthesis
Google
"provider": "google"
Default model: gemini-2.0-flash
Best for: cost efficiency
Ollama (local)
"provider": "ollama"
No API key needed
Best for: privacy / self-hosted
Cerebras
"provider": "cerebras"
Default model: llama3.1-70b
Best for: ultra-fast inference
Code Examples
curl -X POST https://api.webpeel.dev/v1/deep-research \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"question": "What are the current state and limitations of quantum computing?",
"llm": {
"provider": "openai",
"apiKey": "sk-...",
"model": "gpt-4o-mini"
},
"maxRounds": 3,
"maxSources": 20
}'
# Stream progress and report text in real-time
curl -X POST https://api.webpeel.dev/v1/deep-research \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-H "Accept: text/event-stream" \
-d '{
"question": "What are the current state and limitations of quantum computing?",
"llm": { "provider": "openai", "apiKey": "sk-..." },
"stream": true
}'
# Example SSE output:
# data: {"eventType":"progress","type":"decomposing","message":"Decomposing question...","round":0}
# data: {"eventType":"progress","type":"searching","message":"Searching: quantum computing 2025","round":0}
# data: {"type":"chunk","text":"## Executive Summary\n\n"}
# data: {"type":"done","sourcesUsed":12,"roundsCompleted":3,...}
const response = await fetch('https://api.webpeel.dev/v1/deep-research', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.WEBPEEL_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
question: 'What are the current state and limitations of quantum computing?',
llm: {
provider: 'openai',
apiKey: process.env.OPENAI_API_KEY,
model: 'gpt-4o-mini',
},
maxRounds: 3,
maxSources: 20,
}),
});
const data = await response.json();
console.log(data.report); // Full markdown report
console.log(data.citations.length); // Number of cited sources
console.log(`${data.sourcesUsed} sources, ${data.roundsCompleted} rounds, ${data.elapsed}ms`);
import { createParser } from 'eventsource-parser';
const response = await fetch('https://api.webpeel.dev/v1/deep-research', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.WEBPEEL_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
question: 'What is the current state of quantum computing?',
llm: { provider: 'openai', apiKey: process.env.OPENAI_API_KEY },
stream: true,
}),
});
let report = '';
const parser = createParser((event) => {
if (event.type !== 'event') return;
const data = JSON.parse(event.data);
if (data.eventType === 'progress') {
process.stdout.write(`[${data.type}] ${data.message}\n`);
} else if (data.type === 'chunk') {
process.stdout.write(data.text);
report += data.text;
} else if (data.type === 'done') {
console.log(`\n\nDone: ${data.sourcesUsed} sources, ${data.elapsed}ms`);
console.log('Citations:', data.citations.length);
} else if (data.type === 'error') {
console.error('Error:', data.message);
}
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
parser.feed(decoder.decode(value));
}
import requests, os
response = requests.post(
'https://api.webpeel.dev/v1/deep-research',
headers={
'Authorization': f'Bearer {os.environ["WEBPEEL_API_KEY"]}',
'Content-Type': 'application/json',
},
json={
'question': 'What are the current state and limitations of quantum computing?',
'llm': {
'provider': 'openai',
'apiKey': os.environ['OPENAI_API_KEY'],
'model': 'gpt-4o-mini',
},
'maxRounds': 3,
'maxSources': 20,
},
    timeout=120,  # deep research can take 30–120 seconds
)
data = response.json()
print(data['report'])
print(f"\n{data['sourcesUsed']} sources, {data['roundsCompleted']} rounds, {data['elapsed']}ms")
print(f"Citations: {len(data['citations'])}")
for cite in data['citations']:
print(f" [{cite['index']}] {cite['title']} β {cite['url']}")
import requests, json, os
with requests.post(
'https://api.webpeel.dev/v1/deep-research',
headers={
'Authorization': f'Bearer {os.environ["WEBPEEL_API_KEY"]}',
'Content-Type': 'application/json',
},
json={
'question': 'What is the current state of quantum computing?',
'llm': {'provider': 'openai', 'apiKey': os.environ['OPENAI_API_KEY']},
'stream': True,
},
stream=True,
timeout=120,
) as resp:
report_parts = []
for line in resp.iter_lines():
if not line or not line.startswith(b'data: '):
continue
event = json.loads(line[6:])
if event.get('eventType') == 'progress':
print(f"[{event['type']}] {event['message']}")
elif event.get('type') == 'chunk':
print(event['text'], end='', flush=True)
report_parts.append(event['text'])
elif event.get('type') == 'done':
print(f"\n\nComplete: {event['sourcesUsed']} sources, {event['elapsed']}ms")
elif event.get('type') == 'error':
print(f"Error: {event['message']}")
report = ''.join(report_parts)
Rate Limits & Scopes
| Plan | Requests/hour | Max rounds | Max sources |
|---|---|---|---|
| Free | 5 | 2 | 10 |
| Starter | 20 | 3 | 20 |
| Pro | 60 | 5 | 30 |
| Enterprise | Unlimited | 5 | 30 |
Deep research requires a full or read scope API key. Create one at app.webpeel.dev/keys.
Error Codes
| Error Code | Status | Cause |
|---|---|---|
| authentication_required | 401 | Missing or invalid API key. |
| invalid_request | 400 | Missing question, question too long (>5000 chars), or invalid llm.provider. |
| llm_required | 400 | No llm config provided and no server default configured. |
| llm_auth_failed | 401 | The LLM provider rejected the API key. |
| rate_limit_exceeded | 429 | WebPeel rate limit hit. Retry after the indicated interval. |
| free_tier_limit | 429 | Free tier LLM quota exhausted. Provide your own API key to continue. |
| deep_research_failed | 500 | Internal error during research. Check hint for details. |
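When you hit rate_limit_exceeded, back off and retry rather than hammering the endpoint. A minimal sketch with requests follows; it assumes the 429 response carries a Retry-After header and falls back to exponential backoff when it is absent:

```python
import time

import requests


def post_with_retry(url, payload, api_key, max_attempts=3):
    """POST to a WebPeel endpoint, retrying on 429 responses."""
    for attempt in range(max_attempts):
        resp = requests.post(
            url,
            headers={'Authorization': f'Bearer {api_key}'},
            json=payload,
            timeout=180,
        )
        if resp.status_code != 429:
            return resp
        # Honor Retry-After if present; otherwise back off 1s, 2s, 4s, ...
        delay = int(resp.headers.get('Retry-After', 2 ** attempt))
        time.sleep(delay)
    return resp  # still 429 after max_attempts; let the caller decide
```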
Deep Research vs. /v1/research vs. /v1/ask
| | /v1/deep-research | /v1/research | /v1/ask |
|---|---|---|---|
| Output | Full cited research report (Markdown) | Concise synthesis paragraph | Single best-answer sentence |
| LLM required | Yes (BYOK) | No (self-hosted Ollama by default, BYOK optional) | No (BM25 scoring) |
| Search rounds | 1β5 (iterative, gap-filling) | 1 | 1 |
| Sources | Up to 30 | Up to 5 | Up to 5 |
| Time | 30β120 seconds | 5β15 seconds | 1β5 seconds |
| Best for | Research reports, deep dives | Quick research with AI summary | Quick facts, RAG pipelines |
See Also
- Research API: lightweight research with self-hosted LLM (no BYOK needed)
- Ask API: fast LLM-free web Q&A with BM25 source scoring
- Search API: raw search results from DuckDuckGo/Brave/Google
- Extract API: structured JSON extraction from any page
- Error Reference: all error codes and troubleshooting