Episode 4 — Generative AI Engineering / 4.16 — Agent Design Patterns
4.16.b — Researcher-Writer Pattern
In one sentence: The Researcher-Writer pattern divides information work into two agents — a Researcher that gathers raw facts from search engines, RAG pipelines, and APIs, and a Writer that synthesizes those facts into polished, structured output — producing grounded content that neither agent could create alone.
Navigation: ← 4.16.a — Planner-Executor · 4.16.c — Critic-Refiner →
1. What Is the Researcher-Writer Pattern?
The Researcher-Writer pattern separates information gathering from information synthesis:
- Researcher Agent — receives a topic or query, uses tools (web search, database queries, RAG retrieval, API calls) to gather relevant information, and returns structured raw facts.
- Writer Agent — receives the raw facts from the Researcher and synthesizes them into polished output (a report, article, email, summary, or any other formatted deliverable).
This separation is important because:
- Gathering and writing are fundamentally different skills. A prompt optimized for thorough research ("find every relevant fact") is very different from a prompt optimized for clear writing ("synthesize into a coherent narrative").
- Grounding. The Writer is instructed to work only with facts the Researcher found, which sharply reduces the risk of hallucinating facts from the LLM's training data.
- Auditability. You can inspect the Researcher's raw output to verify what facts the Writer was given.
┌────────────────────────────────────────────────────────────────┐
│                   RESEARCHER-WRITER PATTERN                    │
│                                                                │
│  User: "Write a market analysis for electric vehicles"         │
│                        │                                       │
│                        ▼                                       │
│  ┌────────────────────────────────────────────────────────┐    │
│  │                    RESEARCHER AGENT                    │    │
│  │                                                        │    │
│  │  System: "You are a research agent. Gather facts.      │    │
│  │  Use tools to search, retrieve, and verify."           │    │
│  │                                                        │    │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────────────┐      │    │
│  │  │   Web    │  │   RAG    │  │    API Calls     │      │    │
│  │  │  Search  │  │ Retrieval│  │ (databases, etc) │      │    │
│  │  └────┬─────┘  └────┬─────┘  └────────┬─────────┘      │    │
│  │       │             │                 │                │    │
│  │       └─────────────┼─────────────────┘                │    │
│  │                     ▼                                  │    │
│  │             ┌─────────────────┐                        │    │
│  │             │    Raw Facts    │                        │    │
│  │             │   (structured   │                        │    │
│  │             │      JSON)      │                        │    │
│  │             └───────┬─────────┘                        │    │
│  └─────────────────────┼──────────────────────────────────┘    │
│                        │                                       │
│                        ▼                                       │
│  ┌────────────────────────────────────────────────────────┐    │
│  │                      WRITER AGENT                      │    │
│  │                                                        │    │
│  │  System: "You are a professional writer.               │    │
│  │  Synthesize ONLY the facts provided below.             │    │
│  │  Do NOT add information from your training data."      │    │
│  │                                                        │    │
│  │  Input:  Raw facts from Researcher                     │    │
│  │  Output: Polished report / article / summary           │    │
│  └────────────────────────────────────────────────────────┘    │
│                        │                                       │
│                        ▼                                       │
│  Final: Polished market analysis grounded in real data         │
└────────────────────────────────────────────────────────────────┘
2. The Researcher Agent: Gathering Information
The Researcher Agent's job is to be thorough, factual, and structured. It should:
- Search multiple sources — don't rely on a single search query
- Extract key facts — not just raw text, but structured data points
- Note sources — track where each fact came from for attribution
- Assess relevance — filter out irrelevant results
- Return structured output — JSON that the Writer can easily consume
Researcher system prompt
const RESEARCHER_SYSTEM_PROMPT = `You are a Research Agent. Your job is to gather comprehensive, factual information about a topic.
RULES:
1. Use the provided tools to search for information. Do NOT rely on your training data for facts.
2. Search multiple angles — different keywords, different sources, different perspectives.
3. Extract specific facts: numbers, dates, names, statistics, quotes.
4. Record the source for every fact.
5. Assess the relevance and reliability of each source (high/medium/low).
6. Return your findings as structured JSON.
OUTPUT FORMAT:
{
"topic": "...",
"research_queries": ["query1", "query2", ...],
"facts": [
{
"fact": "The specific fact or data point",
"source": "Where this fact came from",
"source_type": "web_search | rag | api | database",
"relevance": "high | medium | low",
"category": "market_size | trend | competitor | regulation | ..."
}
],
"key_statistics": [
{ "metric": "...", "value": "...", "source": "...", "date": "..." }
],
"gaps": ["Information I could not find"]
}
IMPORTANT: Only include facts you actually found via tools. If you cannot find information, say so in the "gaps" array.`;
Researcher with tools
import OpenAI from 'openai';
const openai = new OpenAI();
// Tool implementations
const researchTools = {
web_search: async (query) => {
console.log(` [Search] "${query}"`);
// In production: call a search API (SerpAPI, Tavily, Brave Search, etc.)
// Simulated results:
return [
{
title: 'Global EV Market Report 2025',
snippet: 'Global EV sales reached 17.1 million units in 2024, up 25% YoY...',
url: 'https://example.com/ev-market-2025',
},
{
title: 'Top EV Manufacturers by Market Share',
snippet: 'BYD leads with 19.4% global market share, Tesla follows at 15.2%...',
url: 'https://example.com/ev-manufacturers',
},
];
},
rag_retrieve: async (query) => {
console.log(` [RAG] Retrieving: "${query}"`);
// In production: query your vector database
return [
{
content: 'Battery costs have fallen to $139/kWh in 2024, down from $732/kWh in 2013...',
source: 'internal_battery_report.pdf',
relevance: 0.94,
},
];
},
fetch_api_data: async (endpoint) => {
console.log(` [API] Fetching: ${endpoint}`);
// In production: call a specific data API
return {
data: { ev_market_size_2024: '$500B', projected_2030: '$1.3T', cagr: '17.8%' },
source: 'Bloomberg New Energy Finance',
};
},
};
// OpenAI tool definitions for the Researcher
const researcherToolDefs = [
{
type: 'function',
function: {
name: 'web_search',
description: 'Search the web for information on a topic',
parameters: {
type: 'object',
properties: {
query: { type: 'string', description: 'The search query' },
},
required: ['query'],
},
},
},
{
type: 'function',
function: {
name: 'rag_retrieve',
description: 'Retrieve relevant documents from the internal knowledge base',
parameters: {
type: 'object',
properties: {
query: { type: 'string', description: 'The retrieval query' },
},
required: ['query'],
},
},
},
{
type: 'function',
function: {
name: 'fetch_api_data',
description: 'Fetch structured data from an external API',
parameters: {
type: 'object',
properties: {
endpoint: { type: 'string', description: 'The API endpoint to fetch data from' },
},
required: ['endpoint'],
},
},
},
];
Running the Researcher (agent loop)
async function runResearcher(topic) {
console.log('\n=== RESEARCHER AGENT ===');
console.log(`Topic: "${topic}"\n`);
const messages = [
{ role: 'system', content: RESEARCHER_SYSTEM_PROMPT },
{ role: 'user', content: `Research this topic thoroughly: ${topic}` },
];
let researchComplete = false;
let iterations = 0;
const maxIterations = 10; // Safety limit
while (!researchComplete && iterations < maxIterations) {
iterations++;
const response = await openai.chat.completions.create({
model: 'gpt-4o',
temperature: 0,
tools: researcherToolDefs,
messages,
});
const message = response.choices[0].message;
messages.push(message);
// If the model wants to call tools, execute them
if (message.tool_calls && message.tool_calls.length > 0) {
for (const toolCall of message.tool_calls) {
const toolName = toolCall.function.name;
const toolArgs = JSON.parse(toolCall.function.arguments);
const toolFn = researchTools[toolName];
console.log(` Calling: ${toolName}(${JSON.stringify(toolArgs)})`);
const result = await toolFn(toolArgs.query || toolArgs.endpoint);
messages.push({
role: 'tool',
tool_call_id: toolCall.id,
content: JSON.stringify(result),
});
}
    } else {
      // Model returned final text — research is complete
      researchComplete = true;
      console.log('\n Research complete.');
      // The prompt asks for JSON but this call does not use JSON mode,
      // so strip any markdown fences the model may wrap around its output
      const raw = message.content.replace(/^```(?:json)?\s*/, '').replace(/\s*```$/, '').trim();
      return JSON.parse(raw);
    }
}
throw new Error('Research agent exceeded max iterations');
}
3. The Writer Agent: Synthesizing Information
The Writer Agent's job is to take raw facts and produce polished output. Critical rules:
- Use ONLY the provided facts — never hallucinate additional information
- Follow the requested format — report, article, email, bullet points, etc.
- Cite sources — attribute facts to where they came from
- Be clear and concise — the Writer is a writing specialist
Writer system prompt
const WRITER_SYSTEM_PROMPT = `You are a Professional Writer Agent. Your job is to synthesize research findings into polished, well-structured content.
RULES:
1. Use ONLY the facts provided in the research data below. Do NOT add facts from your training data.
2. If the research has gaps, acknowledge them honestly — do NOT fill gaps with made-up information.
3. Cite sources inline using [Source Name] format.
4. Follow the requested output format exactly.
5. Use clear, professional language appropriate for the target audience.
6. Structure content with headings, bullet points, and paragraphs for readability.
7. Include a "Sources" section at the end listing all referenced sources.
OUTPUT QUALITY:
- Executive summary at the top (2-3 sentences)
- Logical flow from high-level overview to specific details
- Key statistics highlighted
- Actionable insights or implications where appropriate`;
Writer implementation
async function runWriter(researchData, outputConfig) {
console.log('\n=== WRITER AGENT ===');
console.log(`Format: ${outputConfig.format}`);
console.log(`Audience: ${outputConfig.audience}\n`);
const writerPrompt = `Write a ${outputConfig.format} about "${researchData.topic}" for a ${outputConfig.audience} audience.
RESEARCH DATA:
${JSON.stringify(researchData, null, 2)}
REQUIREMENTS:
- Format: ${outputConfig.format}
- Length: ${outputConfig.length || 'appropriate for the content'}
- Tone: ${outputConfig.tone || 'professional'}
- Include: executive summary, key findings, statistics, implications
- End with: sources list`;
const response = await openai.chat.completions.create({
model: 'gpt-4o',
temperature: 0.4, // Slightly creative for better prose
messages: [
{ role: 'system', content: WRITER_SYSTEM_PROMPT },
{ role: 'user', content: writerPrompt },
],
});
const content = response.choices[0].message.content;
console.log(' Writing complete.');
console.log(` Length: ${content.length} characters`);
return content;
}
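The `[Source Name]` citation rule also makes grounding mechanically checkable after the fact. Below is a minimal sketch of such a check; the helper name, the regex, and the assumption that every bracketed span is a citation are illustrative additions, not part of the pattern itself:

```javascript
// Compare [Source] citations in the report against sources actually
// present in the research data; anything unmatched is a candidate
// hallucination to flag for review.
function checkCitations(report, researchData) {
  const cited = [...report.matchAll(/\[([^\]]+)\]/g)].map((m) => m[1]);
  const known = new Set(
    (researchData.facts || []).map((f) => f.source).filter(Boolean)
  );
  const unknown = [...new Set(cited.filter((c) => !known.has(c)))];
  return {
    citedCount: cited.length,
    unknownCitations: unknown,
    grounded: unknown.length === 0,
  };
}
```

A non-empty `unknownCitations` list is a signal to reject the draft or send it back to the Writer with the offending citations named.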
4. Full Implementation: Research Report Generator
Here is the complete end-to-end pipeline.
import OpenAI from 'openai';
const openai = new OpenAI();
// ─────────────────────────────────────────────────────────
// RESEARCHER AGENT (simplified — uses JSON mode instead of a tool-calling loop)
// ─────────────────────────────────────────────────────────
async function researchTopic(topic) {
console.log('\n=== RESEARCHER AGENT ===');
console.log(`Researching: "${topic}"\n`);
// In production, this would be a full agent loop with tool calling.
// Here we simulate the Researcher producing structured output.
const response = await openai.chat.completions.create({
model: 'gpt-4o',
temperature: 0,
response_format: { type: 'json_object' },
messages: [
{
role: 'system',
content: `You are a research agent. Given a topic, produce a structured research brief.
Return JSON with this shape:
{
"topic": "...",
"executive_summary": "2-3 sentence overview",
"facts": [
{ "fact": "...", "source": "...", "category": "..." }
],
"key_statistics": [
{ "metric": "...", "value": "...", "source": "...", "year": "..." }
],
"trends": [
{ "trend": "...", "direction": "up|down|stable", "evidence": "..." }
],
"competitive_landscape": [
{ "company": "...", "position": "...", "detail": "..." }
],
"gaps": ["things you could not find or verify"]
}`,
},
{
role: 'user',
content: `Research this topic thoroughly: ${topic}`,
},
],
});
const research = JSON.parse(response.choices[0].message.content);
console.log(` Found ${research.facts?.length || 0} facts`);
console.log(` Found ${research.key_statistics?.length || 0} statistics`);
console.log(` Found ${research.trends?.length || 0} trends`);
console.log(` Gaps: ${research.gaps?.length || 0}`);
return research;
}
// ─────────────────────────────────────────────────────────
// WRITER AGENT
// ─────────────────────────────────────────────────────────
async function writeReport(researchData, config) {
console.log('\n=== WRITER AGENT ===');
console.log(`Format: ${config.format}, Audience: ${config.audience}\n`);
const response = await openai.chat.completions.create({
model: 'gpt-4o',
temperature: 0.4,
messages: [
{
role: 'system',
content: `You are a professional report writer. Synthesize the provided research data into a polished ${config.format}.
CRITICAL RULES:
- Use ONLY the facts provided in the research data. Do NOT invent additional facts.
- Cite sources using [Source] format.
- If the research notes gaps, acknowledge them.
- Structure the output with clear headings and logical flow.
- Target audience: ${config.audience}
- Tone: ${config.tone || 'professional'}`,
},
{
role: 'user',
content: `Write a ${config.format} based on this research:
${JSON.stringify(researchData, null, 2)}
Include:
1. Executive Summary
2. Market Overview
3. Key Statistics
4. Trends Analysis
5. Competitive Landscape
6. Gaps and Limitations
7. Sources`,
},
],
});
const report = response.choices[0].message.content;
console.log(` Report written: ${report.length} characters`);
return report;
}
// ─────────────────────────────────────────────────────────
// ORCHESTRATOR
// ─────────────────────────────────────────────────────────
async function generateResearchReport(topic, config = {}) {
const defaultConfig = {
format: 'market analysis report',
audience: 'senior management',
tone: 'professional and data-driven',
...config,
};
console.log('========================================');
console.log(' RESEARCHER-WRITER PIPELINE');
console.log('========================================');
console.log(`Topic: ${topic}`);
// Phase 1: Research
const researchData = await researchTopic(topic);
// Phase 2: Write
const report = await writeReport(researchData, defaultConfig);
// Phase 3: Return both for auditability
return {
research: researchData,
report,
metadata: {
topic,
config: defaultConfig,
researchFactCount: researchData.facts?.length || 0,
reportLength: report.length,
generatedAt: new Date().toISOString(),
},
};
}
// Run it
const result = await generateResearchReport(
'The state of electric vehicle adoption in 2025',
{ format: 'executive briefing', audience: 'board of directors' }
);
console.log('\n=== FINAL REPORT ===\n');
console.log(result.report);
console.log('\n=== METADATA ===');
console.log(JSON.stringify(result.metadata, null, 2));
5. Use Cases
Report generation
Topic: "Q3 financial performance"
Researcher: Pulls data from financial APIs, internal databases, market feeds
Writer: Produces executive summary, charts narrative, recommendations
Output: Board-ready financial report
Content creation
Topic: "Blog post about microservices vs monoliths"
Researcher: Searches for case studies, benchmarks, expert opinions, common pitfalls
Writer: Crafts engaging blog post with examples and balanced perspective
Output: Ready-to-publish blog post
Summarization
Input: 50 pages of legal documents
Researcher: Extracts key clauses, obligations, deadlines, risks
Writer: Produces 2-page summary organized by importance
Output: Legal brief for non-legal stakeholders
Customer research
Query: "What are customers saying about our latest release?"
Researcher: Searches reviews, social media, support tickets, NPS data
Writer: Synthesizes sentiment analysis with specific quotes and trends
Output: Customer feedback report
6. Separating System Prompts: Why It Matters
A key design decision is giving each agent a separate, focused system prompt. This is more effective than a single "do everything" prompt.
Why separate prompts work better
| Single-prompt approach | Two-prompt approach |
|---|---|
| "Search for info AND write a report" | Researcher: "Search thoroughly" / Writer: "Write clearly" |
| Model tries to search and write at the same time | Each agent focuses on its specialty |
| Research quality suffers (model rushes to write) | Research is thorough before writing begins |
| Hard to audit what facts were found vs generated | Raw research output is inspectable |
| One temperature setting for both tasks | Low temp for research, higher for writing |
Temperature strategy
// Researcher: temperature 0 — factual, precise, no creativity
const researcherConfig = { temperature: 0, model: 'gpt-4o' };
// Writer: temperature 0.3–0.5 — slightly creative for better prose
const writerConfig = { temperature: 0.4, model: 'gpt-4o' };
Model selection strategy
You can even use different models for each agent:
// Researcher: Use a model with large context window for processing many search results
const researcherModel = 'gpt-4o'; // Good at following complex instructions
// Writer: Use a model known for high-quality prose
const writerModel = 'gpt-4o'; // Or Claude for longer, more nuanced writing
7. Handling Multi-source Research
In production, the Researcher often needs to query multiple sources and merge results. Here is a pattern for multi-source research:
async function multiSourceResearch(topic) {
// Run all searches in parallel
const [webResults, ragResults, apiResults] = await Promise.all([
searchWeb(topic),
searchRAG(topic),
fetchAPIData(topic),
]);
// Deduplicate facts across sources
const allFacts = [
...webResults.map((r) => ({ ...r, source_type: 'web' })),
...ragResults.map((r) => ({ ...r, source_type: 'rag' })),
...apiResults.map((r) => ({ ...r, source_type: 'api' })),
];
// Rank by relevance
const rankedFacts = allFacts
.sort((a, b) => (b.relevance || 0) - (a.relevance || 0))
.slice(0, 30); // Top 30 most relevant facts
// Let the Researcher LLM organize and deduplicate
const response = await openai.chat.completions.create({
model: 'gpt-4o',
temperature: 0,
response_format: { type: 'json_object' },
messages: [
{
role: 'system',
content: `You are a research organizer. Given raw search results from multiple sources,
organize them into a structured research brief. Remove duplicates. Note conflicts between sources.
Return JSON with: topic, facts[], key_statistics[], trends[], gaps[]`,
},
{
role: 'user',
content: `Organize these raw results about "${topic}":\n${JSON.stringify(rankedFacts, null, 2)}`,
},
],
});
return JSON.parse(response.choices[0].message.content);
}
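The LLM pass above handles fuzzy duplicates, but an exact-match pre-pass is essentially free and shrinks the prompt before the model ever sees it. A sketch (the normalization rule of lowercasing and collapsing whitespace is an assumption; adapt it to your fact schema):

```javascript
// Remove facts whose normalized text is identical, keeping the first
// (highest-ranked) occurrence of each.
function dedupeExact(facts) {
  const seen = new Set();
  return facts.filter((f) => {
    // Facts from different sources carry their text under different keys
    const key = (f.fact || f.snippet || f.content || '')
      .toLowerCase()
      .replace(/\s+/g, ' ')
      .trim();
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```

Calling `dedupeExact(allFacts)` before the relevance sort keeps the top slots from being wasted on repeats of the same fact.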
8. Validating Research Before Writing
Before passing research to the Writer, validate that the research is sufficient:
function validateResearch(research) {
const issues = [];
if (!research.facts || research.facts.length < 5) {
issues.push('Insufficient facts (minimum 5 required)');
}
if (!research.key_statistics || research.key_statistics.length < 2) {
issues.push('Insufficient statistics (minimum 2 required)');
}
  if (research.gaps && research.gaps.length > (research.facts?.length || 0)) {
    issues.push('More gaps than facts — research may be too incomplete');
  }
  // Check for source diversity (guard against a missing facts array)
  const sourceTypes = new Set((research.facts || []).map((f) => f.source_type || f.source));
  if (sourceTypes.size < 2) {
    issues.push('All facts from a single source — low diversity');
  }
return {
valid: issues.length === 0,
issues,
factCount: research.facts?.length || 0,
statisticCount: research.key_statistics?.length || 0,
gapCount: research.gaps?.length || 0,
sourceDiversity: sourceTypes.size,
};
}
// Usage in pipeline
const research = await researchTopic(topic);
const validation = validateResearch(research);
if (!validation.valid) {
console.log('Research insufficient:', validation.issues);
// Option 1: Re-run researcher with more specific queries
// Option 2: Proceed but warn the Writer about gaps
// Option 3: Abort and ask user for more context
}
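Option 1 can be sketched as a bounded retry loop. The function below is one possible wiring, not part of the pattern proper: it takes the researcher and validator as parameters (so it composes with `researchTopic` and `validateResearch` above, or with stubs in tests), and folding the previous attempt's `gaps` into the next query is an assumption about how to steer the retry:

```javascript
// Re-run research up to maxAttempts times, steering each retry toward
// the gaps the previous attempt reported.
async function researchUntilValid(topic, researchFn, validateFn, maxAttempts = 3) {
  let query = topic;
  let last = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    last = await researchFn(query);
    if (validateFn(last).valid) return { research: last, attempts: attempt };
    // Target the reported gaps on the next pass
    const gaps = (last.gaps || []).join('; ');
    query = gaps ? `${topic} (focus on: ${gaps})` : topic;
  }
  // Give up but return what we have, flagged as insufficient
  return { research: last, attempts: maxAttempts, insufficient: true };
}
```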
9. Design Considerations
When the Researcher finds nothing
// In the Writer prompt, handle empty research gracefully
const writerPrompt = research.facts.length === 0
? `The research phase found no relevant information about "${topic}".
Write a brief note explaining that the requested analysis could not be completed
due to insufficient data, and suggest alternative approaches.`
: `Write a report based on this research: ${JSON.stringify(research)}`;
Token budget management
Researcher output can be LARGE (many facts, full text snippets).
Before passing it to the Writer, trim the payload to fit the Writer's context window.
Strategy:
1. Keep top 20 most relevant facts (by relevance score)
2. Keep all key_statistics (usually small)
3. Truncate fact text to 200 characters each
4. Total research payload: aim for < 4000 tokens
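The four steps can be sketched as a pure trimming function. Treat the four-characters-per-token estimate as a rough rule of thumb, not an exact count; a real tokenizer gives precise numbers. The defaults mirror the strategy above and are configurable assumptions:

```javascript
// Map the string relevance labels from the Researcher's output format
// to numbers; pass numeric scores (e.g. from RAG) through unchanged.
const REL_SCORE = { high: 3, medium: 2, low: 1 };
const relevanceOf = (f) =>
  typeof f.relevance === 'number' ? f.relevance : REL_SCORE[f.relevance] || 0;

// Trim a research payload so it fits the Writer's context budget.
function trimResearchForWriter(research, { maxFacts = 20, factChars = 200, maxTokens = 4000 } = {}) {
  const trimmed = {
    ...research,
    // Step 1: keep the top facts by relevance
    facts: [...(research.facts || [])]
      .sort((a, b) => relevanceOf(b) - relevanceOf(a))
      .slice(0, maxFacts)
      // Step 3: truncate each fact's text
      .map((f) => ({ ...f, fact: (f.fact || '').slice(0, factChars) })),
    // Step 2: key_statistics are small, keep them all
    key_statistics: research.key_statistics || [],
  };
  // Step 4: rough estimate of ~4 characters per token; halve facts if over budget
  if (Math.ceil(JSON.stringify(trimmed).length / 4) > maxTokens) {
    trimmed.facts = trimmed.facts.slice(0, Math.floor(maxFacts / 2));
  }
  return trimmed;
}
```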
Combining with other patterns
The Researcher-Writer pattern often combines with:
- Planner-Executor — the Planner includes "research" and "write" as two steps
- Critic-Refiner — the Writer's output goes through a Critic-Refiner loop for quality improvement
- Router — a Router agent decides whether to use Researcher-Writer, direct answer, or another pattern based on the query
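As a sketch of the Critic-Refiner combination (covered in 4.16.c), a generic critique loop can wrap the Writer. Here `writeFn` and `critiqueFn` are placeholder parameters you would back with LLM calls; the `{ approved, feedback }` critique shape is an assumption for illustration:

```javascript
// Wrap any writer function in a bounded critique-and-revise loop.
// writeFn(research, feedback) drafts or revises a document;
// critiqueFn(draft, research) returns { approved, feedback }.
async function writeWithCritique(research, writeFn, critiqueFn, maxRounds = 2) {
  let draft = await writeFn(research, null);
  for (let round = 0; round < maxRounds; round++) {
    const critique = await critiqueFn(draft, research);
    if (critique.approved) break;
    // Revise: feed the critique back to the writer alongside the research
    draft = await writeFn(research, critique.feedback);
  }
  return draft;
}
```

Because the loop is bounded, a stubborn Critic cannot burn tokens indefinitely; the last draft is returned even if it was never approved.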
10. Key Takeaways
- Separate gathering from synthesizing — the Researcher focuses on thoroughness and accuracy; the Writer focuses on clarity and structure. Each has its own optimized system prompt and temperature.
- Grounding prevents hallucination — the Writer works ONLY with Researcher-provided facts. The critical instruction is "Do NOT add information from your training data."
- Structured research output enables auditability — raw facts as JSON let you inspect exactly what the Writer was given, debug quality issues, and verify source attribution.
- Multi-source research improves quality — running parallel searches across web, RAG, and APIs, then deduplicating and ranking, produces more comprehensive research.
- Validate research before writing — check minimum fact count, source diversity, and gap ratio before invoking the Writer to avoid generating a report from insufficient data.
- Temperature differs by agent — 0 for the Researcher (factual precision), 0.3–0.5 for the Writer (readable prose).
Explain-It Challenge
- A product manager says "just use one prompt — tell GPT to research and write the report." Explain why splitting into Researcher and Writer produces better results.
- Your Researcher returns 50 facts but they are all from one website. Why is this a problem, and how do you fix it?
- The Writer's report includes a statistic that was NOT in the Researcher's output. What went wrong, and how do you prevent this?
Navigation: ← 4.16.a — Planner-Executor · 4.16.c — Critic-Refiner →