Episode 4 — Generative AI Engineering / 4.16 — Agent Design Patterns

4.16.b — Researcher-Writer Pattern

In one sentence: The Researcher-Writer pattern divides information work into two agents — a Researcher that gathers raw facts from search engines, RAG pipelines, and APIs, and a Writer that synthesizes those facts into polished, structured output — producing grounded content that neither agent could create alone.

Navigation: ← 4.16.a Planner-Executor · 4.16.c — Critic-Refiner →


1. What Is the Researcher-Writer Pattern?

The Researcher-Writer pattern separates information gathering from information synthesis:

  1. Researcher Agent — receives a topic or query, uses tools (web search, database queries, RAG retrieval, API calls) to gather relevant information, and returns structured raw facts.
  2. Writer Agent — receives the raw facts from the Researcher and synthesizes them into polished output (a report, article, email, summary, or any other formatted deliverable).

This separation is important because:

  • Gathering and writing are fundamentally different skills. A prompt optimized for thorough research ("find every relevant fact") is very different from a prompt optimized for clear writing ("synthesize into a coherent narrative").
  • Grounding. The Writer is instructed to work only with facts the Researcher found, which reduces (but cannot fully eliminate) facts hallucinated from the LLM's training data.
  • Auditability. You can inspect the Researcher's raw output to verify what facts the Writer was given.
┌────────────────────────────────────────────────────────────────┐
│               RESEARCHER-WRITER PATTERN                        │
│                                                                │
│  User: "Write a market analysis for electric vehicles"         │
│         │                                                      │
│         ▼                                                      │
│  ┌──────────────────────────────────────────────────────┐      │
│  │  RESEARCHER AGENT                                     │      │
│  │                                                       │      │
│  │  System: "You are a research agent. Gather facts.     │      │
│  │  Use tools to search, retrieve, and verify."          │      │
│  │                                                       │      │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────────────┐    │      │
│  │  │ Web      │  │ RAG      │  │ API Calls        │    │      │
│  │  │ Search   │  │ Retrieval│  │ (databases, etc) │    │      │
│  │  └────┬─────┘  └────┬─────┘  └────────┬─────────┘    │      │
│  │       │              │                 │              │      │
│  │       └──────────────┼─────────────────┘              │      │
│  │                      ▼                                │      │
│  │              ┌─────────────────┐                      │      │
│  │              │  Raw Facts      │                      │      │
│  │              │  (structured    │                      │      │
│  │              │   JSON)         │                      │      │
│  │              └────────┬────────┘                      │      │
│  └───────────────────────┼──────────────────────────────┘      │
│                          │                                     │
│                          ▼                                     │
│  ┌──────────────────────────────────────────────────────┐      │
│  │  WRITER AGENT                                         │      │
│  │                                                       │      │
│  │  System: "You are a professional writer.              │      │
│  │  Synthesize ONLY the facts provided below.            │      │
│  │  Do NOT add information from your training data."     │      │
│  │                                                       │      │
│  │  Input: Raw facts from Researcher                     │      │
│  │  Output: Polished report / article / summary          │      │
│  └──────────────────────────────────────────────────────┘      │
│                          │                                     │
│                          ▼                                     │
│  Final: Polished market analysis grounded in real data         │
└────────────────────────────────────────────────────────────────┘

2. The Researcher Agent: Gathering Information

The Researcher Agent's job is to be thorough, factual, and structured. It should:

  • Search multiple sources — don't rely on a single search query
  • Extract key facts — not just raw text, but structured data points
  • Note sources — track where each fact came from for attribution
  • Assess relevance — filter out irrelevant results
  • Return structured output — JSON that the Writer can easily consume

Researcher system prompt

const RESEARCHER_SYSTEM_PROMPT = `You are a Research Agent. Your job is to gather comprehensive, factual information about a topic.

RULES:
1. Use the provided tools to search for information. Do NOT rely on your training data for facts.
2. Search multiple angles — different keywords, different sources, different perspectives.
3. Extract specific facts: numbers, dates, names, statistics, quotes.
4. Record the source for every fact.
5. Assess the relevance and reliability of each source (high/medium/low).
6. Return your findings as structured JSON.

OUTPUT FORMAT:
{
  "topic": "...",
  "research_queries": ["query1", "query2", ...],
  "facts": [
    {
      "fact": "The specific fact or data point",
      "source": "Where this fact came from",
      "source_type": "web_search | rag | api | database",
      "relevance": "high | medium | low",
      "category": "market_size | trend | competitor | regulation | ..."
    }
  ],
  "key_statistics": [
    { "metric": "...", "value": "...", "source": "...", "date": "..." }
  ],
  "gaps": ["Information I could not find"]
}

IMPORTANT: Only include facts you actually found via tools. If you cannot find information, say so in the "gaps" array.`;
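
The OUTPUT FORMAT in the prompt above is only a request; the model may still omit fields or use values outside the listed options. A minimal sketch of a shape check follows. The helper name `checkResearchShape` and the choice of which fields to treat as required are assumptions for illustration, not part of the original pipeline.

```javascript
// Hypothetical helper: checks a parsed Researcher response against the
// OUTPUT FORMAT described in the prompt. Returns a list of problems.
const ALLOWED_SOURCE_TYPES = ['web_search', 'rag', 'api', 'database'];
const ALLOWED_RELEVANCE = ['high', 'medium', 'low'];

function checkResearchShape(research) {
  if (research === null || typeof research !== 'object' || Array.isArray(research)) {
    return ['research output is not a JSON object'];
  }

  const errors = [];
  if (typeof research.topic !== 'string' || research.topic.trim() === '') {
    errors.push('topic must be a non-empty string');
  }
  for (const key of ['facts', 'key_statistics', 'gaps']) {
    if (!Array.isArray(research[key])) errors.push(`${key} must be an array`);
  }

  const facts = Array.isArray(research.facts) ? research.facts : [];
  facts.forEach((f, i) => {
    if (f === null || typeof f !== 'object') {
      errors.push(`facts[${i}] must be an object`);
      return;
    }
    if (typeof f.fact !== 'string' || f.fact.trim() === '') {
      errors.push(`facts[${i}].fact is missing`);
    }
    if (typeof f.source !== 'string' || f.source.trim() === '') {
      errors.push(`facts[${i}].source is missing`);
    }
    if (!ALLOWED_SOURCE_TYPES.includes(f.source_type)) {
      errors.push(`facts[${i}].source_type is not an allowed value`);
    }
    if (!ALLOWED_RELEVANCE.includes(f.relevance)) {
      errors.push(`facts[${i}].relevance is not an allowed value`);
    }
  });
  return errors;
}
```

An empty array means the response matches the requested shape. Anything else can be logged, or sent back to the Researcher as a correction request.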

Researcher with tools

import OpenAI from 'openai';

const openai = new OpenAI();

// Tool implementations
const researchTools = {
  web_search: async (query) => {
    console.log(`  [Search] "${query}"`);
    // In production: call a search API (SerpAPI, Tavily, Brave Search, etc.)
    // Simulated results:
    return [
      {
        title: 'Global EV Market Report 2025',
        snippet: 'Global EV sales reached 17.1 million units in 2024, up 25% YoY...',
        url: 'https://example.com/ev-market-2025',
      },
      {
        title: 'Top EV Manufacturers by Market Share',
        snippet: 'BYD leads with 19.4% global market share, Tesla follows at 15.2%...',
        url: 'https://example.com/ev-manufacturers',
      },
    ];
  },

  rag_retrieve: async (query) => {
    console.log(`  [RAG] Retrieving: "${query}"`);
    // In production: query your vector database
    return [
      {
        content: 'Battery costs have fallen to $139/kWh in 2024, down from $732/kWh in 2013...',
        source: 'internal_battery_report.pdf',
        relevance: 0.94,
      },
    ];
  },

  fetch_api_data: async (endpoint) => {
    console.log(`  [API] Fetching: ${endpoint}`);
    // In production: call a specific data API
    return {
      data: { ev_market_size_2024: '$500B', projected_2030: '$1.3T', cagr: '17.8%' },
      source: 'Bloomberg New Energy Finance',
    };
  },
};

// OpenAI tool definitions for the Researcher
const researcherToolDefs = [
  {
    type: 'function',
    function: {
      name: 'web_search',
      description: 'Search the web for information on a topic',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string', description: 'The search query' },
        },
        required: ['query'],
      },
    },
  },
  {
    type: 'function',
    function: {
      name: 'rag_retrieve',
      description: 'Retrieve relevant documents from the internal knowledge base',
      parameters: {
        type: 'object',
        properties: {
          query: { type: 'string', description: 'The retrieval query' },
        },
        required: ['query'],
      },
    },
  },
  {
    type: 'function',
    function: {
      name: 'fetch_api_data',
      description: 'Fetch structured data from an external API',
      parameters: {
        type: 'object',
        properties: {
          endpoint: { type: 'string', description: 'The API endpoint to fetch data from' },
        },
        required: ['endpoint'],
      },
    },
  },
];
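
Each definition above declares exactly one required string argument, so tool calls can be routed through a small dispatcher that validates the call before running it. The sketch below is one possible shape, not the lesson's own implementation: `TOOL_ARG_KEYS` and `dispatchToolCall` are hypothetical names, and the error payload format is an assumption.

```javascript
// Hypothetical lookup: the single argument each tool expects, mirroring
// the `required` field of the tool definitions.
const TOOL_ARG_KEYS = new Map([
  ['web_search', 'query'],
  ['rag_retrieve', 'query'],
  ['fetch_api_data', 'endpoint'],
]);

// Runs one tool call and always resolves to a JSON string, so the caller
// can hand either the result or the error back to the model.
async function dispatchToolCall(tools, name, rawArguments) {
  const argKey = TOOL_ARG_KEYS.get(name);
  if (argKey === undefined || typeof tools[name] !== 'function') {
    return JSON.stringify({ error: `Unknown tool: ${name}` });
  }

  let args;
  try {
    args = JSON.parse(rawArguments); // the model sends arguments as a JSON string
  } catch {
    return JSON.stringify({ error: 'Tool arguments were not valid JSON' });
  }
  if (typeof args?.[argKey] !== 'string') {
    return JSON.stringify({ error: `Missing string argument: ${argKey}` });
  }

  try {
    const result = await tools[name](args[argKey]);
    return JSON.stringify(result ?? null);
  } catch (err) {
    return JSON.stringify({ error: `Tool failed: ${err.message}` });
  }
}
```

Returning errors as data, rather than throwing, lets the model see what went wrong and try a different call.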

Running the Researcher (agent loop)

async function runResearcher(topic) {
  console.log('\n=== RESEARCHER AGENT ===');
  console.log(`Topic: "${topic}"\n`);

  const messages = [
    { role: 'system', content: RESEARCHER_SYSTEM_PROMPT },
    { role: 'user', content: `Research this topic thoroughly: ${topic}` },
  ];

  let researchComplete = false;
  let iterations = 0;
  const maxIterations = 10; // Safety limit

  while (!researchComplete && iterations < maxIterations) {
    iterations++;
    const response = await openai.chat.completions.create({
      model: 'gpt-4o',
      temperature: 0,
      tools: researcherToolDefs,
      messages,
    });

    const message = response.choices[0].message;
    messages.push(message);

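    // If the model wants to call tools, execute them
    if (message.tool_calls && message.tool_calls.length > 0) {
      for (const toolCall of message.tool_calls) {
        const toolName = toolCall.function.name;
        let result;

        try {
          // Arguments arrive as a JSON string and may be malformed
          const toolArgs = JSON.parse(toolCall.function.arguments);
          if (!Object.hasOwn(researchTools, toolName)) {
            throw new Error(`Unknown tool: ${toolName}`);
          }
          console.log(`  Calling: ${toolName}(${JSON.stringify(toolArgs)})`);
          // Every tool here takes one string argument: `query` or `endpoint`
          result = await researchTools[toolName](toolArgs.query ?? toolArgs.endpoint);
        } catch (err) {
          // Report the failure to the model instead of crashing the loop
          result = { error: err.message };
        }

        // Every tool_call_id must be answered with a matching tool message
        messages.push({
          role: 'tool',
          tool_call_id: toolCall.id,
          content: JSON.stringify(result),
        });
      }
    } else {
      // Model returned final text: research is complete
      researchComplete = true;
      console.log('\n  Research complete.');
      try {
        return JSON.parse(message.content);
      } catch (err) {
        throw new Error(`Researcher returned non-JSON output: ${err.message}`);
      }
    }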
  }

  throw new Error('Research agent exceeded max iterations');
}
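
The loop above parses the final message with `JSON.parse`, which fails if the model wraps its JSON in a Markdown code fence or adds a sentence around it. A minimal sketch of a more tolerant parser follows. `parseModelJson` is a hypothetical helper, and the fallback (take the outermost `{...}` span) is a heuristic that can still fail on unusual output.

```javascript
// Three backticks, built indirectly so this listing contains no literal fence.
const FENCE = '`'.repeat(3);

// Hypothetical helper: extracts a JSON object from a model's final message.
function parseModelJson(content) {
  if (typeof content !== 'string' || content.trim() === '') {
    throw new Error('Model returned no text content');
  }

  let text = content.trim();

  // Strip a surrounding code fence (with or without a language tag).
  if (text.startsWith(FENCE)) {
    const firstNewline = text.indexOf('\n');
    text = firstNewline === -1 ? '' : text.slice(firstNewline + 1);
    if (text.trimEnd().endsWith(FENCE)) {
      text = text.trimEnd().slice(0, -FENCE.length);
    }
  }

  try {
    return JSON.parse(text);
  } catch {
    // Heuristic fallback: parse the span from the first "{" to the last "}".
    const start = text.indexOf('{');
    const end = text.lastIndexOf('}');
    if (start === -1 || end <= start) {
      throw new Error('Model output did not contain a JSON object');
    }
    return JSON.parse(text.slice(start, end + 1));
  }
}
```

In the agent loop, `parseModelJson(message.content)` could stand in for the direct `JSON.parse` call.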

3. The Writer Agent: Synthesizing Information

The Writer Agent's job is to take raw facts and produce polished output. Critical rules:

  • Use ONLY the provided facts — never hallucinate additional information
  • Follow the requested format — report, article, email, bullet points, etc.
  • Cite sources — attribute facts to where they came from
  • Be clear and concise — the Writer is a writing specialist

Writer system prompt

const WRITER_SYSTEM_PROMPT = `You are a Professional Writer Agent. Your job is to synthesize research findings into polished, well-structured content.

RULES:
1. Use ONLY the facts provided in the research data below. Do NOT add facts from your training data.
2. If the research has gaps, acknowledge them honestly — do NOT fill gaps with made-up information.
3. Cite sources inline using [Source Name] format.
4. Follow the requested output format exactly.
5. Use clear, professional language appropriate for the target audience.
6. Structure content with headings, bullet points, and paragraphs for readability.
7. Include a "Sources" section at the end listing all referenced sources.

OUTPUT QUALITY:
- Executive summary at the top (2-3 sentences)
- Logical flow from high-level overview to specific details
- Key statistics highlighted
- Actionable insights or implications where appropriate`;

Writer implementation

async function runWriter(researchData, outputConfig) {
  console.log('\n=== WRITER AGENT ===');
  console.log(`Format: ${outputConfig.format}`);
  console.log(`Audience: ${outputConfig.audience}\n`);

  const writerPrompt = `Write a ${outputConfig.format} about "${researchData.topic}" for a ${outputConfig.audience} audience.

RESEARCH DATA:
${JSON.stringify(researchData, null, 2)}

REQUIREMENTS:
- Format: ${outputConfig.format}
- Length: ${outputConfig.length || 'appropriate for the content'}
- Tone: ${outputConfig.tone || 'professional'}
- Include: executive summary, key findings, statistics, implications
- End with: sources list`;

  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0.4, // Slightly creative for better prose
    messages: [
      { role: 'system', content: WRITER_SYSTEM_PROMPT },
      { role: 'user', content: writerPrompt },
    ],
  });

  const content = response.choices[0].message.content;
  console.log('  Writing complete.');
  console.log(`  Length: ${content.length} characters`);
  return content;
}
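
The "use ONLY the provided facts" rule is an instruction, not a guarantee, so it helps to check the Writer's output after the fact. The sketch below compares the inline `[Source Name]` citations in a report against the sources the Researcher recorded. The helper names are hypothetical, and matching by case-insensitive exact name is an assumption; it will not catch an uncited invented claim.

```javascript
// Collects every source name the Researcher reported (lowercased).
function collectSources(research) {
  const items = [...(research.facts ?? []), ...(research.key_statistics ?? [])];
  return new Set(
    items.map((item) => String(item.source ?? '').trim().toLowerCase()).filter(Boolean)
  );
}

// Returns the distinct [Source Name] citations in the report that do not
// match any Researcher source. Markdown links like [text](url) are skipped.
function findUnknownCitations(report, research) {
  const known = collectSources(research);
  const cited = [...report.matchAll(/\[([^\[\]]+)\](?!\()/g)].map((m) => m[1].trim());
  const unknown = cited.filter((name) => !known.has(name.toLowerCase()));
  return [...new Set(unknown)];
}
```

A non-empty result is a signal to inspect the report or re-run the Writer; an empty result only means that every citation points at a known source.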

4. Full Implementation: Research Report Generator

Here is the complete end-to-end pipeline.

import OpenAI from 'openai';

const openai = new OpenAI();

// ─────────────────────────────────────────────────────────
// RESEARCHER AGENT (simplified — uses JSON mode instead of tool calling loop)
// ─────────────────────────────────────────────────────────

async function researchTopic(topic) {
  console.log('\n=== RESEARCHER AGENT ===');
  console.log(`Researching: "${topic}"\n`);

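  // In production, this would be a full agent loop with tool calling.
  // This simplified version calls no tools, so its "facts" come from the
  // model's training data and are NOT grounded. Use it only to exercise
  // the pipeline end to end.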

  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    response_format: { type: 'json_object' },
    messages: [
      {
        role: 'system',
        content: `You are a research agent. Given a topic, produce a structured research brief.

Return JSON with this shape:
{
  "topic": "...",
  "executive_summary": "2-3 sentence overview",
  "facts": [
    { "fact": "...", "source": "...", "category": "..." }
  ],
  "key_statistics": [
    { "metric": "...", "value": "...", "source": "...", "year": "..." }
  ],
  "trends": [
    { "trend": "...", "direction": "up|down|stable", "evidence": "..." }
  ],
  "competitive_landscape": [
    { "company": "...", "position": "...", "detail": "..." }
  ],
  "gaps": ["things you could not find or verify"]
}`,
      },
      {
        role: 'user',
        content: `Research this topic thoroughly: ${topic}`,
      },
    ],
  });

  const research = JSON.parse(response.choices[0].message.content);
  console.log(`  Found ${research.facts?.length || 0} facts`);
  console.log(`  Found ${research.key_statistics?.length || 0} statistics`);
  console.log(`  Found ${research.trends?.length || 0} trends`);
  console.log(`  Gaps: ${research.gaps?.length || 0}`);

  return research;
}

// ─────────────────────────────────────────────────────────
// WRITER AGENT
// ─────────────────────────────────────────────────────────

async function writeReport(researchData, config) {
  console.log('\n=== WRITER AGENT ===');
  console.log(`Format: ${config.format}, Audience: ${config.audience}\n`);

  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0.4,
    messages: [
      {
        role: 'system',
        content: `You are a professional report writer. Synthesize the provided research data into a polished ${config.format}.

CRITICAL RULES:
- Use ONLY the facts provided in the research data. Do NOT invent additional facts.
- Cite sources using [Source] format.
- If the research notes gaps, acknowledge them.
- Structure the output with clear headings and logical flow.
- Target audience: ${config.audience}
- Tone: ${config.tone || 'professional'}`,
      },
      {
        role: 'user',
        content: `Write a ${config.format} based on this research:

${JSON.stringify(researchData, null, 2)}

Include:
1. Executive Summary
2. Market Overview
3. Key Statistics
4. Trends Analysis
5. Competitive Landscape
6. Gaps and Limitations
7. Sources`,
      },
    ],
  });

  const report = response.choices[0].message.content;
  console.log(`  Report written: ${report.length} characters`);
  return report;
}

// ─────────────────────────────────────────────────────────
// ORCHESTRATOR
// ─────────────────────────────────────────────────────────

async function generateResearchReport(topic, config = {}) {
  const defaultConfig = {
    format: 'market analysis report',
    audience: 'senior management',
    tone: 'professional and data-driven',
    ...config,
  };

  console.log('========================================');
  console.log('  RESEARCHER-WRITER PIPELINE');
  console.log('========================================');
  console.log(`Topic: ${topic}`);

  // Phase 1: Research
  const researchData = await researchTopic(topic);

  // Phase 2: Write
  const report = await writeReport(researchData, defaultConfig);

  // Phase 3: Return both for auditability
  return {
    research: researchData,
    report,
    metadata: {
      topic,
      config: defaultConfig,
      researchFactCount: researchData.facts?.length || 0,
      reportLength: report.length,
      generatedAt: new Date().toISOString(),
    },
  };
}

// Run it
const result = await generateResearchReport(
  'The state of electric vehicle adoption in 2025',
  { format: 'executive briefing', audience: 'board of directors' }
);

console.log('\n=== FINAL REPORT ===\n');
console.log(result.report);
console.log('\n=== METADATA ===');
console.log(JSON.stringify(result.metadata, null, 2));

5. Use Cases

Report generation

Topic: "Q3 financial performance"
Researcher: Pulls data from financial APIs, internal databases, market feeds
Writer: Produces executive summary, charts narrative, recommendations
Output: Board-ready financial report

Content creation

Topic: "Blog post about microservices vs monoliths"
Researcher: Searches for case studies, benchmarks, expert opinions, common pitfalls
Writer: Crafts engaging blog post with examples and balanced perspective
Output: Ready-to-publish blog post

Summarization

Input: 50 pages of legal documents
Researcher: Extracts key clauses, obligations, deadlines, risks
Writer: Produces 2-page summary organized by importance
Output: Legal brief for non-legal stakeholders

Customer research

Query: "What are customers saying about our latest release?"
Researcher: Searches reviews, social media, support tickets, NPS data
Writer: Synthesizes sentiment analysis with specific quotes and trends
Output: Customer feedback report

6. Separating System Prompts: Why It Matters

A key design decision is giving each agent a separate, focused system prompt. This is more effective than a single "do everything" prompt.

Why separate prompts work better

| Single-prompt approach | Two-prompt approach |
|---|---|
| "Search for info AND write a report" | Researcher: "Search thoroughly" / Writer: "Write clearly" |
| Model tries to search and write at the same time | Each agent focuses on its specialty |
| Research quality suffers (model rushes to write) | Research is thorough before writing begins |
| Hard to audit what facts were found vs generated | Raw research output is inspectable |
| One temperature setting for both tasks | Low temp for research, higher for writing |

Temperature strategy

// Researcher: temperature 0 for near-deterministic, repeatable extraction
const researcherConfig = { temperature: 0, model: 'gpt-4o' };

// Writer: temperature 0.3–0.5 for slightly more varied, natural prose
const writerConfig = { temperature: 0.4, model: 'gpt-4o' };

Model selection strategy

You can even use different models for each agent:

// Researcher: favor a model with a large context window and reliable tool
// calling, since it has to process many search results
const researcherModel = 'gpt-4o';

// Writer: favor whichever model produces the best prose in your own evaluations
const writerModel = 'gpt-4o'; // Same model here; swap in another (e.g. Claude) if it tests better
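
One way to keep these per-agent choices in a single place is a small config map that every request is built from. This is a sketch under assumptions: `AGENT_CONFIG` and `buildRequest` are hypothetical names, and the values simply repeat the ones used in this lesson.

```javascript
// Per-agent defaults; values repeat the ones used in this lesson.
const AGENT_CONFIG = {
  researcher: { model: 'gpt-4o', temperature: 0 },
  writer: { model: 'gpt-4o', temperature: 0.4 },
};

// Builds the request body for one agent. `overrides` lets a single call
// change the model or temperature without touching the shared defaults.
function buildRequest(agent, messages, overrides = {}) {
  if (!Object.prototype.hasOwnProperty.call(AGENT_CONFIG, agent)) {
    throw new Error(`Unknown agent: ${agent}`);
  }
  return { ...AGENT_CONFIG[agent], ...overrides, messages };
}
```

The returned object has the same fields the earlier examples pass to `openai.chat.completions.create`, so swapping the Writer's model becomes a one-line change.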

7. Handling Multi-source Research

In production, the Researcher often needs to query multiple sources and merge results. Here is a pattern for multi-source research:

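// searchWeb, searchRAG, and fetchAPIData are placeholders for your own source
// clients; each is assumed to return an array of result objects.
async function multiSourceResearch(topic) {
  // Run all searches in parallel (Promise.all rejects if any one source fails)
  const [webResults, ragResults, apiResults] = await Promise.all([
    searchWeb(topic),
    searchRAG(topic),
    fetchAPIData(topic),
  ]);

  // Merge results and tag each with its source type
  // (deduplication is left to the LLM step below)
  const allFacts = [
    ...webResults.map((r) => ({ ...r, source_type: 'web' })),
    ...ragResults.map((r) => ({ ...r, source_type: 'rag' })),
    ...apiResults.map((r) => ({ ...r, source_type: 'api' })),
  ];

  // Rank by relevance (results without a relevance score sort last)
  const rankedFacts = allFacts
    .sort((a, b) => (b.relevance || 0) - (a.relevance || 0))
    .slice(0, 30); // Top 30 most relevant facts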

  // Let the Researcher LLM organize and deduplicate
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    response_format: { type: 'json_object' },
    messages: [
      {
        role: 'system',
        content: `You are a research organizer. Given raw search results from multiple sources, 
organize them into a structured research brief. Remove duplicates. Note conflicts between sources.
Return JSON with: topic, facts[], key_statistics[], trends[], gaps[]`,
      },
      {
        role: 'user',
        content: `Organize these raw results about "${topic}":\n${JSON.stringify(rankedFacts, null, 2)}`,
      },
    ],
  });

  return JSON.parse(response.choices[0].message.content);
}
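
The function above leaves deduplication entirely to the LLM and fails as a whole if one source rejects. A deterministic pre-pass can handle the easy cases first. The sketch below is an illustration under assumptions: `normalizeFact`, `dedupeFacts`, and `gatherSources` are hypothetical helpers, results are assumed to carry their text in `fact`, `content`, or `snippet`, and exact match after normalization is a deliberately simple notion of "duplicate".

```javascript
// Normalizes text so trivially different copies compare equal.
function normalizeFact(text) {
  return String(text ?? '')
    .toLowerCase()
    .replace(/[^a-z0-9%$.]+/g, ' ') // collapse punctuation and whitespace
    .replace(/\.(?=\s|$)/g, '')     // drop sentence-ending periods, keep "17.1"
    .trim();
}

// Keeps one entry per normalized fact, preferring the higher relevance,
// and records every source_type that reported it in `seen_in`.
function dedupeFacts(facts) {
  const byKey = new Map();
  for (const f of facts) {
    const key = normalizeFact(f.fact ?? f.content ?? f.snippet);
    if (!key) continue;
    const existing = byKey.get(key);
    if (!existing) {
      byKey.set(key, { ...f, seen_in: [f.source_type] });
      continue;
    }
    if (!existing.seen_in.includes(f.source_type)) existing.seen_in.push(f.source_type);
    if ((f.relevance || 0) > (existing.relevance || 0)) {
      byKey.set(key, { ...f, seen_in: existing.seen_in });
    }
  }
  return [...byKey.values()];
}

// Runs every source function. A failing source is listed in `failed`
// instead of rejecting the whole batch.
async function gatherSources(sources) {
  const names = Object.keys(sources);
  const settled = await Promise.allSettled(
    names.map((name) => Promise.resolve().then(() => sources[name]()))
  );
  const facts = [];
  const failed = [];
  settled.forEach((outcome, i) => {
    if (outcome.status !== 'fulfilled') {
      failed.push(names[i]);
      return;
    }
    const results = Array.isArray(outcome.value) ? outcome.value : [];
    facts.push(...results.map((r) => ({ ...r, source_type: names[i] })));
  });
  return { facts: dedupeFacts(facts), failed };
}
```

A call might look like `gatherSources({ web: () => searchWeb(topic), rag: () => searchRAG(topic) })`. Near-duplicates with different wording still need the LLM pass.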

8. Validating Research Before Writing

Before passing research to the Writer, validate that the research is sufficient:

function validateResearch(research) {
  const issues = [];

  // Default missing arrays to empty so the checks below cannot throw
  const facts = research.facts ?? [];
  const statistics = research.key_statistics ?? [];
  const gaps = research.gaps ?? [];

  if (facts.length < 5) {
    issues.push('Insufficient facts (minimum 5 required)');
  }

  if (statistics.length < 2) {
    issues.push('Insufficient statistics (minimum 2 required)');
  }

  if (gaps.length > facts.length) {
    issues.push('More gaps than facts: research may be too incomplete');
  }

  // Check for source diversity (by source_type, falling back to source name)
  const sourceTypes = new Set(
    facts.map((f) => f.source_type || f.source).filter(Boolean)
  );
  if (sourceTypes.size < 2) {
    issues.push('All facts share one source or source type: low diversity');
  }

  return {
    valid: issues.length === 0,
    issues,
    factCount: facts.length,
    statisticCount: statistics.length,
    gapCount: gaps.length,
    sourceDiversity: sourceTypes.size,
  };
}

// Usage in pipeline
const research = await researchTopic(topic);
const validation = validateResearch(research);

if (!validation.valid) {
  console.log('Research insufficient:', validation.issues);
  // Option 1: Re-run researcher with more specific queries
  // Option 2: Proceed but warn the Writer about gaps
  // Option 3: Abort and ask user for more context
}
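
Option 1 above (re-run the Researcher) can be sketched as a bounded retry loop that feeds the validation issues back into the next attempt. This is an illustration, not the lesson's implementation: `researchUntilValid` is a hypothetical name, and `research` and `validate` are injected stand-ins for functions like `researchTopic` and `validateResearch` so the loop can be tested without API calls.

```javascript
// Retries research until validation passes or attempts run out.
// `research(prompt)` must return a promise of a research object;
// `validate(researchObject)` must return { valid, issues }.
async function researchUntilValid(topic, { research, validate, maxAttempts = 3 }) {
  let issues = [];
  let last = null;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // After a failed attempt, tell the Researcher what was missing.
    const prompt = issues.length === 0
      ? topic
      : `${topic}\n\nA previous attempt had these problems; address them:\n- ${issues.join('\n- ')}`;

    last = await research(prompt);
    const result = validate(last);
    if (result.valid) {
      return { research: last, attempts: attempt, issues: [] };
    }
    issues = result.issues;
  }

  // Out of attempts: return the last result with its issues so the caller
  // can warn the Writer (option 2) or abort (option 3).
  return { research: last, attempts: maxAttempts, issues };
}
```

Capping the attempts keeps cost bounded, and returning the remaining issues leaves the proceed-or-abort decision to the caller.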

9. Design Considerations

When the Researcher finds nothing

// In the Writer prompt, handle empty research gracefully
const facts = research.facts ?? []; // a missing facts array counts as empty
const writerPrompt = facts.length === 0
  ? `The research phase found no relevant information about "${topic}". 
     Write a brief note explaining that the requested analysis could not be completed 
     due to insufficient data, and suggest alternative approaches.`
  : `Write a report based on this research: ${JSON.stringify(research)}`;

Token budget management

Researcher output can be large (many facts, full-text snippets).
Before passing it to the Writer, trim it so it fits comfortably in the Writer's context window and keeps cost and latency down.

Strategy:
1. Keep the top 20 most relevant facts (by relevance score)
2. Keep all key_statistics (usually small)
3. Truncate fact text to 200 characters each
4. Total research payload: aim for < 4000 tokens (a starting budget; tune it to your model and report length)
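
The four steps above can be sketched as a single trimming function. Several details here are assumptions rather than part of the lesson: the helper names, the mapping of `high`/`medium`/`low` labels to numeric scores, and the rough estimate of about 4 characters per token (a real tokenizer would be more accurate).

```javascript
// Assumed scores for the Researcher's relevance labels; numeric relevance
// values (such as RAG similarity scores) are used as-is.
const RELEVANCE_SCORE = new Map([['high', 1], ['medium', 0.5], ['low', 0.1]]);

function relevanceScore(fact) {
  if (typeof fact.relevance === 'number') return fact.relevance;
  return RELEVANCE_SCORE.get(fact.relevance) ?? 0;
}

// Rough estimate only: assumes about 4 characters per token.
function estimateTokens(value) {
  return Math.ceil(JSON.stringify(value).length / 4);
}

function truncate(text, maxChars) {
  return text.length <= maxChars ? text : `${text.slice(0, maxChars - 1)}…`;
}

// Applies the strategy: top N facts, truncated text, all key_statistics
// kept, then drops the least relevant facts until the estimate fits.
function trimResearch(research, { maxFacts = 20, maxFactChars = 200, maxTokens = 4000 } = {}) {
  let facts = [...(research.facts ?? [])]
    .sort((a, b) => relevanceScore(b) - relevanceScore(a))
    .slice(0, maxFacts)
    .map((f) => ({ ...f, fact: truncate(String(f.fact ?? ''), maxFactChars) }));

  let trimmed = { ...research, facts };
  while (facts.length > 0 && estimateTokens(trimmed) > maxTokens) {
    facts = facts.slice(0, -1); // remove the current least relevant fact
    trimmed = { ...research, facts };
  }
  return trimmed;
}
```

The Writer would then receive `trimResearch(research)` instead of the raw research object. The input is not mutated, so the full research stays available for auditing.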

Combining with other patterns

The Researcher-Writer pattern often combines with:

  • Planner-Executor — the Planner includes "research" and "write" as two steps
  • Critic-Refiner — the Writer's output goes through a Critic-Refiner loop for quality improvement
  • Router — a Router agent decides whether to use Researcher-Writer, direct answer, or another pattern based on the query

10. Key Takeaways

  1. Separate gathering from synthesizing — the Researcher focuses on thoroughness and accuracy; the Writer focuses on clarity and structure. Each has its own optimized system prompt and temperature.
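  2. Grounding reduces hallucination: the Writer is instructed to work ONLY with Researcher-provided facts. The critical instruction is "Do NOT add information from your training data," and because instructions are not guarantees, the Writer's output should still be checked against the research.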
  3. Structured research output enables auditability — raw facts as JSON let you inspect exactly what the Writer was given, debug quality issues, and verify source attribution.
  4. Multi-source research improves quality — running parallel searches across web, RAG, and APIs, then deduplicating and ranking, produces more comprehensive research.
  5. Validate research before writing — check minimum fact count, source diversity, and gap ratio before invoking the Writer to avoid generating a report from insufficient data.
  6. Temperature differs by agent — 0 for the Researcher (factual precision), 0.3-0.5 for the Writer (readable prose).

Explain-It Challenge

  1. A product manager says "just use one prompt — tell GPT to research and write the report." Explain why splitting into Researcher and Writer produces better results.
  2. Your Researcher returns 50 facts but they are all from one website. Why is this a problem, and how do you fix it?
  3. The Writer's report includes a statistic that was NOT in the Researcher's output. What went wrong, and how do you prevent this?
