Episode 4 — Generative AI Engineering / 4.15 — Understanding AI Agents
4.15.e — Multi-Agent Complexity
In one sentence: Multi-agent systems use multiple specialized agents that collaborate on a task, but complexity grows exponentially with each agent you add — communication overhead, error propagation, state management, and debugging difficulty all compound, so multi-agent is only justified when a single agent genuinely cannot handle the workload.
Navigation: <- 4.15.d When NOT to Use Agents | 4.15 Exercise Questions ->
1. What Is a Multi-Agent System?
A multi-agent system uses two or more agents, each with a specialized role, collaborating to accomplish a task that would be too complex or broad for a single agent.
┌──────────────────────────────────────────────────────────────────────────┐
│ SINGLE AGENT vs MULTI-AGENT │
│ │
│ SINGLE AGENT: │
│ │
│ ┌───────────────────────────────────────┐ │
│ │ ONE AGENT │ │
│ │ - Handles ALL reasoning │ │
│ │ - Uses ALL tools │ │
│ │ - Manages ALL state │ │
│ │ - One system prompt │ │
│ └───────────────────────────────────────┘ │
│ │
│ MULTI-AGENT: │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ RESEARCHER │ │ ANALYST │ │ WRITER │ │
│ │ Agent │◄─►│ Agent │◄─►│ Agent │ │
│ │ │ │ │ │ │ │
│ │ Tools: │ │ Tools: │ │ Tools: │ │
│ │ - web search│ │ - calculator│ │ - formatter│ │
│ │ - read URL │ │ - database │ │ - save file│ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │ │ │ │
│ └──────────────────┼──────────────────┘ │
│ │ │
│ ┌────────────────┐ │
│ │ ORCHESTRATOR │ (optional coordinator) │
│ │ assigns tasks │ │
│ │ to agents │ │
│ └────────────────┘ │
│ │
│ Each agent has its own system prompt, tools, and area of expertise. │
└──────────────────────────────────────────────────────────────────────────┘
Why multiple agents?
The core idea is specialization: just as a company has departments (sales, engineering, legal), a multi-agent system has specialized agents that are each expert in their domain.
| Concept | Single Agent | Multi-Agent |
|---|---|---|
| Analogy | One person doing everything | A team of specialists |
| System prompt | One (potentially very long) prompt covering all capabilities | Multiple focused prompts, one per agent |
| Tools | All tools registered with one agent | Each agent has only the tools it needs |
| Context window | Must fit everything | Each agent manages its own context |
| Expertise | Generalist | Specialist per agent |
2. Why Complexity Grows Exponentially
This is the central lesson of this section. Adding agents does not increase complexity linearly: communication paths grow quadratically (n(n-1)/2), and because every step and every handoff must succeed, the compound probability of failure grows exponentially with the length of the agent chain.
Communication paths
┌──────────────────────────────────────────────────────────────────────┐
│ COMMUNICATION COMPLEXITY │
│ │
│ Agents Communication Paths Formula │
│ ───── ──────────────────── ──────── │
│ 1 agent 0 paths (talks to self) n(n-1)/2 │
│ 2 agents 1 path (A <-> B) 2(1)/2 = 1 │
│ 3 agents 3 paths (A<->B, A<->C, B<->C) 3(2)/2 = 3 │
│ 4 agents 6 paths 4(3)/2 = 6 │
│ 5 agents 10 paths 5(4)/2 = 10 │
│ 10 agents 45 paths 10(9)/2 = 45 │
│ │
│   Visual (3 agents):              Visual (5 agents):                 │
│                                                                      │
│        A                          every agent connects to every      │
│       / \                         other agent: 5(4)/2 = 10 paths,    │
│      B ─ C                        too tangled to draw clearly        │
│                                   here                               │
│                                                                      │
│     3 paths                       10 paths                           │
│ │
│ Each path is a potential source of: │
│ - Miscommunication (lossy information transfer) │
│ - Latency (waiting for another agent) │
│ - Error propagation (bad data flowing between agents) │
└──────────────────────────────────────────────────────────────────────┘
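The path counts in the box above can be reproduced with a one-line helper (a minimal sketch; the function name is ours):

```javascript
// Pairwise communication paths between n agents: n(n-1)/2
// (each unordered pair of agents is one potential channel)
function communicationPaths(n) {
  return (n * (n - 1)) / 2;
}

console.log(communicationPaths(2));  // 1
console.log(communicationPaths(3));  // 3
console.log(communicationPaths(5));  // 10
console.log(communicationPaths(10)); // 45
```

Note that the growth is quadratic: doubling the team roughly quadruples the number of channels to keep reliable.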
The multiplication of failure modes
Single agent, 5-step task:
P(all steps succeed) = 0.95^5 = 0.774 (77.4% success)
Multi-agent (3 agents), 5 steps each, plus 3 handoffs:
P(agent1 succeeds) = 0.95^5 = 0.774
P(agent2 succeeds) = 0.95^5 = 0.774
P(agent3 succeeds) = 0.95^5 = 0.774
P(all handoffs succeed) = 0.95^3 = 0.857
P(everything works) = 0.774 * 0.774 * 0.774 * 0.857 = 0.397
39.7% overall success rate!
Compare: 77.4% (single agent) vs 39.7% (multi-agent)
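The compounding arithmetic above generalizes to any chain length. A minimal sketch (the helper name is ours):

```javascript
// Every per-agent step and every handoff must succeed independently,
// so overall success is p raised to the total number of chances to fail.
function chainSuccess(p, stepsPerAgent, numAgents, numHandoffs) {
  const totalChances = stepsPerAgent * numAgents + numHandoffs;
  return Math.pow(p, totalChances);
}

// Single agent: 5 steps, no handoffs
console.log(chainSuccess(0.95, 5, 1, 0).toFixed(3)); // "0.774"

// Three agents with 5 steps each, plus 3 handoffs: 0.95^18
console.log(chainSuccess(0.95, 5, 3, 3).toFixed(3)); // "0.397"
```

The lesson generalizes: each agent or handoff you add multiplies another factor below 1.0 into the overall success probability.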
3. Communication Overhead
When agents communicate, information is compressed into summaries, and summaries are lossy. Each handoff between agents is a potential point of information loss or distortion.
// Multi-agent communication: the "telephone game" problem
// Agent 1 (Researcher) produces detailed findings
const researcherOutput = {
findings: [
{ source: "Report A", data: "Revenue grew 12.3% YoY to $4.2B", confidence: 0.95 },
{ source: "Report B", data: "Operating margin declined 2.1pp to 18.7%", confidence: 0.88 },
{ source: "Report C", data: "New product line launched Q3, expected $500M revenue", confidence: 0.72 },
],
notes: "Report C's estimate varies significantly across sources (range: $350M-$650M)",
};
// Handoff to Agent 2 (Analyst)
// The analyst receives a TEXT summary, not the structured data
// Some nuance is lost:
// - The confidence scores may be dropped
// - The note about estimate variance may be summarized away
// - Numbers may be rounded differently
// Agent 2 (Analyst) produces an analysis
const analystOutput = "Revenue grew 12% to $4.2B. Margin declined. New product expected at $500M.";
// Notice: confidence levels are gone, the nuance about the $500M estimate is gone
// Handoff to Agent 3 (Writer)
// Even more information is lost. The writer sees "expected at $500M"
// and treats it as a firm number, not a rough estimate with wide variance.
// FINAL OUTPUT (from Writer Agent):
// "The company expects $500M from its new product line..."
// This sounds definitive, but the original data had a 72% confidence
// and a $350M-$650M range. The multi-agent chain lost the nuance.
Information loss at each handoff
| Handoff | What Gets Lost |
|---|---|
| Agent A -> Agent B | Nuance, confidence levels, edge cases, caveats |
| Agent B -> Agent C | Even more context, plus Agent B's reasoning process |
| Agent C -> Agent D | Original source details, methodology, uncertainty ranges |
The more agents, the more lossy the information chain becomes. This is analogous to the "telephone game" where a message degrades as it passes through more people.
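One way to slow this degradation is to pass validated, structured data across agent boundaries instead of free-text summaries, so confidence scores and caveats cannot be silently summarized away. A minimal sketch; the field names and helper are illustrative, not a standard API:

```javascript
// Structured handoff: reject any finding that arrives without its
// source or confidence score, instead of letting them be dropped.
function makeHandoff(findings, caveats) {
  for (const f of findings) {
    if (typeof f.confidence !== "number" || !f.source) {
      throw new Error("Handoff rejected: finding missing confidence or source");
    }
  }
  return { findings, caveats, handoffVersion: 1 };
}

const handoff = makeHandoff(
  [{ source: "Report C", data: "New product line, expected $500M revenue", confidence: 0.72 }],
  ["Estimate varies across sources: $350M-$650M"]
);

// Downstream agents receive the confidence and caveats intact.
console.log(handoff.findings[0].confidence); // 0.72
```

Structured handoffs do not eliminate loss (the downstream agent can still ignore the caveats), but they make the loss detectable rather than invisible.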
4. Error Propagation Across Agents
When one agent makes an error, the downstream agents do not just inherit the error — they build on it, potentially amplifying it.
┌──────────────────────────────────────────────────────────────────┐
│ ERROR PROPAGATION │
│ │
│ Agent 1 (Researcher): │
│ Searches web -> Finds article with wrong date │
│ Output: "Product launched March 2024" │
│ (Actual: March 2023 — agent picked wrong article) │
│ │
│ │ error flows downstream │
│ ▼ │
│ │
│ Agent 2 (Analyst): │
│ Receives "launched March 2024" │
│ Calculates: "Product is 1 month old" (correct math, │
│ but based on wrong input) │
│ Concludes: "Too early to assess market impact" │
│ │
│ │ error compounds │
│ ▼ │
│ │
│ Agent 3 (Writer): │
│ Writes: "The recently launched product (March 2024) is too │
│ new to evaluate. We recommend waiting 6 months." │
│ │
│ ACTUAL REALITY: The product has been on the market for │
│ 13 months and has substantial market data available. │
│ │
│ The initial small error (1 year off) compounded into a │
│ completely wrong strategic recommendation. │
└──────────────────────────────────────────────────────────────────┘
Why error propagation is worse in multi-agent systems
In a single agent, the same LLM that made the error might catch it in a later iteration (because it has the full context). In a multi-agent system, Agent 2 does not know Agent 1's raw sources — it only sees Agent 1's summary. It has no way to catch the error.
// Mitigation: Verification agent
// Add an agent whose ONLY job is to fact-check other agents' outputs
const verificationAgent = new Agent({
systemPrompt: `You are a fact-checking agent.
Given a claim and its source, verify the claim by independently searching for the same information.
Flag any discrepancies.
Output JSON:
{
"claim": "the claim being checked",
"verified": true/false,
"discrepancies": ["list of issues found"],
"confidence": 0-1
}`,
tools: [webSearchTool],
maxIterations: 5,
});
// But note: this adds another agent, more cost, more latency,
// and the verification agent can ALSO make errors!
5. State Management Challenges
In a single agent, all state lives in one message array. In a multi-agent system, state is distributed across multiple agents, creating synchronization challenges.
// Single agent: state is straightforward
const singleAgentState = {
messages: [], // One message array, one source of truth
};
// Multi-agent: state is distributed
const multiAgentState = {
orchestrator: {
taskPlan: [],
agentAssignments: {},
completedSteps: [],
},
researcher: {
messages: [],
findings: [],
searchesPerformed: [],
},
analyst: {
messages: [],
dataSources: [],
calculations: [],
},
writer: {
messages: [],
drafts: [],
revisions: [],
},
shared: {
// What state is shared between agents?
// How do you keep it in sync?
// What happens if two agents update it simultaneously?
taskContext: {},
intermediateResults: {},
},
};
State management problems
| Problem | Description | Consequence |
|---|---|---|
| Stale state | Agent B reads shared state before Agent A finishes writing | Agent B works with incomplete data |
| Conflicting updates | Agent A and Agent B both update the same shared state | One agent's work is overwritten |
| Context drift | Each agent's understanding of the task diverges over time | Agents work at cross purposes |
| Lost context | Information from Agent A's reasoning is not passed to Agent B | Agent B repeats work or makes errors |
| Exploding state size | Every agent's output is potentially input for every other agent | Memory/token limits hit faster |
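Stale reads and conflicting updates are classic distributed-systems problems, and classic remedies apply. A minimal sketch of optimistic concurrency control for shared agent state; this is an illustration of the idea, not a production store:

```javascript
// Each write must name the version it read, so a conflicting update
// fails loudly instead of silently overwriting another agent's work.
class SharedState {
  constructor() {
    this.version = 0;
    this.data = {};
  }
  read() {
    // Snapshot: the version tag travels with the data
    return { version: this.version, data: { ...this.data } };
  }
  write(expectedVersion, updates) {
    if (expectedVersion !== this.version) {
      throw new Error(`Stale write: expected v${expectedVersion}, state is at v${this.version}`);
    }
    Object.assign(this.data, updates);
    this.version += 1;
  }
}

const state = new SharedState();
const snapA = state.read(); // Agent A reads v0
const snapB = state.read(); // Agent B reads v0
state.write(snapA.version, { findings: "A's results" }); // ok, state is now v1

// Agent B's write is rejected instead of clobbering A's update:
try {
  state.write(snapB.version, { findings: "B's results" });
} catch (e) {
  console.log(e.message); // Stale write: expected v0, state is at v1
}
```

The rejected agent must then re-read and reconcile, which is extra work, but it turns a silent-overwrite bug into an explicit, handleable failure.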
6. Cost and Latency Amplification
Multi-agent systems amplify the already-expensive cost of agents:
┌──────────────────────────────────────────────────────────────────┐
│ COST AND LATENCY COMPARISON │
│ │
│ Single LLM call: │
│ Tokens: 2,000 │
│ Cost: $0.005 │
│ Latency: 2 seconds │
│ │
│ Single agent (5 iterations): │
│ Tokens: 25,000 │
│ Cost: $0.06 │
│ Latency: 15 seconds │
│ │
│ Multi-agent (3 agents, 5 iterations each): │
│ Agent 1: 25,000 tokens │
│ Agent 2: 30,000 tokens (receives Agent 1's output) │
│ Agent 3: 35,000 tokens (receives Agents 1+2 output) │
│ Orchestrator: 10,000 tokens (coordination overhead) │
│ Total: 100,000 tokens │
│ Cost: $0.25 │
│ Latency: 45 seconds (if sequential) │
│ 20 seconds (if agents run in parallel where possible)│
│ │
│ That is 50x more expensive and 10-20x slower than a single call.│
└──────────────────────────────────────────────────────────────────┘
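The arithmetic in the box is easy to parameterize. A sketch, assuming a hypothetical flat rate of $2.50 per million tokens (real pricing varies by model and prices input and output tokens differently):

```javascript
// Hypothetical flat rate; an assumption for illustration only.
const PRICE_PER_MILLION_TOKENS = 2.5;

function estimateCost(tokensPerAgent) {
  const totalTokens = tokensPerAgent.reduce((sum, t) => sum + t, 0);
  return {
    totalTokens,
    cost: (totalTokens / 1_000_000) * PRICE_PER_MILLION_TOKENS,
  };
}

// Single agent, 5 iterations: 25k tokens, cost ≈ $0.06
console.log(estimateCost([25_000]));

// Three agents plus orchestrator, matching the figures above: ≈ $0.25
console.log(estimateCost([25_000, 30_000, 35_000, 10_000]));
```

Note that each downstream agent re-reads its predecessors' output, which is why the per-agent token counts grow along the chain.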
Parallelism can help with latency (but not cost)
// Sequential multi-agent: slowest
async function sequentialMultiAgent(task) {
const research = await researcherAgent.run(task); // 15 seconds
const analysis = await analystAgent.run(research); // 15 seconds
const report = await writerAgent.run(analysis); // 15 seconds
return report; // Total: 45 seconds
}
// Parallel where possible: faster but same cost
async function parallelMultiAgent(task) {
// Phase 1: Independent research (can be parallel)
const [marketResearch, techResearch, competitorResearch] = await Promise.all([
marketResearcherAgent.run(task), // 15 seconds
techResearcherAgent.run(task), // 15 seconds
competitorResearcherAgent.run(task), // 15 seconds
]);
// Phase 1 total: 15 seconds (parallel)
// Phase 2: Analysis (depends on Phase 1)
const analysis = await analystAgent.run({
market: marketResearch,
tech: techResearch,
competitor: competitorResearch,
}); // 15 seconds
// Phase 3: Writing (depends on Phase 2)
const report = await writerAgent.run(analysis); // 15 seconds
return report; // Total: 15 + 15 + 15 = 45 seconds (vs ~75 if all five runs were sequential)
// But token cost is IDENTICAL: parallelism reduces latency, not cost
}
7. When Multi-Agent IS Justified
Despite all the above warnings, there are legitimate cases for multi-agent systems:
Legitimate use case 1: Genuinely different expertise domains
When a task requires knowledge and tools from domains so different that one system prompt cannot cover them all effectively.
// Example: Comprehensive product launch analysis
// Each agent needs DIFFERENT tools and DIFFERENT expertise
const legalAgent = new Agent({
systemPrompt: "You are a legal compliance expert. Check for regulatory issues.",
tools: [regulatoryDatabaseTool, complianceCheckerTool],
});
const marketingAgent = new Agent({
systemPrompt: "You are a marketing strategist. Analyze market positioning.",
tools: [socialMediaAnalyticsTool, competitorTrackingTool],
});
const engineeringAgent = new Agent({
systemPrompt: "You are a technical architect. Assess technical feasibility.",
tools: [codeAnalysisTool, performanceBenchmarkTool],
});
// These are genuinely different domains with different tools.
// A single agent with all these tools would have a system prompt
// so long that the LLM would struggle to use it effectively.
Legitimate use case 2: Adversarial quality control
When you want one agent to check another agent's work (like peer review).
// The "writer + editor" pattern
const writerAgent = new Agent({
systemPrompt: "Write a detailed technical report on the given topic.",
tools: [searchTool, calculatorTool],
});
const editorAgent = new Agent({
systemPrompt: `Review the following report for:
1. Factual accuracy (verify key claims)
2. Logical consistency
3. Completeness (are important aspects missing?)
4. Clarity of writing
Return a critique with specific issues and suggestions.`,
tools: [searchTool], // Can independently verify claims
});
async function writeAndReview(topic) {
const draft = await writerAgent.run(`Write a report on: ${topic}`);
const critique = await editorAgent.run(`Review this report:\n\n${draft}`);
if (critique.includes("MAJOR ISSUES")) {
const revision = await writerAgent.run(
`Revise this report based on the critique:\n\nOriginal: ${draft}\n\nCritique: ${critique}`
);
return revision;
}
return draft;
}
Legitimate use case 3: Scale beyond one agent's capability
When the task involves so many sub-tasks that one agent would exceed context limits or take too many iterations.
The decision checklist for multi-agent
┌──────────────────────────────────────────────────────────────────┐
│ DO YOU NEED MULTI-AGENT? │
│ │
│ [ ] A single agent cannot handle this because: │
│ [ ] Too many tools (>15) for one system prompt │
│ [ ] Domain expertise is too diverse │
│ [ ] Context window overflow (too much state) │
│ [ ] Quality improves with adversarial review │
│ │
│ [ ] You have accepted the costs: │
│ [ ] 50x+ more expensive than a single call │
│ [ ] 30-300 seconds latency │
│       [ ] ~60% compound failure rate (needs mitigation)              │
│ [ ] Significant debugging complexity │
│ [ ] Engineering time to build and maintain │
│ │
│ [ ] You have mitigation strategies for: │
│ [ ] Error propagation (verification steps) │
│ [ ] State management (shared state protocol) │
│ [ ] Information loss at handoffs (structured handoffs) │
│ [ ] Cost control (budgets per agent) │
│ │
│ If any box is unchecked, reconsider. │
└──────────────────────────────────────────────────────────────────┘
8. When Multi-Agent Is Overkill
Most tasks that people think need multi-agent can be handled by a single agent with a good system prompt and the right tools.
| "Multi-Agent" Proposal | Why It's Overkill | Better Approach |
|---|---|---|
| "A researcher agent and a summarizer agent" | Summarization is one LLM call, not an agent | Single agent that researches, then summarizes |
| "A planner agent and an executor agent" | Plan-then-execute is one agent with two phases | Single agent with explicit planning step |
| "A coder agent and a tester agent" | Testing is a tool call, not an agent | Single agent with a code-execution tool |
| "One agent per API we call" | Tools already separate API calls | Single agent with multiple tools |
| "A manager agent that delegates to worker agents" | The "manager" is just a system prompt | Single agent with a good system prompt |
// OVER-ENGINEERED: 3 agents for a simple task
const researchAgent = new Agent({ /* ... */ });
const summaryAgent = new Agent({ /* ... */ });
const formatterAgent = new Agent({ /* ... */ });
async function overEngineered(topic) {
const research = await researchAgent.run(topic);
const summary = await summaryAgent.run(research);
const formatted = await formatterAgent.run(summary);
return formatted;
}
// BETTER: 1 agent that does all three
const singleAgent = new Agent({
systemPrompt: `You are a research assistant.
1. Research the given topic using web search
2. Synthesize findings into a clear summary
3. Format as a well-structured report with headings and bullet points`,
tools: [searchTool, readUrlTool],
maxIterations: 10,
});
async function simple(topic) {
return await singleAgent.run(topic);
}
9. Preview: Building Multi-Agent Systems (Sections 4.18-4.19)
This section has focused on understanding why multi-agent systems are complex and when they are justified. In later sections, you will learn how to build them properly:
| Section | What You'll Learn |
|---|---|
| 4.18 — Building a Simple Multi-Agent Workflow | Hands-on: build a 2-agent system (researcher + writer) with structured handoffs, error handling, and quality checks |
| 4.19 — Multi-Agent Architecture Concerns | Deep dive into production concerns: state management, error recovery, cost control, monitoring, and scaling multi-agent systems |
Important: Do not skip to 4.18 without fully understanding sections 4.15-4.17. The architectural patterns in 4.16 and the practical tools in 4.17 are prerequisites for building reliable multi-agent systems.
10. Key Takeaways
- Multi-agent systems use multiple specialized agents collaborating on a task. Each agent has its own system prompt, tools, and area of expertise.
- Complexity grows exponentially with each agent. Communication paths increase as n(n-1)/2. Error rates compound multiplicatively. Cost and latency multiply.
- Communication overhead is real: information is lost at every handoff between agents. Like the telephone game, nuance, confidence levels, and caveats degrade.
- Error propagation is the most dangerous problem. One agent's mistake becomes the foundation for downstream agents' reasoning, potentially leading to catastrophically wrong outputs.
- State management is hard in distributed agent systems. Stale state, conflicting updates, and context drift are common problems.
- Multi-agent is justified when: domains are genuinely different, adversarial review improves quality, or a single agent cannot handle the scale.
- Multi-agent is overkill when: a single agent with multiple tools solves the problem, or when the "agents" are really just prompt chain steps disguised as agents.
- Always start with a single agent. Add more agents only when you have evidence that the single agent approach is insufficient. Complexity is a one-way door — easy to add, hard to remove.
Explain-It Challenge
- Draw the communication paths for a 4-agent system. Calculate the total paths. Now imagine each path has a 5% chance of miscommunication. What is the probability that at least one miscommunication occurs?
- A colleague proposes a multi-agent system with: a "planner agent," a "researcher agent," a "calculator agent," and a "writer agent." Argue that this should be a single agent and explain how to consolidate it.
- Describe a scenario where multi-agent is genuinely the right approach. Explain why a single agent would fail and how the multi-agent design addresses the specific limitation.
Navigation: <- 4.15.d When NOT to Use Agents | 4.15 Exercise Questions ->