Episode 4 — Generative AI Engineering / 4.15 — Understanding AI Agents
4.15.e — Multi-Agent Complexity
In one sentence: Multi-agent systems use multiple specialized agents that collaborate on a task, but complexity grows exponentially with each agent you add — communication overhead, error propagation, state management, and debugging difficulty all compound, so multi-agent is only justified when a single agent genuinely cannot handle the workload.
Navigation: <- 4.15.d When NOT to Use Agents | 4.15 Exercise Questions ->
1. What Is a Multi-Agent System?
A multi-agent system uses two or more agents, each with a specialized role, collaborating to accomplish a task that would be too complex or broad for a single agent.
┌──────────────────────────────────────────────────────────────────────────┐
│ SINGLE AGENT vs MULTI-AGENT │
│ │
│ SINGLE AGENT: │
│ │
│ ┌───────────────────────────────────────┐ │
│ │ ONE AGENT │ │
│ │ - Handles ALL reasoning │ │
│ │ - Uses ALL tools │ │
│ │ - Manages ALL state │ │
│ │ - One system prompt │ │
│ └───────────────────────────────────────┘ │
│ │
│ MULTI-AGENT: │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ RESEARCHER │ │ ANALYST │ │ WRITER │ │
│ │ Agent │◄─►│ Agent │◄─►│ Agent │ │
│ │ │ │ │ │ │ │
│ │ Tools: │ │ Tools: │ │ Tools: │ │
│ │ - web search│ │ - calculator│ │ - formatter│ │
│ │ - read URL │ │ - database │ │ - save file│ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │ │ │ │
│ └──────────────────┼──────────────────┘ │
│ │ │
│ ┌────────────────┐ │
│ │ ORCHESTRATOR │ (optional coordinator) │
│ │ assigns tasks │ │
│ │ to agents │ │
│ └────────────────┘ │
│ │
│ Each agent has its own system prompt, tools, and area of expertise. │
└──────────────────────────────────────────────────────────────────────────┘
Why multiple agents?
The core idea is specialization: just as a company has departments (sales, engineering, legal), a multi-agent system has specialized agents that are each expert in their domain.
| Concept | Single Agent | Multi-Agent |
|---|---|---|
| Analogy | One person doing everything | A team of specialists |
| System prompt | One (potentially very long) prompt covering all capabilities | Multiple focused prompts, one per agent |
| Tools | All tools registered with one agent | Each agent has only the tools it needs |
| Context window | Must fit everything | Each agent manages its own context |
| Expertise | Generalist | Specialist per agent |
2. Why Complexity Grows Exponentially
This is the central lesson of this section. Adding agents does not increase complexity linearly: communication paths grow quadratically (n(n-1)/2), and because every step and every handoff must succeed, the compound probability of failure grows exponentially with the length of the agent chain.
Communication paths
┌──────────────────────────────────────────────────────────────────────┐
│ COMMUNICATION COMPLEXITY │
│ │
│ Agents Communication Paths Formula │
│ ───── ──────────────────── ──────── │
│ 1 agent 0 paths (talks to self) n(n-1)/2 │
│ 2 agents 1 path (A <-> B) 2(1)/2 = 1 │
│ 3 agents 3 paths (A<->B, A<->C, B<->C) 3(2)/2 = 3 │
│ 4 agents 6 paths 4(3)/2 = 6 │
│ 5 agents 10 paths 5(4)/2 = 10 │
│ 10 agents 45 paths 10(9)/2 = 45 │
│ │
│   Visual (3 agents):              Visual (5 agents):                 │
│                                                                      │
│        A                          every agent connects to every      │
│       / \                         other agent: 5(4)/2 = 10 paths,    │
│      B ─ C                        too tangled to draw clearly        │
│                                   here                               │
│                                                                      │
│     3 paths                       10 paths                           │
│ │
│ Each path is a potential source of: │
│ - Miscommunication (lossy information transfer) │
│ - Latency (waiting for another agent) │
│ - Error propagation (bad data flowing between agents) │
└──────────────────────────────────────────────────────────────────────┘
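The path counts in the box above can be reproduced with a one-line helper (a minimal sketch; the function name is ours):

```javascript
// Pairwise communication paths between n agents: n(n-1)/2
// (each unordered pair of agents is one potential channel)
function communicationPaths(n) {
  return (n * (n - 1)) / 2;
}

console.log(communicationPaths(2));  // 1
console.log(communicationPaths(3));  // 3
console.log(communicationPaths(5));  // 10
console.log(communicationPaths(10)); // 45
```

Note that the growth is quadratic: doubling the team roughly quadruples the number of channels to keep reliable.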
The multiplication of failure modes
Single agent, 5-step task:
P(all steps succeed) = 0.95^5 = 0.774 (77.4% success)
Multi-agent (3 agents), 5 steps each, plus 3 handoffs:
P(agent1 succeeds) = 0.95^5 = 0.774
P(agent2 succeeds) = 0.95^5 = 0.774
P(agent3 succeeds) = 0.95^5 = 0.774
P(all handoffs succeed) = 0.95^3 = 0.857
P(everything works) = 0.774 * 0.774 * 0.774 * 0.857 = 0.397
39.7% overall success rate!
Compare: 77.4% (single agent) vs 39.7% (multi-agent)
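The compounding arithmetic above generalizes to any chain length. A minimal sketch (the helper name is ours):

```javascript
// Every per-agent step and every handoff must succeed independently,
// so overall success is p raised to the total number of chances to fail.
function chainSuccess(p, stepsPerAgent, numAgents, numHandoffs) {
  const totalChances = stepsPerAgent * numAgents + numHandoffs;
  return Math.pow(p, totalChances);
}

// Single agent: 5 steps, no handoffs
console.log(chainSuccess(0.95, 5, 1, 0).toFixed(3)); // "0.774"

// Three agents with 5 steps each, plus 3 handoffs: 0.95^18
console.log(chainSuccess(0.95, 5, 3, 3).toFixed(3)); // "0.397"
```

The lesson generalizes: each agent or handoff you add multiplies another factor below 1.0 into the overall success probability.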
3. Communication Overhead
When agents communicate, information is compressed into summaries, and summaries are lossy. Each handoff between agents is a potential point of information loss or distortion.
// Multi-agent communication: the "telephone game" problem
// Agent 1 (Researcher) produces detailed findings
const researcherOutput = {
findings: [
{ source: "Report A", data: "Revenue grew 12.3% YoY to $4.2B", confidence: 0.95 },
{ source: "Report B", data: "Operating margin declined 2.1pp to 18.7%", confidence: 0.88 },
{ source: "Report C", data: "New product line launched Q3, expected $500M revenue", confidence: 0.72 },
],
notes: "Report C's estimate varies significantly across sources (range: $350M-$650M)",
};
// Handoff to Agent 2 (Analyst)
// The analyst receives a TEXT summary, not the structured data
// Some nuance is lost:
// - The confidence scores may be dropped
// - The note about estimate variance may be summarized away
// - Numbers may be rounded differently
// Agent 2 (Analyst) produces an analysis
const analystOutput = "Revenue grew 12% to $4.2B. Margin declined. New product expected at $500M.";
// Notice: confidence levels are gone, the nuance about the $500M estimate is gone
// Handoff to Agent 3 (Writer)
// Even more information is lost. The writer sees "expected at $500M"
// and treats it as a firm number, not a rough estimate with wide variance.
// FINAL OUTPUT (from Writer Agent):
// "The company expects $500M from its new product line..."
// This sounds definitive, but the original data had a 72% confidence
// and a $350M-$650M range. The multi-agent chain lost the nuance.
Information loss at each handoff
| Handoff | What Gets Lost |
|---|---|
| Agent A -> Agent B | Nuance, confidence levels, edge cases, caveats |
| Agent B -> Agent C | Even more context, plus Agent B's reasoning process |
| Agent C -> Agent D | Original source details, methodology, uncertainty ranges |
The more agents, the more lossy the information chain becomes. This is analogous to the "telephone game" where a message degrades as it passes through more people.
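One way to slow this degradation is to pass validated, structured data across agent boundaries instead of free-text summaries, so confidence scores and caveats cannot be silently summarized away. A minimal sketch; the field names and helper are illustrative, not a standard API:

```javascript
// Structured handoff: reject any finding that arrives without its
// source or confidence score, instead of letting them be dropped.
function makeHandoff(findings, caveats) {
  for (const f of findings) {
    if (typeof f.confidence !== "number" || !f.source) {
      throw new Error("Handoff rejected: finding missing confidence or source");
    }
  }
  return { findings, caveats, handoffVersion: 1 };
}

const handoff = makeHandoff(
  [{ source: "Report C", data: "New product line, expected $500M revenue", confidence: 0.72 }],
  ["Estimate varies across sources: $350M-$650M"]
);

// Downstream agents receive the confidence and caveats intact.
console.log(handoff.findings[0].confidence); // 0.72
```

Structured handoffs do not eliminate loss (the downstream agent can still ignore the caveats), but they make the loss detectable rather than invisible.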
4. Error Propagation Across Agents
When one agent makes an error, the downstream agents do not just inherit the error — they build on it, potentially amplifying it.
┌──────────────────────────────────────────────────────────────────┐
│ ERROR PROPAGATION │
│ │
│ Agent 1 (Researcher): │
│ Searches web -> Finds article with wrong date │
│ Output: "Product launched March 2024" │
│ (Actual: March 2023 — agent picked wrong article) │
│ │
│ │ error flows downstream │
│ ▼ │
│ │
│ Agent 2 (Analyst): │
│ Receives "launched March 2024" │
│ Calculates: "Product is 1 month old" (correct math, │
│ but based on wrong input) │
│ Concludes: "Too early to assess market impact" │
│ │
│ │ error compounds │
│ ▼ │
│ │
│ Agent 3 (Writer): │
│ Writes: "The recently launched product (March 2024) is too │
│ new to evaluate. We recommend waiting 6 months." │
│ │
│ ACTUAL REALITY: The product has been on the market for │
│ 13 months and has substantial market data available. │
│ │
│ The initial small error (1 year off) compounded into a │
│ completely wrong strategic recommendation. │
└──────────────────────────────────────────────────────────────────┘
Why error propagation is worse in multi-agent systems
In a single agent, the same LLM that made the error might catch it in a later iteration (because it has the full context). In a multi-agent system, Agent 2 does not know Agent 1's raw sources — it only sees Agent 1's summary. It has no way to catch the error.
// Mitigation: Verification agent
// Add an agent whose ONLY job is to fact-check other agents' outputs
const verificationAgent = new Agent({
systemPrompt: `You are a fact-checking agent.
Given a claim and its source, verify the claim by independently searching for the same information.
Flag any discrepancies.
Output JSON:
{
"claim": "the claim being checked",
"verified": true/false,
"discrepancies": ["list of issues found"],
"confidence": 0-1
}`,
tools: [webSearchTool],
maxIterations: 5,
});
// But note: this adds another agent, more cost, more latency,
// and the verification agent can ALSO make errors!
5. State Management Challenges
In a single agent, all state lives in one message array. In a multi-agent system, state is distributed across multiple agents, creating synchronization challenges.
// Single agent: state is straightforward
const singleAgentState = {
messages: [], // One message array, one source of truth
};
// Multi-agent: state is distributed
const multiAgentState = {
orchestrator: {
taskPlan: [],
agentAssignments: {},
completedSteps: [],
},
researcher: {
messages: [],
findings: [],
searchesPerformed: [],
},
analyst: {
messages: [],
dataSources: [],
calculations: [],
},
writer: {
messages: [],
drafts: [],
revisions: [],
},
shared: {
// What state is shared between agents?
// How do you keep it in sync?
// What happens if two agents update it simultaneously?
taskContext: {},
intermediateResults: {},
},
};
State management problems
| Problem | Description | Consequence |
|---|---|---|
| Stale state | Agent B reads shared state before Agent A finishes writing | Agent B works with incomplete data |
| Conflicting updates | Agent A and Agent B both update the same shared state | One agent's work is overwritten |
| Context drift | Each agent's understanding of the task diverges over time | Agents work at cross purposes |
| Lost context | Information from Agent A's reasoning is not passed to Agent B | Agent B repeats work or makes errors |
| Exploding state size | Every agent's output is potentially input for every other agent | Memory/token limits hit faster |
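Stale reads and conflicting updates are classic distributed-systems problems, and classic remedies apply. A minimal sketch of optimistic concurrency control for shared agent state; this is an illustration of the idea, not a production store:

```javascript
// Each write must name the version it read, so a conflicting update
// fails loudly instead of silently overwriting another agent's work.
class SharedState {
  constructor() {
    this.version = 0;
    this.data = {};
  }
  read() {
    // Snapshot: the version tag travels with the data
    return { version: this.version, data: { ...this.data } };
  }
  write(expectedVersion, updates) {
    if (expectedVersion !== this.version) {
      throw new Error(`Stale write: expected v${expectedVersion}, state is at v${this.version}`);
    }
    Object.assign(this.data, updates);
    this.version += 1;
  }
}

const state = new SharedState();
const snapA = state.read(); // Agent A reads v0
const snapB = state.read(); // Agent B reads v0
state.write(snapA.version, { findings: "A's results" }); // ok, state is now v1

// Agent B's write is rejected instead of clobbering A's update:
try {
  state.write(snapB.version, { findings: "B's results" });
} catch (e) {
  console.log(e.message); // Stale write: expected v0, state is at v1
}
```

The rejected agent must then re-read and reconcile, which is extra work, but it turns a silent-overwrite bug into an explicit, handleable failure.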
6. Cost and Latency Amplification
Multi-agent systems amplify the already-expensive cost of agents:
┌──────────────────────────────────────────────────────────────────┐
│ COST AND LATENCY COMPARISON │
│ │
│ Single LLM call: │
│ Tokens: 2,000 │
│ Cost: $0.005 │
│ Latency: 2 seconds │
│ │
│ Single agent (5 iterations): │
│ Tokens: 25,000 │
│ Cost: $0.06 │
│ Latency: 15 seconds │
│ │
│ Multi-agent (3 agents, 5 iterations each): │
│ Agent 1: 25,000 tokens │
│ Agent 2: 30,000 tokens (receives Agent 1's output) │
│ Agent 3: 35,000 tokens (receives Agents 1+2 output) │
│ Orchestrator: 10,000 tokens (coordination overhead) │
│ Total: 100,000 tokens │
│ Cost: $0.25 │
│ Latency: 45 seconds (if sequential) │
│ 20 seconds (if agents run in parallel where possible)│
│ │
│ That is 50x more expensive and 10-20x slower than a single call.│
└──────────────────────────────────────────────────────────────────┘
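The arithmetic in the box is easy to parameterize. A sketch, assuming a hypothetical flat rate of $2.50 per million tokens (real pricing varies by model and prices input and output tokens differently):

```javascript
// Hypothetical flat rate; an assumption for illustration only.
const PRICE_PER_MILLION_TOKENS = 2.5;

function estimateCost(tokensPerAgent) {
  const totalTokens = tokensPerAgent.reduce((sum, t) => sum + t, 0);
  return {
    totalTokens,
    cost: (totalTokens / 1_000_000) * PRICE_PER_MILLION_TOKENS,
  };
}

// Single agent, 5 iterations: 25k tokens, cost ≈ $0.06
console.log(estimateCost([25_000]));

// Three agents plus orchestrator, matching the figures above: ≈ $0.25
console.log(estimateCost([25_000, 30_000, 35_000, 10_000]));
```

Note that each downstream agent re-reads its predecessors' output, which is why the per-agent token counts grow along the chain.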
Parallelism can help with latency (but not cost)
// Sequential multi-agent: slowest
async function sequentialMultiAgent(task) {
const research = await researcherAgent.run(task); // 15 seconds
const analysis = await analystAgent.run(research); // 15 seconds
const report = await writerAgent.run(analysis); // 15 seconds
return report; // Total: 45 seconds
}
// Parallel where possible: faster but same cost
async function parallelMultiAgent(task) {
// Phase 1: Independent research (can be parallel)
const [marketResearch, techResearch, competitorResearch] = await Promise.all([
marketResearcherAgent.run(task), // 15 seconds
techResearcherAgent.run(task), // 15 seconds
competitorResearcherAgent.run(task), // 15 seconds
]);
// Phase 1 total: 15 seconds (parallel)
// Phase 2: Analysis (depends on Phase 1)
const analysis = await analystAgent.run({
market: marketResearch,
tech: techResearch,
competitor: competitorResearch,
}); // 15 seconds
// Phase 3: Writing (depends on Phase 2)
const report = await writerAgent.run(analysis); // 15 seconds
return report; // Total: 15 + 15 + 15 = 45 seconds (vs ~75 if all five runs were sequential)
// But token cost is IDENTICAL: parallelism reduces latency, not cost
}
7. When Multi-Agent IS Justified
Despite all the above warnings, there are legitimate cases for multi-agent systems:
Legitimate use case 1: Genuinely different expertise domains
When a task requires knowledge and tools from domains so different that one system prompt cannot cover them all effectively.
// Example: Comprehensive product launch analysis
// Each agent needs DIFFERENT tools and DIFFERENT expertise
const legalAgent = new Agent({
systemPrompt: "You are a legal compliance expert. Check for regulatory issues.",
tools: [regulatoryDatabaseTool, complianceCheckerTool],
});
const marketingAgent = new Agent({
systemPrompt: "You are a marketing strategist. Analyze market positioning.",
tools: [socialMediaAnalyticsTool, competitorTrackingTool],
});
const engineeringAgent = new Agent({
systemPrompt: "You are a technical architect. Assess technical feasibility.",
tools: [codeAnalysisTool, performanceBenchmarkTool],
});
// These are genuinely different domains with different tools.
// A single agent with all these tools would have a system prompt
// so long that the LLM would struggle to use it effectively.
Legitimate use case 2: Adversarial quality control
When you want one agent to check another agent's work (like peer review).
// The "writer + editor" pattern
const writerAgent = new Agent({
systemPrompt: "Write a detailed technical report on the given topic.",
tools: [searchTool, calculatorTool],
});
const editorAgent = new Agent({
systemPrompt: `Review the following report for:
1. Factual accuracy (verify key claims)
2. Logical consistency
3. Completeness (are important aspects missing?)
4. Clarity of writing
Return a critique with specific issues and suggestions.`,
tools: [searchTool], // Can independently verify claims
});
async function writeAndReview(topic) {
const draft = await writerAgent.run(`Write a report on: ${topic}`);
const critique = await editorAgent.run(`Review this report:\n\n${draft}`);
if (critique.includes("MAJOR ISSUES")) {
const revision = await writerAgent.run(
`Revise this report based on the critique:\n\nOriginal: ${draft}\n\nCritique: ${critique}`
);
return revision;
}
return draft;
}
Legitimate use case 3: Scale beyond one agent's capability
When the task involves so many sub-tasks that one agent would exceed context limits or take too many iterations.
The decision checklist for multi-agent
┌──────────────────────────────────────────────────────────────────┐
│ DO YOU NEED MULTI-AGENT? │
│ │
│ [ ] A single agent cannot handle this because: │
│ [ ] Too many tools (>15) for one system prompt │
│ [ ] Domain expertise is too diverse │
│ [ ] Context window overflow (too much state) │
│ [ ] Quality improves with adversarial review │
│ │
│ [ ] You have accepted the costs: │
│ [ ] 50x+ more expensive than a single call │
│ [ ] 30-300 seconds latency │
│       [ ] ~60% compound failure rate (needs mitigation)              │
│ [ ] Significant debugging complexity │
│ [ ] Engineering time to build and maintain │
│ │
│ [ ] You have mitigation strategies for: │
│ [ ] Error propagation (verification steps) │
│ [ ] State management (shared state protocol) │
│ [ ] Information loss at handoffs (structured handoffs) │
│ [ ] Cost control (budgets per agent) │
│ │
│ If any box is unchecked, reconsider. │
└──────────────────────────────────────────────────────────────────┘
8. When Multi-Agent Is Overkill
Most tasks that people think need multi-agent can be handled by a single agent with a good system prompt and the right tools.
| "Multi-Agent" Proposal | Why It's Overkill | Better Approach |
|---|---|---|
| "A researcher agent and a summarizer agent" | Summarization is one LLM call, not an agent | Single agent that researches, then summarizes |
| "A planner agent and an executor agent" | Plan-then-execute is one agent with two phases | Single agent with explicit planning step |
| "A coder agent and a tester agent" | Testing is a tool call, not an agent | Single agent with a code-execution tool |
| "One agent per API we call" | Tools already separate API calls | Single agent with multiple tools |
| "A manager agent that delegates to worker agents" | The "manager" is just a system prompt | Single agent with a good system prompt |
// OVER-ENGINEERED: 3 agents for a simple task
const researchAgent = new Agent({ /* ... */ });
const summaryAgent = new Agent({ /* ... */ });
const formatterAgent = new Agent({ /* ... */ });
async function overEngineered(topic) {
const research = await researchAgent.run(topic);
const summary = await summaryAgent.run(research);
const formatted = await formatterAgent.run(summary);
return formatted;
}
// BETTER: 1 agent that does all three
const singleAgent = new Agent({
systemPrompt: `You are a research assistant.
1. Research the given topic using web search
2. Synthesize findings into a clear summary
3. Format as a well-structured report with headings and bullet points`,
tools: [searchTool, readUrlTool],
maxIterations: 10,
});
async function simple(topic) {
return await singleAgent.run(topic);
}
9. Preview: Building Multi-Agent Systems (Sections 4.18-4.19)
This section has focused on understanding why multi-agent systems are complex and when they are justified. In later sections, you will learn how to build them properly:
| Section | What You'll Learn |
|---|---|
| 4.18 — Building a Simple Multi-Agent Workflow | Hands-on: build a 2-agent system (researcher + writer) with structured handoffs, error handling, and quality checks |
| 4.19 — Multi-Agent Architecture Concerns | Deep dive into production concerns: state management, error recovery, cost control, monitoring, and scaling multi-agent systems |
Important: Do not skip to 4.18 without fully understanding sections 4.15-4.17. The architectural patterns in 4.16 and the practical tools in 4.17 are prerequisites for building reliable multi-agent systems.
10. Key Takeaways
- Multi-agent systems use multiple specialized agents collaborating on a task. Each agent has its own system prompt, tools, and area of expertise.
- Complexity grows exponentially with each agent. Communication paths increase as n(n-1)/2. Error rates compound multiplicatively. Cost and latency multiply.
- Communication overhead is real: information is lost at every handoff between agents. Like the telephone game, nuance, confidence levels, and caveats degrade.
- Error propagation is the most dangerous problem. One agent's mistake becomes the foundation for downstream agents' reasoning, potentially leading to catastrophically wrong outputs.
- State management is hard in distributed agent systems. Stale state, conflicting updates, and context drift are common problems.
- Multi-agent is justified when: domains are genuinely different, adversarial review improves quality, or a single agent cannot handle the scale.
- Multi-agent is overkill when: a single agent with multiple tools solves the problem, or when the "agents" are really just prompt chain steps disguised as agents.
- Always start with a single agent. Add more agents only when you have evidence that the single agent approach is insufficient. Complexity is a one-way door — easy to add, hard to remove.
Explain-It Challenge
- Draw the communication paths for a 4-agent system. Calculate the total paths. Now imagine each path has a 5% chance of miscommunication. What is the probability that at least one miscommunication occurs?
- A colleague proposes a multi-agent system with: a "planner agent," a "researcher agent," a "calculator agent," and a "writer agent." Argue that this should be a single agent and explain how to consolidate it.
- Describe a scenario where multi-agent is genuinely the right approach. Explain why a single agent would fail and how the multi-agent design addresses the specific limitation.
Navigation: <- 4.15.d When NOT to Use Agents | 4.15 Exercise Questions ->