4.18 — Building a Simple Multi-Agent Workflow: Quick Revision
Compact cheat sheet. Print-friendly.
How to use this material
- Skim before labs or interviews.
- Drill gaps -- reopen README.md, then 4.18.a...4.18.d.
- Practice -- 4.18-Exercise-Questions.md.
- Polish answers -- 4.18-Interview-Questions.md.
Core vocabulary
| Term | One-liner |
|---|---|
| Multi-agent pipeline | Multiple specialized agents connected in sequence, each with one job |
| Schema contract | Zod schema defining the exact shape of data between two agents |
| Sequential pipeline | Agents run one after another; latency = sum of all step latencies |
| Parallel pipeline | Independent agents run simultaneously; latency = max (slowest) |
| Fan-out/fan-in | Planner splits work, workers run in parallel, merger collects results |
| Conditional (router) | Router agent picks which pipeline branch to run |
| Single Responsibility | Each agent does one thing well -- analyze OR transform OR generate |
| Selective context | Each agent receives only the specific fields it needs |
| Accumulated context | Each agent receives original input + all previous outputs (token-heavy) |
| Validation feedback retry | Feeding Zod errors back to the LLM so it can self-correct |
| Fail fast | Stop pipeline on first error; never return bad data |
| Graceful degradation | Cascade through cheaper/simpler approaches when primary fails |
| Fallback response | Pre-defined, schema-valid output used when an agent fails |
| PipelineError | Custom error class with agentName, stepNumber, isRetryable |
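The PipelineError row above can be sketched as a small custom error class. The three fields match the table; everything else (constructor shape, the `cause` field) is an assumption:

```javascript
// Hypothetical sketch of the PipelineError described above.
// Carries enough context to log which agent failed, at which step,
// and whether the caller should bother retrying.
class PipelineError extends Error {
  constructor(message, { agentName, stepNumber, isRetryable = false, cause } = {}) {
    super(message);
    this.name = "PipelineError";
    this.agentName = agentName;
    this.stepNumber = stepNumber;
    this.isRetryable = isRetryable;
    this.cause = cause; // original error, if any
  }
}

// Usage: wrap a failing agent call with pipeline context.
const err = new PipelineError("Agent returned invalid JSON", {
  agentName: "bio-improver",
  stepNumber: 2,
  isRetryable: true,
});
```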
Multi-agent pipeline architecture
Input ──► Agent 1 (Analyze) ──► Agent 2 (Transform) ──► Agent 3 (Generate)
                  │                      │                      │
             Zod Schema 1           Zod Schema 2           Zod Schema 3
              validates              validates              validates
                  │                      │                      │
                  ▼                      ▼                      ▼
             Structured             Structured             FINAL OUTPUT
             JSON passed            JSON passed            (Structured
             to Agent 2             to Agent 3              JSON)
KEY INSIGHT: Same pattern for ANY domain.
Hinge: Profile Analyzer → Bio Improver → Conversation Starter
ImageKit: Metadata Extractor → SEO Optimizer → Tag Categorizer
Generic: Analyze → Transform → Generate
Pipeline patterns at a glance
| Pattern | Diagram | Latency | When to Use |
|---|---|---|---|
| Sequential | A → B → C | T_A + T_B + T_C | Each step needs previous output |
| Parallel | A, B, C (simultaneous) | max(T_A, T_B, T_C) | Agents are independent |
| Fan-out/fan-in | P → [W1, W2, W3] → M | T_P + max(T_W) + T_M | Dynamic sub-task decomposition |
| Conditional | Router → branch | T_R + T_branch | Different inputs, different paths |
// Sequential: each step consumes the previous output
const s1 = await agent1(input);
const s2 = await agent2(s1);
const s3 = await agent3(s2);

// Parallel: independent agents, latency = slowest agent
const [a, b, c] = await Promise.all([
  agentA(input), agentB(input), agentC(input)
]);

// Fan-out/fan-in: planner splits work, merger collects
const tasks = await planner(input);
const results = await Promise.all(tasks.map(t => worker(t)));
const merged = await merger(results);

// Conditional: router picks the branch
const route = await router(input);
switch (route) { case "A": return pipelineA(input); /* ... */ }
Data flow patterns
DIRECT PASS-THROUGH:
Agent 1 output → Agent 2 input → Agent 3 input
Simple. Later agents lose original input.
ACCUMULATED CONTEXT:
Agent 2 gets: original + Agent 1 output
Agent 3 gets: original + Agent 1 output + Agent 2 output
Full context. Token-heavy. Can confuse agents.
SELECTIVE CONTEXT (recommended):
Agent 2 gets: original.bio + analysis.weaknesses + analysis.tips
Agent 3 gets: improvedBio + original.interests + original.name
Minimal tokens. Focused agents. Best results.
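The three data-flow options differ only in which fields each agent receives. A sketch of the recommended selective-context style, using made-up objects shaped like the doc's Hinge example:

```javascript
// Hypothetical pipeline state, shaped like the Hinge example above.
const original = { name: "Sam", bio: "I like hiking.", interests: ["hiking", "coffee"] };
const analysis = { weaknesses: ["too short"], tips: ["mention a story"], overallScore: 4 };

// Selective context: hand each agent only the fields it needs,
// instead of the whole accumulated state.
const agent2Input = {
  bio: original.bio,
  weaknesses: analysis.weaknesses,
  tips: analysis.tips,
};

const improvedBio = "I like hiking; ask me about the time I got lost on Mt. Baldy.";
const agent3Input = {
  improvedBio,
  interests: original.interests,
  name: original.name,
};
```

Each agent sees a small, focused payload, which keeps token counts down and avoids the "confused agent" failure mode of accumulated context.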
Validation between agents
Agent runs → JSON output → Parse JSON → Zod validates
                                             │
                                        ┌────┴────┐
                                        │         │
                                      PASS      FAIL
                                        │         │
                                   Continue   Fail fast OR
                                   pipeline   retry with
                                              validation feedback
What Zod catches
Missing fields: ZodError: Required at "weaknesses"
Wrong types: ZodError: Expected number, received string at "overallScore"
Invalid enums: ZodError: Invalid enum value at "category"
Out-of-range: ZodError: Number must be <= 10 at "overallScore"
Array too short: ZodError: Array must contain at least 1 element(s)
String too short: ZodError: String must contain at least 20 character(s)
Schema contract pattern
import { z } from 'zod';

const Agent1Output = z.object({
  strengths: z.array(z.object({
    category: z.enum(["bio", "interests", "photos"]),
    description: z.string().min(10),
    impactScore: z.number().min(1).max(10),
  })).min(1),
  overallScore: z.number().min(1).max(10),
});

// parse: throws ZodError on failure -- fail fast
const validated = Agent1Output.parse(parsed);

// safeParse: returns { success, data } or { success, error } -- no throw
const result = Agent1Output.safeParse(parsed);
Error handling strategies
Five error types (easiest → hardest to detect)
1. LLM API error (HTTP 429, 500, 503, timeout)
2. Empty response (null, undefined, empty string)
3. JSON parse error (text instead of JSON, truncated JSON)
4. Zod validation error (missing fields, wrong types, out-of-range)
5. Semantic error (valid JSON, correct types, but WRONG content)
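A hypothetical helper that classifies a failed step into the five types above, checked easiest-first. All names and the input shape are made up; a real pipeline would lean on the SDK's error classes:

```javascript
// Classify a failed step into the five error types, easiest-first.
// Input shape is hypothetical: { apiStatus, raw, zodError, semanticCheck }.
function classifyFailure({ apiStatus, raw, zodError, semanticCheck }) {
  if (apiStatus && apiStatus >= 400) return "api-error";         // 1. HTTP 429/500/503
  if (raw == null || raw.trim() === "") return "empty-response"; // 2. null/empty string
  try { JSON.parse(raw); } catch { return "json-parse-error"; }  // 3. text, not JSON
  if (zodError) return "zod-validation-error";                   // 4. schema violation
  if (semanticCheck && !semanticCheck()) return "semantic-error";// 5. valid shape, wrong content
  return "ok";
}
```

Note the ordering mirrors the list: each check only runs if every easier-to-detect failure has been ruled out.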
Three failure strategies
| Strategy | Returns | Best For |
|---|---|---|
| Fail fast | Nothing (error thrown) | Data pipelines, accuracy-critical |
| Partial results | Completed steps only | Development, debugging |
| Fallback | Always something | User-facing products |
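The fallback strategy can be sketched as a wrapper that catches any agent failure and substitutes a pre-defined response. Names here are made up; in practice the fallback object must pass the same Zod schema as a real output:

```javascript
// Pre-defined, schema-shaped fallback output (hypothetical shape).
const FALLBACK_BIO = { improvedBio: "A thoughtful, adventurous person.", confidence: 0 };

async function runWithFallback(agent, input, fallback) {
  try {
    return await agent(input);
  } catch (err) {
    // User-facing products: always return *something* usable.
    console.error("agent failed, using fallback:", err.message);
    return fallback;
  }
}

// Demo: a failing agent still yields a schema-shaped result.
const fallbackResult = await runWithFallback(
  async () => { throw new Error("LLM timeout"); },
  { bio: "..." },
  FALLBACK_BIO,
);
```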
Retry strategies
SIMPLE RETRY:
Retry N times with linear delay. No learning.
EXPONENTIAL BACKOFF:
Wait 1s, 2s, 4s, 8s + random jitter. For rate limits.
VALIDATION FEEDBACK (best for Zod errors):
Feed Zod error messages back to LLM as follow-up.
LLM reads error and self-corrects.
Does NOT help with API errors or semantic errors.
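Validation feedback retry can be written as a generic wrapper: run the agent, validate, and on failure re-run with the error text as feedback. A sketch with made-up names; any validator that throws (e.g. Zod's `.parse`) fits the `validate` slot:

```javascript
async function withValidationFeedback(runAgent, validate, maxRetries = 2) {
  let feedback = null;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    // On retries the agent sees the previous validation error,
    // so the LLM can self-correct (does NOT help with API errors).
    const output = await runAgent(feedback);
    try {
      return validate(output); // e.g. schema.parse(output) with Zod
    } catch (err) {
      feedback = `Your previous output was invalid: ${err.message}. Fix it and respond again.`;
    }
  }
  throw new Error(`validation failed after ${maxRetries + 1} attempts`);
}

// Demo: a mock agent that self-corrects once it sees feedback.
const fixedOutput = await withValidationFeedback(
  async (feedback) => (feedback ? { overallScore: 7 } : { overallScore: "seven" }),
  (out) => {
    if (typeof out.overallScore !== "number") throw new Error('Expected number at "overallScore"');
    return out;
  },
);
```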
Graceful degradation ladder
Level 1: Full pipeline with GPT-4o (best quality)
Level 2: Full pipeline with GPT-4o-mini (faster, cheaper)
Level 3: Simplified pipeline (fewer agents) (reduced quality)
Level 4: Rule-based heuristics (no LLM) (basic but reliable)
Level 5: Static fallback (hardcoded default) (guaranteed to work)
Every level MUST produce output that passes the same Zod schema.
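The ladder is a cascade: try each level in order and return the first that succeeds. A sketch (names and the demo levels are made up):

```javascript
async function degradeGracefully(levels) {
  const errors = [];
  for (const level of levels) {
    try {
      return await level(); // first level that succeeds wins
    } catch (err) {
      errors.push(err); // fall through to the next, cheaper level
    }
  }
  throw new AggregateError(errors, "all degradation levels failed");
}

// Demo: both LLM levels "fail", the rule-based heuristic answers.
const degraded = await degradeGracefully([
  async () => { throw new Error("gpt-4o: 503"); },
  async () => { throw new Error("gpt-4o-mini: 503"); },
  async () => ({ tags: ["sunset", "beach"], source: "heuristic" }),
]);
```

Because each level returns the same shape, downstream code never needs to know which rung of the ladder actually produced the result.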
Temperature strategy
HINGE PIPELINE (dating profiles):
Agent 1 (Analyzer): 0.7 -- analytical, but needs creative insight
Agent 2 (Bio Writer): 0.8 -- writing needs creativity
Agent 3 (Openers): 0.9 -- maximum creativity for conversation openers
Pattern: monotonically increasing (progressively more creative)
IMAGEKIT PIPELINE (image SEO):
Agent 1 (Extractor): 0.5 -- factual extraction, consistency matters
Agent 2 (SEO): 0.7 -- creative but accurate titles
Agent 3 (Tagger): 0.6 -- comprehensive but consistent tags
Pattern: non-monotonic (analytical → creative → balanced)
RULE: Temperature follows the task, not the position in the pipeline.
Factual/analytical → low (0.3-0.5)
Balanced → medium (0.6-0.7)
Creative/generative → high (0.8-0.9)
When to use multi-agent pipelines
USE MULTI-AGENT WHEN:
- Task has genuinely different reasoning steps
- A single prompt produces inconsistent results
- You need different models/temperatures per step
- You need independent testing per step
- Different team members own different steps
DON'T USE MULTI-AGENT WHEN:
- Single well-crafted prompt produces equivalent quality
- Latency budget is under 1-2 seconds
- Cost increase not justified by quality gain
- Task needs no LLM at all (code, regex, DB lookup)
AGENT COUNT GUIDELINES:
2 agents: Simple analyze → generate
3 agents: Standard analyze → transform → generate (most common)
4-5 agents: Complex workflows with distinct phases
6+ agents: Rare. Consider merging some.
Common gotchas
| Gotcha | Fix |
|---|---|
| No validation between agents | Zod .parse() after every agent |
| Passing entire state to every agent | Selective context -- each agent gets only what it needs |
| Same temperature for all agents | Tune per agent: analytical=low, creative=high |
| Promise.all for batch processing | Use Promise.allSettled -- isolate failures per item |
| No logging in pipeline | Log agent name, duration, status, retries at every step |
| Monolithic "do everything" agent | Split by SRP -- one job per agent |
| Fallbacks that break downstream agents | Fallbacks must pass the same Zod schema |
| No retry for Zod errors | Validation feedback retry -- feed errors back to LLM |
| Merging agents that need different models | Keep separate if model/temperature needs differ |
| Building multi-agent before trying single prompt | Always benchmark single prompt first |
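The Promise.allSettled gotcha from the table, sketched: Promise.all rejects the whole batch on the first failure, while allSettled isolates failures per item (the processor and items here are mocks):

```javascript
// Mock per-item processor: item 2 fails, the others succeed.
const processItem = async (item) => {
  if (item === 2) throw new Error("item 2: rate limited");
  return { item, status: "done" };
};

// allSettled never rejects: each item carries its own outcome.
const settled = await Promise.allSettled([1, 2, 3].map(processItem));

// One bad item no longer sinks the whole batch.
const succeeded = settled.filter((r) => r.status === "fulfilled").map((r) => r.value);
const failed = settled.filter((r) => r.status === "rejected").map((r) => r.reason.message);
```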
Reusable pipeline runner (minimal)
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function callAgent(name, prompt, input, schema, temp = 0.7) {
  const res = await client.chat.completions.create({
    model: "gpt-4o",
    temperature: temp,
    messages: [
      { role: "system", content: prompt },
      { role: "user", content: JSON.stringify(input) },
    ],
  });
  const raw = res.choices[0].message.content;
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch {
    // Fallback: extract JSON from a markdown code block
    const m = raw.match(/```(?:json)?\s*([\s\S]*?)```/);
    if (m) parsed = JSON.parse(m[1].trim());
    else throw new Error(`${name}: invalid JSON`);
  }
  return schema.parse(parsed); // throws ZodError if the contract is violated
}

async function runPipeline(agents, input) {
  let data = input;
  for (const a of agents) {
    data = await callAgent(a.name, a.prompt, data, a.schema, a.temp);
  }
  return data;
}
Testing checklist
UNIT (per agent):
[ ] Valid output matches Zod schema
[ ] Missing fields rejected
[ ] Invalid enums rejected
[ ] Out-of-range numbers rejected
[ ] Empty/null responses handled
[ ] JSON in markdown code blocks extracted
INTEGRATION (agent-to-agent):
[ ] Agent 1 output is valid Agent 2 input
[ ] Agent 2 output is valid Agent 3 input
[ ] Selective context extracts correct fields
END-TO-END (full pipeline):
[ ] Happy path produces valid final output
[ ] Each agent failure handled gracefully
[ ] Fallback responses pass schema validation
[ ] Pipeline metadata (duration, agentCount) correct
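The "JSON in markdown code blocks extracted" item can be unit-tested without any LLM call. A sketch that builds the fence marker programmatically so the snippet itself stays fence-safe (the extractJson helper is made up, using the same regex idea as the runner above):

```javascript
// Build the fence marker without typing it literally, so this
// snippet itself stays fence-safe inside a markdown document.
const FENCE = "`".repeat(3);
const FENCE_RE = new RegExp(`${FENCE}(?:json)?\\s*([\\s\\S]*?)${FENCE}`);

// Hypothetical extractor: accepts bare JSON or JSON wrapped in a
// markdown code block (one of the unit checklist items above).
function extractJson(raw) {
  try { return JSON.parse(raw); } catch { /* fall through */ }
  const m = raw.match(FENCE_RE);
  if (m) return JSON.parse(m[1].trim());
  throw new Error("invalid JSON");
}

const bare = extractJson('{"overallScore": 7}');
const fenced = extractJson(`${FENCE}json\n{"overallScore": 7}\n${FENCE}`);
```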
Quick mental model
Multi-agent pipeline =
Input
→ Agent 1 (analyze, validate with Zod)
→ Agent 2 (transform, validate with Zod)
→ Agent 3 (generate, validate with Zod)
→ Structured output
Each agent: one job, one schema, one temperature, independently testable.
Between agents: Zod validation, selective context, error handling.
Pattern is domain-agnostic: works for dating profiles, image SEO, code review, etc.
End of 4.18 quick revision.