Episode 4 — Generative AI Engineering / 4.15 — Understanding AI Agents
4.15 — Exercise Questions: Understanding AI Agents
Practice questions for all five subtopics in Section 4.15. Mix of conceptual, calculation, design, and hands-on tasks.
How to use this material
- Read lessons in order — README.md, then 4.15.a -> 4.15.e.
- Answer closed-book first — then compare to the matching lesson.
- Work through code examples — modify and run the agent code from 4.15.a and 4.15.b.
- Interview prep — 4.15-Interview-Questions.md.
- Quick review — 4.15-Quick-Revision.md.
4.15.a — Agent vs Single LLM Call (Q1–Q10)
Q1. Define what an AI agent is in one sentence. How does it differ from a single LLM call?
Q2. List the four defining characteristics of an AI agent. For each, explain what happens if you remove it.
Q3. What does ReAct stand for? Explain the ReAct cycle (Thought -> Action -> Observation) with a concrete example involving a weather lookup.
Q4. A single LLM call costs ~2,000 tokens. An agent that takes 5 steps accumulates roughly how many total tokens across all calls? Explain why the cost grows non-linearly.
Q5. Name five tasks where a single LLM call is sufficient and an agent would be wasteful. For each, explain why no loop or tools are needed.
Q6. Name five tasks where an agent is necessary and a single call would fail. For each, explain what external data or actions are required.
Q7. Draw (or describe) the complexity spectrum from single call to full agent. What sits in between?
Q8. Explain why the ReAct pattern (reasoning before acting) produces better tool usage than acting without reasoning. Give a concrete example where the difference matters.
Q9. Calculation: An agent makes 8 LLM calls to complete a task. Each call has a 5% chance of producing an error. What is the probability that at least one error occurs? What if each call had a 2% error rate?
Q10. Hands-on: Take the single-call sentiment classification code from 4.15.a and measure its token usage and latency. Now imagine wrapping it in an agent loop with 3 iterations. Estimate the total token cost and latency. Is the agent version justified?
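The arithmetic behind Q4 and Q9 can be checked in a few lines of Python. The 1,500-token growth per step is an assumption chosen to match the answer hints; real growth depends on message and tool-output sizes.

```python
# Q4: an agent's context grows every step, so total tokens across all
# calls is the sum of a growing series, not steps * base_cost.
base = 2_000          # tokens in the initial call
growth = 1_500        # assumed tokens added per step (messages + observations)
steps = 5
per_call = [base + growth * i for i in range(steps)]
total = sum(per_call)
print(per_call)       # [2000, 3500, 5000, 6500, 8000]
print(total)          # 25000 (vs 10000 if cost grew linearly at 5 * 2000)

# Q9: probability that at least one of n independent calls errors.
def p_any_error(p_error: float, n: int) -> float:
    return 1 - (1 - p_error) ** n

print(round(p_any_error(0.05, 8), 3))  # 0.337
print(round(p_any_error(0.02, 8), 3))  # 0.149
```

Note that even a small per-call error rate compounds quickly: the loop that makes agents powerful is the same loop that multiplies failure chances.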
4.15.b — Agent Architecture (Q11–Q20)
Q11. Name the four core components of an AI agent and explain the role of each in one sentence.
Q12. Why is the system prompt so important for an agent's LLM brain? What happens if the system prompt is too vague? Too long?
Q13. Explain tool registration. What information must each tool definition include for the LLM to use it correctly?
Q14. What is the difference between short-term memory and long-term memory in an agent? Give an example of data stored in each.
Q15. Short-term memory (message history) grows with every iteration. Describe two strategies for managing this growth.
Q16. Explain the difference between implicit planning, explicit planning, and adaptive planning. When would you use each?
Q17. Design: Design the tool set (names, descriptions, parameters) for a travel booking agent. Include at least 5 tools.
Q18. Why is security especially important for agents that use tools? Name three specific security risks and how to mitigate each.
Q19. Code analysis: Look at the Agent class implementation in 4.15.b. What happens if the LLM generates a tool call for a tool that does not exist? What happens if a tool throws an exception? Is the error handling sufficient?
Q20. Explain the "plan-and-execute" agent pattern. Why is it better than implicit planning for complex (10+ step) tasks?
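As a reference point for Q11, Q13, and Q19, here is a minimal sketch of a defensive agent loop. The class and method names are hypothetical, not the actual 4.15.b implementation; the point is that unknown tools and tool exceptions become observations the LLM can react to, rather than crashes.

```python
from typing import Callable

class Agent:
    """Minimal agent loop: LLM brain + tool registry + short-term memory."""

    def __init__(self, llm: Callable[[list], dict]):
        # llm returns {"tool": name, "args": {...}} or {"answer": "..."}
        self.llm = llm
        self.tools: dict[str, Callable] = {}
        self.history: list = []  # short-term memory: the full message list

    def register_tool(self, name: str, fn: Callable, description: str):
        # The description is what the LLM sees; it must say when to call the tool.
        self.tools[name] = fn
        self.history.append({"role": "system", "content": f"Tool {name}: {description}"})

    def call_tool(self, name: str, args: dict) -> str:
        if name not in self.tools:
            # Unknown tool: return the error as an observation so the LLM
            # can correct itself on the next iteration.
            return f"Error: unknown tool '{name}'. Available: {sorted(self.tools)}"
        try:
            return str(self.tools[name](**args))
        except Exception as exc:  # tool bugs become observations, not crashes
            return f"Error: tool '{name}' failed: {exc}"

    def run(self, task: str, max_steps: int = 5) -> str:
        self.history.append({"role": "user", "content": task})
        for _ in range(max_steps):  # bounded loop: agents need a step budget
            decision = self.llm(self.history)
            if "answer" in decision:
                return decision["answer"]
            observation = self.call_tool(decision["tool"], decision.get("args", {}))
            self.history.append({"role": "tool", "content": observation})
        return "Stopped: step budget exhausted."
```

When answering Q19, compare this against the real Agent class: does it feed tool errors back into the loop, and does it bound the number of iterations?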
4.15.c — When to Use Agents (Q21–Q30)
Q21. What is the single most reliable indicator that you need an agent instead of a single LLM call?
Q22. Explain why competitive analysis is a good use case for an agent. Walk through the agent's likely steps.
Q23. Why are research workflows natural fits for agents? What pattern do they follow?
Q24. Describe how a customer support agent differs from a customer support chatbot that uses a single LLM call. What specific capabilities does the agent add?
Q25. Using the decision framework from 4.15.c, classify each task: (a) Translate a paragraph to Spanish, (b) Look up an order and issue a refund, (c) Summarize an article then post it to Slack, (d) Research competitor pricing across 5 websites.
Q26. List 5 signals from the checklist in 4.15.c that indicate a task is a good fit for an agent. For each, give a concrete example.
Q27. Design: Design a data pipeline agent that ingests a CSV, cleans it, runs analysis, and generates a report. List the tools it needs and describe its likely execution path for a CSV with missing values and outliers.
Q28. Why is it important to measure agent success metrics? Name the 4 categories of metrics and 2 specific metrics in each category.
Q29. A task requires calling 3 APIs in sequence (always the same 3 APIs, in the same order). Should you build an agent for this? Why or why not?
Q30. Hands-on: Sketch the system prompt and tool definitions for a customer support agent for an online bookstore. Include tools for: order lookup, return processing, book search, and human escalation.
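For Q30, a sketch of what OpenAI-style tool definitions for the bookstore agent might look like. The tool names and parameter schemas are assumptions for illustration, not a prescribed answer.

```python
# Hypothetical tool definitions for a bookstore support agent (Q30).
# Each definition needs a name, a description the LLM can act on,
# and a JSON-Schema parameters object.
TOOLS = [
    {
        "name": "lookup_order",
        "description": "Fetch an order's status, items, and shipping info by order ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
    {
        "name": "process_return",
        "description": "Start a return for an order; returns a prepaid label URL.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string"},
                "reason": {"type": "string", "enum": ["damaged", "wrong_item", "unwanted"]},
            },
            "required": ["order_id", "reason"],
        },
    },
    {
        "name": "search_books",
        "description": "Search the catalog by title, author, or ISBN.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "escalate_to_human",
        "description": "Hand off to a human agent when the request is out of scope.",
        "parameters": {
            "type": "object",
            "properties": {"summary": {"type": "string"}},
            "required": ["summary"],
        },
    },
]
```

A useful self-check for your own answer: could the LLM decide, from the descriptions alone, which tool to call for "my book arrived with a torn cover"?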
4.15.d — When NOT to Use Agents (Q31–Q40)
Q31. State the "golden rule" about building agents. In your own words, explain what it means practically.
Q32. Why is latency a major problem for agents in user-facing applications? What latency thresholds do users typically tolerate?
Q33. Calculation: A customer support agent averages 4 iterations per request. Each iteration uses an average of 5,000 tokens. At $2.50/$10.00 per 1M input/output tokens (GPT-4o pricing), estimate the cost per request. Now multiply by 100,000 requests/month. What is the monthly cost?
Q34. Give three examples where deterministic code is better than an AI agent. For each, explain why code is more reliable.
Q35. What are the six signs that you are over-engineering with an agent? List them and explain the first three in detail.
Q36. Explain the "pre-agent checklist" from 4.15.d. Walk through the checklist for this task: "Check if a user's subscription has expired and send them a renewal email."
Q37. A startup founder says: "We're building an AI agent for our FAQ page." Using what you learned in 4.15.d, explain why this is probably overkill and recommend a better approach.
Q38. What is the hybrid approach (try single call first, fall back to agent)? When is this a good strategy?
Q39. Design: A company processes 50,000 invoices per month. A developer proposes an agent that reads each invoice, validates it, and enters it into the accounting system. The agent takes 15 seconds and costs $0.12 per invoice. Calculate: (a) monthly cost, (b) total processing time, and (c) propose a cheaper alternative that still uses AI where appropriate.
Q40. Explain the "AI toolbox" mental model from 4.15.d. Why should you always start from the simplest tool and move down only when needed?
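The cost math in Q33 and Q39 can be verified directly. The $5 per 1M tokens figure is a blended-rate assumption splitting the difference between GPT-4o's input and output prices, matching the answer hints.

```python
# Q33: per-request and monthly cost at an assumed blended token rate.
tokens_per_request = 4 * 5_000                 # 4 iterations x 5K tokens
cost_per_request = tokens_per_request * 5.00 / 1_000_000
monthly_cost = cost_per_request * 100_000      # requests per month
print(round(cost_per_request, 2))              # 0.1  -> $0.10/request
print(round(monthly_cost))                     # 10000 -> $10,000/month

# Q39: 50,000 invoices at $0.12 and 15 seconds each.
invoices = 50_000
print(invoices * 12 / 100)                     # 6000.0 -> $6,000/month
print(round(invoices * 15 / 3600, 1))          # 208.3 hours of processing
```

These numbers are the core of the "when NOT to use agents" argument: at scale, per-request cost and latency dominate the decision.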
4.15.e — Multi-Agent Complexity (Q41–Q50)
Q41. Define what a multi-agent system is. How does it differ from a single agent with multiple tools?
Q42. Calculation: Using the formula n(n-1)/2, calculate the number of communication paths for: (a) 3 agents, (b) 5 agents, (c) 8 agents, (d) 10 agents.
Q43. Explain the "telephone game" problem in multi-agent systems. What kind of information is most likely to be lost at each handoff?
Q44. Describe error propagation in a 3-agent chain. Agent 1 gets a date wrong by one year. Trace how this error compounds through Agents 2 and 3.
Q45. Calculation: A single agent has a 77% success rate over 5 steps (0.95^5). Calculate the overall success rate for a 3-agent system where each agent takes 5 steps and there are 3 handoffs between them (each handoff has a 95% success rate).
Q46. Name three state management challenges in multi-agent systems. For each, describe a concrete scenario where the problem occurs.
Q47. When is multi-agent justified? List the three legitimate use cases from 4.15.e and give an example for each.
Q48. A colleague proposes a 5-agent system: planner, researcher, calculator, writer, and reviewer. Argue that this should be a single agent. Then identify the ONE part that might benefit from a second agent (adversarial review).
Q49. Design: Design a legitimate 2-agent system for a legal compliance review. Agent 1: Legal analysis (needs legal database tools). Agent 2: Technical implementation review (needs code analysis tools). Describe the handoff protocol between them.
Q50. Explain why "always start with a single agent" is good advice. What evidence would convince you to add a second agent?
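The formulas behind Q42 and Q45 reduce to a few lines of Python; the 95% per-step and per-handoff rates are the ones given in the questions.

```python
# Q42: communication paths among n agents (every pair can talk).
def paths(n: int) -> int:
    return n * (n - 1) // 2

print([paths(n) for n in (3, 5, 8, 10)])          # [3, 10, 28, 45]

# Q45: compound success of three 5-step agents plus three handoffs.
step_ok, handoff_ok = 0.95, 0.95
single_agent = step_ok ** 5                        # one agent, 5 steps
system = single_agent ** 3 * handoff_ok ** 3       # 3 agents, 3 handoffs
print(round(single_agent, 3), round(system, 3))    # 0.774 0.397
```

Note how quickly both numbers move against you: coordination paths grow quadratically while end-to-end reliability decays exponentially.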
Answer Hints
| Q | Hint |
|---|---|
| Q4 | ~25,000 tokens total. Context grows: 2K + 3.5K + 5K + 6.5K + 8K. Each call includes ALL prior messages. |
| Q9 | P(at least one error) = 1 - 0.95^8 = 1 - 0.6634 = 33.7%. At 2%: 1 - 0.98^8 = 14.9%. |
| Q25 | (a) Single call, (b) Agent, (c) Prompt chain or tool-augmented chain, (d) Full agent |
| Q29 | No agent needed. Fixed 3-step sequence = prompt chain or hard-coded pipeline. |
| Q33 | 4 iterations * 5K tokens = 20K tokens/request. At ~$5/1M tokens blended: 20K * $5/1M = $0.10/request. 100K requests = $10,000/month. |
| Q36 | Subscription check = DB query (code). Email sending = email API (code). Template content = maybe 1 LLM call. No agent needed. |
| Q37 | FAQ = RAG pipeline (single retrieval + single LLM call). No loop needed. |
| Q39 | (a) $6,000/month, (b) ~208 hours of processing time, (c) template matching + OCR + single LLM call for edge cases |
| Q42 | (a) 3, (b) 10, (c) 28, (d) 45 |
| Q45 | 0.774^3 * 0.95^3 = 0.774^3 * 0.857 ≈ 0.397 (39.7% success) |
← Back to 4.15 — Understanding AI Agents (README)