Episode 4 — Generative AI Engineering / 4.7 — Function Calling / Tool Calling

4.7 --- Function Calling / Tool Calling: Quick Revision

Compact cheat sheet. Print-friendly.

How to use this material (instructions)

  1. Skim before labs or interviews.
  2. Drill gaps --- reopen README.md, then 4.7.a through 4.7.e.
  3. Practice --- 4.7-Exercise-Questions.md.
  4. Polish answers --- 4.7-Interview-Questions.md.

Core principle

LLMs generate text --- they CANNOT execute code, call APIs, or query databases.
Tool calling bridges this gap:
  AI decides WHAT function to call + arguments  (probabilistic reasoning)
  Your code decides HOW to execute it           (deterministic logic)

Core vocabulary

| Term | One-liner |
| --- | --- |
| Tool calling | API feature where the model returns a structured function name + arguments for your code to execute |
| Function calling | Original (deprecated) OpenAI term for the same concept; used the functions parameter |
| Tool use | Anthropic's (Claude) term for tool calling |
| tools | API parameter --- array of tool definitions (JSON Schema) sent with each request |
| tool_choice | Controls routing: 'auto' (model decides), 'none' (no tools), 'required' (must call), or { type: 'function', function: { name: '...' } } |
| tool_calls | Array in the assistant's response containing function name, arguments, and unique call ID |
| tool_call_id | Unique ID that links a tool result back to the tool call that requested it |
| finish_reason | "tool_calls" when the model wants a function executed; "stop" for text responses |
| Router pattern | Architecture where the LLM classifies intent and dispatches to the right handler function |
| Hybrid logic | AI handles reasoning (what) + code handles execution (how) --- the core production pattern |

The six-step tool calling flow

Step 1: DEFINE    tools with JSON Schema (name, description, parameters)
Step 2: SEND      messages + tools + tool_choice to LLM API
Step 3: MODEL     returns tool_calls (finish_reason: "tool_calls")
                  OR text response (finish_reason: "stop")
Step 4: EXECUTE   function in YOUR code (parse args, validate, call handler)
Step 5: RETURN    result to model via { role: 'tool', tool_call_id, content }
Step 6: MODEL     generates final natural-language response using the result
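
The six steps can be sketched end to end in code. The model call is stubbed out here (fakeModelCall and the improveBio handler are illustrative stand-ins); a real implementation would send messages + tools to the provider's chat-completions API:

```javascript
// Step 1: DEFINE -- handlers keyed by function name (schemas omitted for brevity).
const handlers = {
  improveBio: ({ currentBio }) => ({ improved: `${currentBio} (now witty)` }),
};

// Stub standing in for the API: first call returns a tool call,
// a follow-up call (after a tool result) returns final text.
function fakeModelCall(messages) {
  const last = messages[messages.length - 1];
  if (last.role === 'tool') {
    // Step 6: model turns the tool result into natural language.
    const result = JSON.parse(last.content);
    return { role: 'assistant', content: `Here's your new bio: ${result.improved}` };
  }
  // Step 3: model decides to call a tool (finish_reason: "tool_calls").
  return {
    role: 'assistant',
    content: null,
    tool_calls: [{
      id: 'call_1',
      type: 'function',
      function: { name: 'improveBio', arguments: '{"currentBio":"I like hiking"}' },
    }],
  };
}

function runFlow(userText) {
  // Step 2: SEND messages (plus tools and tool_choice in a real request).
  const messages = [{ role: 'user', content: userText }];
  const assistant = fakeModelCall(messages);
  messages.push(assistant);

  if (!assistant.tool_calls) return assistant.content; // finish_reason: "stop"

  // Step 4: EXECUTE in your code (parse args, dispatch to handler).
  const tc = assistant.tool_calls[0];
  const result = handlers[tc.function.name](JSON.parse(tc.function.arguments));

  // Step 5: RETURN the result as a tool message linked by tool_call_id.
  messages.push({ role: 'tool', tool_call_id: tc.id, content: JSON.stringify(result) });
  return fakeModelCall(messages).content;
}

console.log(runFlow('Improve my bio: "I like hiking"'));
```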

Tool definition structure

const tools = [
  {
    type: 'function',
    function: {
      name: 'improveBio',                          // Function name
      description: 'Improve a dating profile bio',  // Model reads this to decide
      parameters: {
        type: 'object',
        properties: {
          currentBio: { type: 'string', description: 'Current bio text' },
          tone: {
            type: 'string',
            enum: ['witty', 'sincere', 'adventurous'],  // Constrained values
          },
        },
        required: ['currentBio'],       // Required arguments
        additionalProperties: false,    // No extra fields
      },
    },
  },
];

Tool definition best practices

| Practice | Why |
| --- | --- |
| Detailed description with trigger phrases | Model uses the description to decide which tool to call |
| Use enum for constrained values | Prevents the model from inventing invalid values |
| Mark required fields | Model knows which arguments it must provide |
| Set additionalProperties: false | Prevents unexpected extra fields |
| Add a description to every property | Helps the model extract the right data from user input |
| Keep 3--5 parameters per tool | More parameters = more chances for errors |

Message structure for tool calling

// First API call (routing)
messages: [
  { role: 'system', content: '...' },
  { role: 'user', content: 'Improve my bio: "I like hiking"' },
]

// Model response (tool call)
{
  role: 'assistant',
  content: null,                    // null when calling tools
  tool_calls: [{
    id: 'call_abc123',
    type: 'function',
    function: {
      name: 'improveBio',
      arguments: '{"currentBio":"I like hiking","tone":"witty"}' // JSON STRING
    }
  }]
}

// Second API call (with tool result)
messages: [
  { role: 'system', content: '...' },
  { role: 'user', content: '...' },
  { role: 'assistant', content: null, tool_calls: [{ ... }] },
  {
    role: 'tool',                      // Special role
    tool_call_id: 'call_abc123',       // MUST match
    content: '{"improved":"..."}',     // MUST be a string
  },
]

When to use tool calling vs JSON mode vs plain text

| Scenario | Approach | Why |
| --- | --- | --- |
| "Write me a poem" | Plain text | Text IS the output |
| "Classify this email as spam" | Structured output (JSON mode) | LLM judgment is the result; no function needed |
| "What is my account balance?" | Tool calling | Requires a database query |
| "Schedule a meeting at 2pm" | Tool calling | Requires creating an event (side effect) |
| "Calculate 15% tip on $47.83" | Tool calling | LLMs are unreliable at math |
| "Improve my dating bio" | Tool calling | Needs business rules, validation, logging |
| "Thanks for helping!" | Plain text | Conversational; no action needed |

Decision rule: If the task requires data retrieval, API calls, calculations, mutations, or enforcing business rules --- use tool calling. If the LLM's text output IS the result --- skip it.


Decision flowchart

Does the task require data NOT in the LLM's training?
  YES --> Tool calling (data retrieval)
  NO  -->
Does the task require a SIDE EFFECT?
  YES --> Tool calling (action tool)
  NO  -->
Does the task require PRECISE COMPUTATION?
  YES --> Tool calling (calculation tool)
  NO  -->
Does the task require EXACT BUSINESS RULES?
  YES --> Tool calling (business logic tool)
  NO  -->
Is the output STRUCTURED DATA for downstream code?
  YES --> Structured output (JSON mode)
  NO  --> Plain text generation

Deterministic invocation --- key code

// Parse arguments safely (ALWAYS wrap in try/catch)
function safeParseArguments(argsString) {
  try {
    return { success: true, data: JSON.parse(argsString) };
  } catch (error) {
    return { success: false, error: error.message };
  }
}

// Dispatch and execute (inside an async handler)
const functionMap = { improveBio, generateOpeners, moderateText };

async function executeToolCall(message) {
  const toolCall = message.tool_calls[0];
  const toolError = (msg) => ({
    role: 'tool',
    tool_call_id: toolCall.id,
    content: JSON.stringify({ error: msg }), // error as tool result, not a thrown exception
  });

  const parsed = safeParseArguments(toolCall.function.arguments);
  if (!parsed.success) {
    return toolError(`Invalid arguments: ${parsed.error}`);
  }
  if (!functionMap[toolCall.function.name]) {
    return toolError(`Unknown function: ${toolCall.function.name}`);
  }
  const result = await functionMap[toolCall.function.name](parsed.data);
  return { role: 'tool', tool_call_id: toolCall.id, content: JSON.stringify(result) };
}

Parallel tool calls

// Model can return multiple tool_calls in one response
// Execute all in parallel, return ALL results

const toolResults = await Promise.all(
  assistantMessage.tool_calls.map(async (tc) => {
    const args = JSON.parse(tc.function.arguments);
    const result = await functionMap[tc.function.name](args);
    return {
      role: 'tool',
      tool_call_id: tc.id,          // Each result must match its call
      content: JSON.stringify(result),
    };
  })
);

// Disable parallel calls: parallel_tool_calls: false
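
Note that Promise.all rejects outright if any handler throws. A common variant (a sketch with made-up handlers add and boom) catches per-call failures and returns them as tool results instead:

```javascript
// Hypothetical handlers: one succeeds, one throws.
const functionMap = {
  add: ({ a, b }) => a + b,
  boom: () => { throw new Error('backend down'); },
};

const assistantMessage = {
  tool_calls: [
    { id: 'call_1', type: 'function', function: { name: 'add', arguments: '{"a":2,"b":3}' } },
    { id: 'call_2', type: 'function', function: { name: 'boom', arguments: '{}' } },
  ],
};

// Execute all calls in parallel; catch per-call failures so one bad
// handler doesn't reject the whole batch.
async function runAll() {
  return Promise.all(
    assistantMessage.tool_calls.map(async (tc) => {
      let content;
      try {
        const args = JSON.parse(tc.function.arguments);
        content = JSON.stringify(await functionMap[tc.function.name](args));
      } catch (error) {
        content = JSON.stringify({ error: error.message }); // error as tool result
      }
      return { role: 'tool', tool_call_id: tc.id, content };
    })
  );
}

runAll().then((results) => console.log(results));
```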

Hybrid logic patterns

| Pattern | What happens | Best for |
| --- | --- | --- |
| AI routes, code executes entirely | Function is pure deterministic logic (regex, DB query, math) | moderateText(), getAccountBalance(), calculateTip() |
| AI routes, code orchestrates AI | Function uses another LLM call with a specialized prompt, wrapped in guardrails | improveBio(), generateOpeners(), content generation |
| AI routes, code chains steps | Function runs a multi-step pipeline: validate, generate, filter, save, log | processProfileUpdate(), complex workflows |
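
The first two patterns side by side, as a sketch (calculateTip, improveBio, and fakeLlmRewrite are illustrative names; the second LLM call is stubbed):

```javascript
// Pattern 1: AI routes, code executes entirely (pure deterministic logic).
function calculateTip({ amount, percent }) {
  return { tip: Math.round(amount * (percent / 100) * 100) / 100 };
}

// Pattern 2: AI routes, code orchestrates AI (second LLM call + guardrails in code).
async function improveBio({ currentBio }) {
  if (currentBio.length > 500) throw new Error('Bio too long'); // guardrail before the call
  const improved = await fakeLlmRewrite(currentBio);            // specialized prompt here
  return { improved: improved.slice(0, 500) };                  // enforce limit after the call
}

// Stub standing in for a second, specialized LLM call.
async function fakeLlmRewrite(text) {
  return `${text} (but wittier)`;
}

calculateTip({ amount: 47.83, percent: 15 }); // { tip: 7.17 }
```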

The AI decision boundary

BEFORE the boundary (AI's domain):
  - Understand natural language
  - Classify intent (which function?)
  - Extract arguments from text
  - Handle ambiguity

AFTER the boundary (Code's domain):
  - Validate input (length, format, required fields)
  - Enforce business rules (character limits, banned words)
  - Query databases (user status, premium tier)
  - Execute computations (exact math)
  - Call external APIs
  - Log analytics, enforce rate limits, bill users
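
A sketch of code-side enforcement after the boundary, with a made-up 500-character limit and banned-word list:

```javascript
// Deterministic guardrails applied after the model has extracted arguments.
const BANNED = ['venmo', 'cashapp']; // hypothetical banned-word list

function validateBio(bio) {
  if (typeof bio !== 'string' || bio.trim() === '') {
    return { ok: false, error: 'Bio is required' };
  }
  const lowered = bio.toLowerCase();
  const hit = BANNED.find((w) => lowered.includes(w));
  if (hit) return { ok: false, error: `Banned word: ${hit}` };
  // Enforce the limit in code, not in the prompt (LLM character counts are unreliable).
  return { ok: true, bio: bio.slice(0, 500) };
}
```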

Tool router architecture

User Message
     |
     v
Input Validator ---- rejects empty/too-long messages
     |
     v
LLM Router --------- decides which tool(s) to call
     |
     v
Tool Handlers ------- execute function with validation
     |
     v
Result Logger ------- logs tool call + result
     |
     v
LLM Formatter ------- turns result into natural response
     |
     v
Final response to user

Error handling layers (production router)

Layer 1: Input Validation       --- reject bad input before API call
Layer 2: API Errors             --- try/catch, retry with backoff
Layer 3: Argument Parsing       --- safeParseArguments(), return error as tool result
Layer 4: Unknown Functions      --- check handlerMap, list available functions
Layer 5: Function Execution     --- try/catch around handler, return error as tool result
Layer 6: Result Validation      --- truncate oversized results (max ~4000 chars)
Layer 7: Final Response Fallback --- if formatting LLM fails, return raw results

Key rule: Return errors as tool role messages, not thrown exceptions. The model can then explain the problem naturally to the user.
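
Layer 5 as a sketch: a wrapper (executeWithErrorLayer is a name invented here) that converts a thrown handler error into a tool-role message:

```javascript
// Layer 5: catch handler failures and return them as a tool-role message
// instead of rethrowing, so the model can explain the problem to the user.
async function executeWithErrorLayer(handler, toolCall, args) {
  try {
    const result = await handler(args);
    return { role: 'tool', tool_call_id: toolCall.id, content: JSON.stringify(result) };
  } catch (error) {
    return {
      role: 'tool',
      tool_call_id: toolCall.id,
      content: JSON.stringify({ error: true, message: error.message }),
    };
  }
}
```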


Token cost of tool calling

Tool definition overhead:
  ~100-200 tokens per tool (3 params, description)
  3 tools = ~450 tokens added to EVERY API call

Cost per interaction (improveBio example, 3 LLM calls):
  Call 1: Routing       ~$0.0025
  Call 2: Generation    ~$0.0015
  Call 3: Formatting    ~$0.004
  Total:                ~$0.008

At 100,000 interactions/day: ~$800/day

Reduce costs by:

  • Use cheaper model (gpt-4o-mini) for routing
  • Include only relevant tools per request
  • Use tool_choice: 'required' when intent is obvious
  • Cache results for identical inputs
  • Keep tool descriptions concise
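
The arithmetic above, spelled out (the per-call figures are the hypothetical prices from the example, not real provider pricing):

```javascript
// Hypothetical per-call costs in USD, from the improveBio example.
const perCall = { routing: 0.0025, generation: 0.0015, formatting: 0.004 };

const perInteraction = perCall.routing + perCall.generation + perCall.formatting;
const perDay = perInteraction * 100_000; // interactions per day

console.log(perInteraction.toFixed(4), perDay.toFixed(0)); // ~0.0080 per interaction, ~800/day
```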

tool_choice quick reference

| Value | Behavior | Use when |
| --- | --- | --- |
| 'auto' | Model decides | General-purpose assistant (default) |
| 'none' | No tools called | You want text-only despite tools being present |
| 'required' | Must call at least one tool | You know a tool is needed |
| { type: 'function', function: { name: '...' } } | Must call this specific tool | UI context makes intent unambiguous |

Production patterns

// Context-aware tool selection (premium vs free)
function getToolsForUser(user) {
  const base = [moderateTextTool, getProfileTipsTool];
  if (user.isPremium) {
    base.push(improveBioTool, generateOpenersTool);
  }
  return base;
}

// Rate limiting (per-user, per-tool)
const calls = new Map(); // key -> array of call timestamps

function checkRateLimit(userId, toolName, limit = 10, windowMs = 60000) {
  const key = `${userId}:${toolName}`;
  const recent = calls.get(key)?.filter((ts) => Date.now() - ts < windowMs) || [];
  if (recent.length >= limit) return { allowed: false };
  recent.push(Date.now());
  calls.set(key, recent);
  return { allowed: true };
}

Common gotchas

| Gotcha | Why |
| --- | --- |
| arguments is a JSON string, not an object | Must JSON.parse() before use; can be malformed |
| Mismatched tool_call_id | Tool result must reference the exact id from the assistant's tool call |
| content in a tool result must be a string | JSON.stringify() objects before returning |
| Tool definitions consume tokens on EVERY call | Include only the tools needed for the current context |
| LLM character counts are unreliable | Enforce limits in code (bio.slice(0, 500)), not in prompts |
| Too many tools degrade routing accuracy | 5--10 well-scoped tools outperform 30+ narrow ones |
| AI might hallucinate function names | Always check handlerMap[fnName] before executing |
| functions param is deprecated | Use the tools param (current standard) |
| Putting all business rules in the system prompt | Rules are probabilistic in prompts; deterministic in code |
| A tool that just calls the LLM again (no-op wrapper) | Anti-pattern: two LLM calls for zero benefit |

Testing routing accuracy

const testCases = [
  { input: 'Make my bio better: "I like coffee"', expectedTool: 'improveBio' },
  { input: 'Help me message a rock climber', expectedTool: 'generateOpeners' },
  { input: 'Is "Venmo me @john" safe?', expectedTool: 'moderateText' },
  { input: 'Thanks!', expectedTool: null },
];

// Run at temperature: 0 for deterministic results
// Test routing accuracy SEPARATELY from handler correctness
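
A minimal harness over those cases. The router is stubbed with regexes here (routeStub is illustrative); a real version would call the API at temperature: 0 and read the returned tool name:

```javascript
// Stub router: a real version would send messages + tools to the API
// and return the name of the first tool call (or null for plain text).
function routeStub(input) {
  if (/bio/i.test(input)) return 'improveBio';
  if (/message/i.test(input)) return 'generateOpeners';
  if (/safe/i.test(input)) return 'moderateText';
  return null;
}

// Fraction of cases where the router picked the expected tool.
function routingAccuracy(cases, route) {
  const passed = cases.filter((c) => route(c.input) === c.expectedTool).length;
  return passed / cases.length;
}

const testCases = [
  { input: 'Make my bio better: "I like coffee"', expectedTool: 'improveBio' },
  { input: 'Help me message a rock climber', expectedTool: 'generateOpeners' },
  { input: 'Is "Venmo me @john" safe?', expectedTool: 'moderateText' },
  { input: 'Thanks!', expectedTool: null },
];

console.log(routingAccuracy(testCases, routeStub)); // 1 means every case routed correctly
```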

Anti-patterns

| Anti-pattern | Problem | Fix |
| --- | --- | --- |
| Wrapping pure LLM tasks as tools | generatePoem() tool just calls the LLM again --- double cost, zero benefit | Let the LLM generate text directly |
| Too many tools (30+) | Overwhelms the model; degrades routing accuracy; increases token cost | Consolidate into fewer, more capable tools |
| No-op tools | thinkAboutResponse() tool that does nothing --- wasted latency | Remove; the model can reason without a tool |
| One field per tool | getName(), getEmail(), getPhone() --- 50 tools for one entity | Consolidate: getUserProfile(userId, fields[]) |
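
The consolidation fix from the last row, sketched as a single tool definition (all names and fields are illustrative):

```javascript
// One consolidated tool instead of getName(), getEmail(), getPhone(), ...
const getUserProfileTool = {
  type: 'function',
  function: {
    name: 'getUserProfile',
    description: 'Fetch one or more fields of a user profile',
    parameters: {
      type: 'object',
      properties: {
        userId: { type: 'string', description: 'ID of the user' },
        fields: {
          type: 'array',
          items: { type: 'string', enum: ['name', 'email', 'phone'] }, // constrained values
          description: 'Which profile fields to return',
        },
      },
      required: ['userId', 'fields'],
      additionalProperties: false,
    },
  },
};
```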

End of 4.7 quick revision.