Episode 4 — Generative AI Engineering / 4.7 — Function Calling / Tool Calling
4.7.a --- What Is Tool Calling
In one sentence: Tool calling (also called function calling) is a structured mechanism where the LLM decides which function to invoke and what arguments to pass, returning a machine-readable instruction that your code executes --- bridging the gap between AI reasoning and deterministic code execution.
Navigation: <- 4.7 Overview | 4.7.b --- When to Use Tool Calling ->
1. The Fundamental Problem
LLMs are text generators. They can reason, plan, and understand intent --- but they cannot execute code, call APIs, query databases, or perform calculations. They exist entirely in the world of text.
| What LLMs CAN do | What LLMs CANNOT do |
|---|---|
| Understand user intent | Execute JavaScript functions |
| Reason about which action fits | Query a database |
| Generate structured arguments | Call an external API |
| Format results as natural text | Perform precise math |
| Classify and route requests | Read/write files |
| | Access real-time data |
This creates a gap: the AI can figure out what needs to happen, but it cannot make it happen. Tool calling bridges this gap.
2. What Is Tool Calling?
Tool calling (also known as function calling) is a feature built into modern LLM APIs that allows the model to:
- Receive a list of available tools (functions) with their descriptions and parameter schemas.
- Analyze the user's message to determine if a tool should be called.
- Return a structured tool call with the function name and arguments --- instead of (or alongside) a text response.
- Receive the tool's execution result and use it to generate a final response.
The critical distinction: the model does not execute the function. It returns a structured instruction that tells your code which function to call and with what arguments. Your code then executes the function and feeds the result back.
+------------------------------------------------------------------------+
| TOOL CALLING ARCHITECTURE |
| |
| +-----------+ +----------+ +-----------+ +----------+ |
| | User | | Your | | LLM | | Your | |
| | Message | --> | Code | --> | API | --> | Code | |
| +-----------+ +----+-----+ +-----+-----+ +----+-----+ |
| | | | |
| | Sends message | | |
| | + tool defs | | |
| | | | |
| | Returns | | |
| | <-- tool_call | | |
| | (name+args) | | |
| | | | |
| | Executes function locally | |
| | -------------------------------->| |
| | | |
| | Sends result | | |
| | back to LLM -->| | |
| | | | |
| | Returns | | |
| | <-- final text | | |
| | for user | | |
+------------------------------------------------------------------------+
Key: The LLM NEVER executes the function.
It only DECIDES which function to call and generates the arguments.
YOUR CODE does the actual execution.
3. A Concrete Example
Imagine you are building a dating app assistant. The user can ask for different things:
- "Make my bio better" -> should call `improveBio()`
- "Give me some conversation starters" -> should call `generateOpeners()`
- "Is this message appropriate?" -> should call `moderateText()`
Without tool calling, you would need complex prompt engineering or regex parsing to figure out user intent and extract arguments. With tool calling, the model handles this naturally.
import OpenAI from 'openai';
const openai = new OpenAI();
// Step 1: Define the tools (functions the model can "call")
const tools = [
{
type: 'function',
function: {
name: 'improveBio',
description: 'Improve a user dating profile bio to be more engaging and authentic',
parameters: {
type: 'object',
properties: {
currentBio: {
type: 'string',
description: 'The user\'s current bio text',
},
tone: {
type: 'string',
enum: ['witty', 'sincere', 'adventurous', 'intellectual'],
description: 'Desired tone for the improved bio',
},
},
required: ['currentBio'],
},
},
},
{
type: 'function',
function: {
name: 'generateOpeners',
description: 'Generate conversation opener messages based on a profile',
parameters: {
type: 'object',
properties: {
profileDescription: {
type: 'string',
description: 'Description of the person\'s profile to generate openers for',
},
count: {
type: 'number',
description: 'Number of openers to generate (default 3)',
},
},
required: ['profileDescription'],
},
},
},
{
type: 'function',
function: {
name: 'moderateText',
description: 'Check if a message is appropriate for a dating platform',
parameters: {
type: 'object',
properties: {
text: {
type: 'string',
description: 'The message text to moderate',
},
},
required: ['text'],
},
},
},
];
// Step 2: Send user message + tools to the API
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'system',
content: 'You are a dating app assistant. Use the available tools to help users.',
},
{
role: 'user',
content: 'Can you make my bio sound better? Here it is: "I like hiking and coffee"',
},
],
tools, // <-- The model sees all available functions
tool_choice: 'auto', // <-- Let the model decide whether to call a tool
});
// Step 3: The model returns a tool call (NOT a text response)
const message = response.choices[0].message;
console.log(message.tool_calls);
// [
// {
// id: 'call_abc123',
// type: 'function',
// function: {
// name: 'improveBio', <-- Model chose THIS function
// arguments: '{"currentBio":"I like hiking and coffee","tone":"witty"}'
// }
// }
// ]
The model analyzed the user's message, determined that improveBio was the right function, extracted the bio text as an argument, and even inferred a tone. It did not execute the function --- it told your code what to execute.
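To complete the round trip, your code parses the arguments string, runs the matching local function, and appends a `role: 'tool'` message to the conversation before calling the model again. A minimal sketch of that step; the `improveBio` implementation here is a hypothetical stand-in for your real logic:

```typescript
// Local implementations of the tools. A real improveBio would apply your
// own rewriting logic; this stand-in is purely illustrative.
const implementations: Record<string, (args: any) => string> = {
  improveBio: ({ currentBio, tone }) =>
    `[${tone ?? 'sincere'}] ${currentBio} (improved)`,
};

// Turn one tool_call from the API response into the role:'tool' message
// you append to the conversation before the next model call.
function executeToolCall(toolCall: {
  id: string;
  function: { name: string; arguments: string };
}) {
  const fn = implementations[toolCall.function.name];
  if (!fn) throw new Error(`Unknown tool: ${toolCall.function.name}`);

  // Arguments arrive as a JSON *string*, not an object. Always parse
  // (and ideally validate) before executing.
  const args = JSON.parse(toolCall.function.arguments);
  const result = fn(args);

  return {
    role: 'tool' as const,
    tool_call_id: toolCall.id, // must match the id of the original call
    content: result,           // tool results go back to the model as strings
  };
}
```

Note that `JSON.parse` can still throw if the model produces malformed argument JSON, so production code should wrap it in a try/catch.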
4. How Tool Calling Differs from "Just Ask for JSON"
A common question: why not just prompt the model to return JSON? You could write:
Return a JSON object with "function" and "args" fields indicating which function to call.
Tool calling is fundamentally different and superior for several reasons:
Comparison table
| Aspect | Prompt for JSON | Tool Calling (API Feature) |
|---|---|---|
| Reliability | Model might wrap JSON in text, forget fields, use wrong types | API guarantees valid function name and argument structure |
| Schema enforcement | You describe the schema in natural language and hope | JSON Schema is enforced by the API; model is constrained |
| Training | Model not specifically trained to output your format | Model is specifically fine-tuned for tool call format |
| Multi-tool | Awkward to describe multiple functions in a prompt | Native support for arrays of tools |
| Parallel calls | Very difficult to get right with prompting | Built-in support for multiple simultaneous tool calls |
| Validation | You must parse, validate, and handle failures yourself | API returns structured tool_calls objects |
| Finish reason | Always `stop` --- you must parse the text yourself to find out | `finish_reason: 'tool_calls'` tells you explicitly |
| Conversation flow | Manual: you fake the multi-turn flow | Native tool role messages for returning results |
The JSON prompt approach (fragile)
// BAD: Prompting for JSON to simulate tool calling
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'system',
content: `You have these functions available:
1. improveBio(currentBio, tone) - improves a dating bio
2. generateOpeners(profileDescription, count) - creates openers
3. moderateText(text) - checks if text is appropriate
When the user needs one, return ONLY a JSON object like:
{"function": "improveBio", "args": {"currentBio": "...", "tone": "..."}}
Do not include any other text.`,
},
{ role: 'user', content: 'Make my bio better: "I like dogs"' },
],
});
const text = response.choices[0].message.content;
// Might return: {"function": "improveBio", "args": {"currentBio": "I like dogs"}}
// Might return: Sure! {"function": "improveBio", ...} <-- wrapped in text
// Might return: {"function": "improve_bio", ...} <-- wrong name
// Might return: {"func": "improveBio", ...} <-- wrong field name
The tool calling approach (reliable)
// GOOD: Using the tools parameter
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'system',
content: 'You are a dating app assistant. Use the available tools to help users.',
},
{ role: 'user', content: 'Make my bio better: "I like dogs"' },
],
tools, // <-- Defined with JSON schemas
});
// response.choices[0].message.tool_calls is ALWAYS a structured array
// The function name ALWAYS matches one of your defined tools
// The arguments ALWAYS follow your JSON schema
// No text wrapping, no wrong field names, no guessing
5. The Mental Model: AI as a Router
The best way to think about tool calling is: the LLM is a smart router. It takes unstructured human language and routes it to the right structured function.
+------------------------------------------------------------------------+
| AI AS A SMART ROUTER |
| |
| User says: AI routes to: |
| |
| "Fix my bio, it's boring" ---> improveBio(bio, tone) |
| "Help me start a convo ---> generateOpeners(profile, count) |
| with someone who likes |
| rock climbing" |
| "Is this message okay ---> moderateText(text) |
| to send?" |
| "What's the weather?" ---> No tool call (outside scope, |
| responds with text) |
| "Thanks!" ---> No tool call (just conversation, |
| responds with text) |
| |
| The AI understands: |
| - WHICH function matches the user's intent |
| - WHAT arguments to extract from the user's message |
| - WHEN no function is needed (pure conversation) |
+------------------------------------------------------------------------+
This router analogy is powerful because it clarifies the separation of concerns:
- AI's job: Understand intent, choose function, extract arguments.
- Your code's job: Execute the function, handle errors, enforce business rules.
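That separation of concerns can be made concrete. A hedged sketch of the "your code's job" half: a registry mapping tool names to handlers, with validation and business rules living entirely outside the model (the handler body and the length rule are assumptions for illustration):

```typescript
type Handler = (args: Record<string, unknown>) => string;

// Registry: tool name -> local handler. Handler bodies are illustrative.
const registry: Record<string, Handler> = {
  moderateText: (args) => {
    const text = String(args.text ?? '');
    // Business rule enforced by YOUR code, not the model (assumed rule).
    if (text.length > 500) return 'rejected: message too long';
    return 'ok';
  },
};

// Route a tool call to its handler, defending against bad input.
function route(name: string, rawArgs: string): string {
  const handler = registry[name];
  // Guard against a name you never defined. With the tools parameter this
  // should not happen, but validate anyway.
  if (!handler) return `error: unknown tool "${name}"`;
  try {
    return handler(JSON.parse(rawArgs));
  } catch {
    return 'error: arguments were not valid JSON';
  }
}
```

Returning error strings (rather than throwing) lets you feed failures back to the model as tool results so it can recover gracefully.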
6. Terminology: Function Calling vs Tool Calling
You will see both terms used, sometimes interchangeably. Here is the distinction:
| Term | Meaning | Status |
|---|---|---|
| Function calling | The original OpenAI term (June 2023). Used the functions parameter. | Deprecated in newer API versions |
| Tool calling | The current standard term. Uses the tools parameter. Tools can be functions, but the abstraction allows for other types in the future. | Current standard |
| Tool use | Anthropic's (Claude) term for the same concept. | Anthropic ecosystem |
In practice, "function calling" and "tool calling" mean the same thing. The industry is converging on "tool calling" as the standard term. Throughout this section, we use both interchangeably but prefer "tool calling" for new code.
// OLD (deprecated): functions parameter
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages,
functions: [{ name: 'getWeather', parameters: {...} }], // Deprecated
function_call: 'auto', // Deprecated
});
// CURRENT: tools parameter
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages,
tools: [{ type: 'function', function: { name: 'getWeather', parameters: {...} } }],
tool_choice: 'auto', // Current standard
});
7. Which Models Support Tool Calling?
Not all models support tool calling. Here is the current landscape:
| Provider | Models | Tool Calling Support |
|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5-turbo (newer) | Full support via tools parameter |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku, Claude 4 | Full support via tools parameter (called "tool use") |
| Google | Gemini 1.5 Pro, Gemini 1.5 Flash | Full support via function_declarations |
| Meta | Llama 3.1+ (via providers) | Supported through hosting platforms |
| Mistral | Mistral Large, Mistral Medium | Supported via tools parameter |
| Open source | Varies | Some models fine-tuned for tool calling; not universal |
The examples in this section use the OpenAI API format, which is the most widely adopted. The concepts translate directly to other providers.
8. The Tool Calling Lifecycle (Overview)
Here is the full lifecycle that we will explore in detail in sections 4.7.c through 4.7.e:
+------------------------------------------------------------------------+
| COMPLETE TOOL CALLING LIFECYCLE |
| |
| PHASE 1: SETUP (done once) |
| +------------------------------------------------------------------+ |
| | Define tools with JSON schemas | |
| | Each tool has: name, description, parameters (JSON Schema) | |
| +------------------------------------------------------------------+ |
| |
| PHASE 2: REQUEST (every user message) |
| +------------------------------------------------------------------+ |
| | Send to LLM API: | |
| | - messages (system + user + history) | |
| | - tools (array of tool definitions) | |
| | - tool_choice ('auto' | 'required' | 'none' | specific) | |
| +------------------------------------------------------------------+ |
| |
| PHASE 3: MODEL DECISION |
| +------------------------------------------------------------------+ |
| | Model analyzes the message and decides: | |
| | A) No tool needed -> returns text response (finish_reason:stop) | |
| | B) Tool needed -> returns tool_calls (finish_reason:tool_calls) | |
| | C) Multiple tools -> returns parallel tool_calls | |
| +------------------------------------------------------------------+ |
| |
| PHASE 4: EXECUTION (if tool was called) |
| +------------------------------------------------------------------+ |
| | Your code: | |
| | 1. Reads the function name and arguments | |
| | 2. Validates the arguments | |
| | 3. Executes the actual function | |
| | 4. Handles any errors | |
| +------------------------------------------------------------------+ |
| |
| PHASE 5: RESULT RETURN |
| +------------------------------------------------------------------+ |
| | Send tool result back to the LLM: | |
| | - role: 'tool' | |
| | - tool_call_id: matches the original call | |
| | - content: the function's result (as string) | |
| +------------------------------------------------------------------+ |
| |
| PHASE 6: FINAL RESPONSE |
| +------------------------------------------------------------------+ |
| | Model uses the tool result to generate a natural-language | |
| | response for the user. | |
| +------------------------------------------------------------------+ |
+------------------------------------------------------------------------+
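The six phases above collapse into a small loop: call the model, execute any tool calls, append the `role: 'tool'` results, and repeat until the model returns plain text. A sketch of that control flow, with the model call injected as a parameter (in real code it would wrap `openai.chat.completions.create`); the message shapes follow the OpenAI Chat Completions format, and `executeTool` is an assumed helper:

```typescript
type ToolCall = { id: string; function: { name: string; arguments: string } };
type Message = {
  role: string;
  content: string | null;
  tool_calls?: ToolCall[];
  tool_call_id?: string;
};
type ModelFn = (messages: Message[]) => {
  message: Message;
  finish_reason: string;
};

// Phases 2-6 as a loop. `callModel` stands in for the API call so the
// flow is visible without the network; `executeTool` runs your local code.
function runToolLoop(
  messages: Message[],
  callModel: ModelFn,
  executeTool: (call: ToolCall) => string,
): string {
  for (;;) {
    const { message, finish_reason } = callModel(messages);
    messages.push(message);
    // Phase 3: no tool needed -> we have the final text for the user.
    if (finish_reason !== 'tool_calls') return message.content ?? '';
    // Phases 4-5: execute each call and return its result as a tool message.
    for (const call of message.tool_calls ?? []) {
      messages.push({
        role: 'tool',
        tool_call_id: call.id,
        content: executeTool(call),
      });
    }
  }
}
```

Because the loop repeats, this same code handles chained calls, where the model needs the result of one tool before deciding to call another.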
9. Why Tool Calling Is a Game-Changer
Tool calling transforms what you can build with LLMs:
Before tool calling
User: "What's my account balance?"
AI: "I don't have access to your account information.
Please check your banking app."
After tool calling
User: "What's my account balance?"
AI decides: call getAccountBalance(userId: "user_123")
Code executes: queries database, returns $4,523.67
AI responds: "Your current account balance is $4,523.67."
The transformation
| Before Tool Calling | After Tool Calling |
|---|---|
| AI can only talk about things | AI can do things (via your code) |
| Limited to information in training data | Can access real-time data |
| Can only suggest actions | Can execute actions |
| Pure text in, pure text out | Text in, structured action + text out |
| Chatbot | Agent |
This is the bridge from chatbots to AI agents, and it starts here.
10. Key Takeaways
- Tool calling lets the LLM decide which function to call and what arguments to pass --- but YOUR code executes the function.
- It is fundamentally different from prompting for JSON --- it uses API-level schema enforcement, fine-tuned model behavior, and structured response objects.
- Think of the LLM as a smart router: it takes natural language and maps it to structured function calls.
- The lifecycle is: define tools -> send to API -> model returns tool_calls -> execute -> return result -> model generates final response.
- Tool calling is the foundation for AI agents --- it transforms LLMs from text generators into systems that can take actions.
Explain-It Challenge
- A colleague says "I can just prompt the model to return JSON with the function name --- why do I need tool calling?" Explain three specific advantages of the API-level tool calling feature.
- Draw the tool calling lifecycle for this scenario: a user says "Check if this message is appropriate: Hey gorgeous, want to meet up?" and your system has a `moderateText()` function available.
- Why does the LLM not execute the function itself? What would go wrong if it could?
Navigation: <- 4.7 Overview | 4.7.b --- When to Use Tool Calling ->