4.7 --- Function Calling / Tool Calling: Quick Revision
Compact cheat sheet. Print-friendly.
How to use this material (instructions)
- Skim before labs or interviews.
- Drill gaps --- reopen README.md, then 4.7.a through 4.7.e.
- Practice --- 4.7-Exercise-Questions.md.
- Polish answers --- 4.7-Interview-Questions.md.
Core principle
LLMs generate text --- they CANNOT execute code, call APIs, or query databases.
Tool calling bridges this gap:
- AI decides WHAT function to call + arguments (probabilistic reasoning)
- Your code decides HOW to execute it (deterministic logic)
Core vocabulary
| Term | One-liner |
|---|---|
| Tool calling | API feature where the model returns a structured function name + arguments for your code to execute |
| Function calling | Original (deprecated) OpenAI term for the same concept; used the functions parameter |
| Tool use | Anthropic's (Claude) term for tool calling |
| tools | API parameter --- array of tool definitions (JSON Schema) sent with each request |
| tool_choice | Controls routing: 'auto' (model decides), 'none' (no tools), 'required' (must call), or { type: 'function', function: { name: '...' } } |
| tool_calls | Array in the assistant's response containing function name, arguments, and unique call ID |
| tool_call_id | Unique ID that links a tool result back to the tool call that requested it |
| finish_reason | "tool_calls" when the model wants a function executed; "stop" for text responses |
| Router pattern | Architecture where the LLM classifies intent and dispatches to the right handler function |
| Hybrid logic | AI handles reasoning (what) + code handles execution (how) --- the core production pattern |
The six-step tool calling flow
Step 1: DEFINE tools with JSON Schema (name, description, parameters)
Step 2: SEND messages + tools + tool_choice to LLM API
Step 3: MODEL returns tool_calls (finish_reason: "tool_calls")
OR text response (finish_reason: "stop")
Step 4: EXECUTE function in YOUR code (parse args, validate, call handler)
Step 5: RETURN result to model via { role: 'tool', tool_call_id, content }
Step 6: MODEL generates final natural-language response using the result
Tool definition structure
```js
const tools = [
  {
    type: 'function',
    function: {
      name: 'improveBio',
      description: 'Improve a dating profile bio',
      parameters: {
        type: 'object',
        properties: {
          currentBio: { type: 'string', description: 'Current bio text' },
          tone: {
            type: 'string',
            enum: ['witty', 'sincere', 'adventurous'],
          },
        },
        required: ['currentBio'],
        additionalProperties: false,
      },
    },
  },
];
```
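Schema constraints like required and enum can also be enforced in your own code before execution. A minimal sketch for the improveBio schema above --- validateArgs is an illustrative helper, not part of any SDK; in production you would use a JSON Schema validation library:

```js
// Hand-rolled validator mirroring the improveBio parameter schema
// (required currentBio, enum tone, no extra fields). Illustrative only.
function validateArgs(args) {
  const errors = [];
  if (typeof args.currentBio !== 'string') {
    errors.push('currentBio is required and must be a string');
  }
  const tones = ['witty', 'sincere', 'adventurous'];
  if (args.tone !== undefined && !tones.includes(args.tone)) {
    errors.push(`tone must be one of: ${tones.join(', ')}`);
  }
  const allowed = ['currentBio', 'tone'];
  for (const key of Object.keys(args)) {
    if (!allowed.includes(key)) errors.push(`unexpected field: ${key}`);
  }
  return { ok: errors.length === 0, errors };
}
```

Models usually respect the schema, but the check is cheap insurance --- the model's output is probabilistic, your validator is not.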
Tool definition best practices
| Practice | Why |
|---|---|
| Detailed description with trigger phrases | Model uses description to decide which tool to call |
| Use enum for constrained values | Prevents model from inventing invalid values |
| Mark required fields | Model knows which arguments it must provide |
| Set additionalProperties: false | Prevents unexpected extra fields |
| Add description to every property | Helps model extract the right data from user input |
| Keep 3--5 parameters per tool | More parameters = more chances for errors |
Message structure for tool calling
Step 2 request --- conversation so far plus tool definitions:

```js
messages: [
  { role: 'system', content: '...' },
  { role: 'user', content: 'Improve my bio: "I like hiking"' },
]
```

Step 3 response --- the assistant asks for a tool call:

```js
{
  role: 'assistant',
  content: null,
  tool_calls: [{
    id: 'call_abc123',
    type: 'function',
    function: {
      name: 'improveBio',
      arguments: '{"currentBio":"I like hiking","tone":"witty"}'
    }
  }]
}
```

Step 5 follow-up --- append the assistant message, then the tool result:

```js
messages: [
  { role: 'system', content: '...' },
  { role: 'user', content: '...' },
  { role: 'assistant', content: null, tool_calls: [{ ... }] },
  {
    role: 'tool',
    tool_call_id: 'call_abc123',
    content: '{"improved":"..."}',
  },
]
```
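The bookkeeping in the tool-result message is easy to get wrong (string content, exact ID match). A small sketch of a helper that builds a correctly linked result --- makeToolResult is illustrative, not an SDK function:

```js
// Build a tool-result message linked to the tool call that requested it.
// makeToolResult is a hypothetical helper, shown for illustration only.
function makeToolResult(toolCall, result) {
  return {
    role: 'tool',
    tool_call_id: toolCall.id,        // must match the assistant's call id exactly
    content: JSON.stringify(result),  // content must be a string, not an object
  };
}

const toolCall = {
  id: 'call_abc123',
  type: 'function',
  function: { name: 'improveBio', arguments: '{"currentBio":"I like hiking"}' },
};
const msg = makeToolResult(toolCall, { improved: 'Avid hiker seeking trail partner' });
```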
When to use tool calling vs JSON mode vs plain text
| Scenario | Approach | Why |
|---|---|---|
| "Write me a poem" | Plain text | Text IS the output |
| "Classify this email as spam" | Structured output (JSON mode) | LLM judgment is the result; no function needed |
| "What is my account balance?" | Tool calling | Requires database query |
| "Schedule a meeting at 2pm" | Tool calling | Requires creating an event (side effect) |
| "Calculate 15% tip on $47.83" | Tool calling | LLMs are unreliable at math |
| "Improve my dating bio" | Tool calling | Needs business rules, validation, logging |
| "Thanks for helping!" | Plain text | Conversational; no action needed |
Decision rule: If the task requires data retrieval, API calls, calculations, mutations, or enforcing business rules --- use tool calling. If the LLM's text output IS the result --- skip it.
Decision flowchart
```
Does the task require data NOT in the LLM's training?
  YES --> Tool calling (data retrieval)
  NO  --> Does the task require a SIDE EFFECT?
    YES --> Tool calling (action tool)
    NO  --> Does the task require PRECISE COMPUTATION?
      YES --> Tool calling (calculation tool)
      NO  --> Does the task require EXACT BUSINESS RULES?
        YES --> Tool calling (business logic tool)
        NO  --> Is the output STRUCTURED DATA for downstream code?
          YES --> Structured output (JSON mode)
          NO  --> Plain text generation
```
Deterministic invocation --- key code
function safeParseArguments(argsString) {
try {
return { success: true, data: JSON.parse(argsString) };
} catch (error) {
return { success: false, error: error.message };
}
}
const functionMap = { improveBio, generateOpeners, moderateText };
const toolCall = message.tool_calls[0];
const parsed = safeParseArguments(toolCall.function.arguments);
if (!parsed.success) {
}
if (!functionMap[toolCall.function.name]) {
}
const result = await functionMap[toolCall.function.name](parsed.data);
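This dispatch pattern can be exercised standalone with stub handlers in place of the real ones (the stub bodies below are made up; safeParseArguments is repeated so the sketch runs by itself):

```js
// Stub handler standing in for the real improveBio implementation.
const functionMap = {
  improveBio: (args) => ({ improved: args.currentBio.toUpperCase() }),
};

function safeParseArguments(argsString) {
  try {
    return { success: true, data: JSON.parse(argsString) };
  } catch (error) {
    return { success: false, error: error.message };
  }
}

// Dispatch one tool call; bad JSON and unknown names become error results.
function dispatch(toolCall) {
  const parsed = safeParseArguments(toolCall.function.arguments);
  if (!parsed.success) return { error: parsed.error };
  const handler = functionMap[toolCall.function.name];
  if (!handler) return { error: `Unknown function: ${toolCall.function.name}` };
  return handler(parsed.data);
}
```

Note that every failure path produces a value rather than an exception, which is what lets you hand errors back to the model as tool results.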
Parallel tool calls
```js
const toolResults = await Promise.all(
  assistantMessage.tool_calls.map(async (tc) => {
    const args = JSON.parse(tc.function.arguments);
    const result = await functionMap[tc.function.name](args);
    return {
      role: 'tool',
      tool_call_id: tc.id,
      content: JSON.stringify(result),
    };
  })
);
```
Hybrid logic patterns
| Pattern | What happens | Best for |
|---|---|---|
| AI routes, code executes entirely | Function is pure deterministic logic (regex, DB query, math) | moderateText(), getAccountBalance(), calculateTip() |
| AI routes, code orchestrates AI | Function uses another LLM call with a specialized prompt, wrapped in guardrails | improveBio(), generateOpeners(), content generation |
| AI routes, code chains steps | Function runs a multi-step pipeline: validate, generate, filter, save, log | processProfileUpdate(), complex workflows |
The AI decision boundary
BEFORE the boundary (AI's domain):
- Understand natural language
- Classify intent (which function?)
- Extract arguments from text
- Handle ambiguity
AFTER the boundary (Code's domain):
- Validate input (length, format, required fields)
- Enforce business rules (character limits, banned words)
- Query databases (user status, premium tier)
- Execute computations (exact math)
- Call external APIs
- Log analytics, enforce rate limits, bill users
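The code side of the boundary can be sketched as a deterministic rule check; the specific limits and banned-word list below are made-up examples, not rules from any real product:

```js
// Deterministic business rules applied AFTER the model has routed the request.
// MAX_BIO_LENGTH and BANNED are illustrative values only.
const MAX_BIO_LENGTH = 500;
const BANNED = ['venmo', 'cashapp'];

function enforceBioRules(bio) {
  const trimmed = bio.slice(0, MAX_BIO_LENGTH); // enforce length in code, not in prompts
  const lower = trimmed.toLowerCase();
  const violations = BANNED.filter((word) => lower.includes(word));
  return { bio: trimmed, ok: violations.length === 0, violations };
}
```

A prompt saying "keep bios under 500 characters" is a suggestion; bio.slice(0, 500) is a guarantee.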
Tool router architecture
```
User Message
     |
     v
Input Validator ---- rejects empty/too-long messages
     |
     v
LLM Router --------- decides which tool(s) to call
     |
     v
Tool Handlers ------ execute function with validation
     |
     v
Result Logger ------ logs tool call + result
     |
     v
LLM Formatter ------ turns result into natural response
     |
     v
Final response to user
```
Error handling layers (production router)
Layer 1: Input Validation --- reject bad input before API call
Layer 2: API Errors --- try/catch, retry with backoff
Layer 3: Argument Parsing --- safeParseArguments(), return error as tool result
Layer 4: Unknown Functions --- check handlerMap, list available functions
Layer 5: Function Execution --- try/catch around handler, return error as tool result
Layer 6: Result Validation --- truncate oversized results (max ~4000 chars)
Layer 7: Final Response Fallback --- if formatting LLM fails, return raw results
Key rule: Return errors as tool role messages, not thrown exceptions. The model can then explain the problem naturally to the user.
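Layers 3 through 5 can be sketched in one wrapper; executeToolCall is an illustrative function (with a synchronous handler for brevity), not an SDK API:

```js
// Wrap parsing, lookup, and execution so every failure becomes a tool-role
// message instead of a thrown exception. Illustrative sketch.
function executeToolCall(toolCall, handlers) {
  const reply = (payload) => ({
    role: 'tool',
    tool_call_id: toolCall.id,
    content: JSON.stringify(payload),
  });

  let args;
  try {
    args = JSON.parse(toolCall.function.arguments); // Layer 3: argument parsing
  } catch (e) {
    return reply({ error: `Malformed arguments: ${e.message}` });
  }

  const handler = handlers[toolCall.function.name]; // Layer 4: unknown functions
  if (!handler) {
    return reply({ error: `Unknown function. Available: ${Object.keys(handlers).join(', ')}` });
  }

  try {
    return reply(handler(args)); // Layer 5: function execution
  } catch (e) {
    return reply({ error: `Execution failed: ${e.message}` });
  }
}
```

Because every branch returns a tool message, the follow-up LLM call can turn any failure into a natural apology or retry.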
Token cost of tool calling
Tool definition overhead:
~100-200 tokens per tool (3 params, description)
3 tools = ~450 tokens added to EVERY API call
Cost per interaction (improveBio example, 3 LLM calls):
Call 1: Routing ~$0.0025
Call 2: Generation ~$0.0015
Call 3: Formatting ~$0.004
Total: ~$0.008
At 100,000 interactions/day: ~$800/day
Reduce costs by:
- Use cheaper model (gpt-4o-mini) for routing
- Include only relevant tools per request
- Use tool_choice: 'required' when intent is obvious
- Cache results for identical inputs
- Keep tool descriptions concise
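The cost arithmetic above is easy to verify directly (per-call prices are this section's example figures, not current provider pricing):

```js
// Reproduce the improveBio cost estimate: three LLM calls per interaction.
const routing = 0.0025;
const generation = 0.0015;
const formatting = 0.004;

const perInteraction = routing + generation + formatting; // ~$0.008
const perDay = perInteraction * 100000;                   // 100,000 interactions/day
```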
tool_choice quick reference
| Value | Behavior | Use when |
|---|---|---|
| 'auto' | Model decides | General-purpose assistant (default) |
| 'none' | No tools called | You want text-only despite tools being present |
| 'required' | Must call at least one tool | You know a tool is needed |
| { type: 'function', function: { name: '...' } } | Must call this specific tool | UI context makes intent unambiguous |
Production patterns
```js
// Gate premium tools by user tier --- include only relevant tools per request.
function getToolsForUser(user) {
  const base = [moderateTextTool, getProfileTipsTool];
  if (user.isPremium) {
    base.push(improveBioTool, generateOpenersTool);
  }
  return base;
}

// Sliding-window rate limiter keyed by user + tool.
const calls = new Map();

function checkRateLimit(userId, toolName, limit = 10, windowMs = 60000) {
  const key = `${userId}:${toolName}`;
  const recent = calls.get(key)?.filter((ts) => Date.now() - ts < windowMs) || [];
  if (recent.length >= limit) return { allowed: false };
  recent.push(Date.now());
  calls.set(key, recent);
  return { allowed: true };
}
```
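A quick standalone exercise of the sliding-window limiter (re-declared here so the sketch runs by itself): the first ten calls inside the window pass and the eleventh is rejected.

```js
// In-memory sliding-window rate limiter, keyed by user + tool.
const calls = new Map();

function checkRateLimit(userId, toolName, limit = 10, windowMs = 60000) {
  const key = `${userId}:${toolName}`;
  const recent = calls.get(key)?.filter((ts) => Date.now() - ts < windowMs) || [];
  if (recent.length >= limit) return { allowed: false };
  recent.push(Date.now());
  calls.set(key, recent);
  return { allowed: true };
}

// 11 rapid calls: 10 allowed, then blocked.
const results = [];
for (let i = 0; i < 11; i++) {
  results.push(checkRateLimit('u1', 'improveBio').allowed);
}
```

In production you would back this with Redis or similar so limits survive restarts and hold across instances; the Map version is single-process only.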
Common gotchas
| Gotcha | Why |
|---|---|
| arguments is a JSON string, not an object | Must JSON.parse() before use; can be malformed |
| Mismatched tool_call_id | Tool result must reference the exact id from the assistant's tool call |
| content in tool result must be a string | JSON.stringify() objects before returning |
| Tool definitions consume tokens on EVERY call | Include only tools needed for the current context |
| LLM character counts are unreliable | Enforce limits in code (bio.slice(0, 500)), not in prompts |
| Too many tools degrade routing accuracy | 5--10 well-scoped tools outperform 30+ narrow ones |
| AI might hallucinate function names | Always check handlerMap[fnName] before executing |
| functions param is deprecated | Use tools param (current standard) |
| Putting all business rules in the system prompt | Rules are probabilistic in prompts; deterministic in code |
| Tool that just calls the LLM again (no-op wrapper) | Anti-pattern: two LLM calls for zero benefit |
Testing routing accuracy
```js
const testCases = [
  { input: 'Make my bio better: "I like coffee"', expectedTool: 'improveBio' },
  { input: 'Help me message a rock climber', expectedTool: 'generateOpeners' },
  { input: 'Is "Venmo me @john" safe?', expectedTool: 'moderateText' },
  { input: 'Thanks!', expectedTool: null },
];
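A harness over these cases can be sketched with a mocked router; mockRoute below is a keyword stand-in for the real LLM call, used only so the loop is runnable offline:

```js
// mockRoute stands in for the LLM router; a real harness would call the API
// and read tool_calls[0].function.name from the response.
function mockRoute(input) {
  if (/bio/i.test(input)) return 'improveBio';
  if (/message/i.test(input)) return 'generateOpeners';
  if (/safe/i.test(input)) return 'moderateText';
  return null; // plain conversational reply, no tool
}

const testCases = [
  { input: 'Make my bio better: "I like coffee"', expectedTool: 'improveBio' },
  { input: 'Help me message a rock climber', expectedTool: 'generateOpeners' },
  { input: 'Is "Venmo me @john" safe?', expectedTool: 'moderateText' },
  { input: 'Thanks!', expectedTool: null },
];

const passed = testCases.filter((tc) => mockRoute(tc.input) === tc.expectedTool).length;
const accuracy = passed / testCases.length;
```

Run the same loop against the real router whenever you add or reword a tool; routing accuracy regresses silently as descriptions change.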
Anti-patterns
| Anti-pattern | Problem | Fix |
|---|---|---|
| Wrapping pure LLM tasks as tools | generatePoem() tool just calls the LLM again --- double cost, zero benefit | Let the LLM generate text directly |
| Too many tools (30+) | Overwhelms the model; degrades routing accuracy; increases token cost | Consolidate into fewer, more capable tools |
| No-op tools | thinkAboutResponse() tool that does nothing --- wasted latency | Remove; the model can reason without a tool |
| One field per tool | getName(), getEmail(), getPhone() --- 50 tools for one entity | Consolidate: getUserProfile(userId, fields[]) |
End of 4.7 quick revision.