Episode 4 — Generative AI Engineering / 4.17 — LangChain Practical
4.17.c — Tools and Memory
In one sentence: LangChain tools give your LLM the ability to take actions in the real world (search the web, query databases, call APIs), while memory modules let your chains remember previous interactions so conversations feel continuous rather than starting fresh every turn.
Navigation: <- 4.17.b Chains and Prompt Templates | 4.17.d — Working with Agents ->
1. LangChain Tools: Built-in and Custom
A tool in LangChain is a function that an LLM can decide to call. Each tool has a name, a description (which the LLM reads to decide when to use it), and an implementation (the actual function that runs).
Why tools matter
Without tools, an LLM can only generate text based on its training data. With tools, it can:
- Search the web for current information
- Query a database for real-time data
- Perform calculations with guaranteed accuracy
- Call external APIs (send emails, create tickets, update records)
- Read and write files
Built-in tools
LangChain provides ready-made tools for common tasks:
import { TavilySearchResults } from '@langchain/community/tools/tavily_search';
import { Calculator } from '@langchain/community/tools/calculator';
import { WikipediaQueryRun } from '@langchain/community/tools/wikipedia_query_run';
// Web search tool
const searchTool = new TavilySearchResults({
maxResults: 3,
apiKey: process.env.TAVILY_API_KEY
});
// Calculator tool
const calculatorTool = new Calculator();
// Wikipedia tool
const wikiTool = new WikipediaQueryRun({
topKResults: 1,
maxDocContentLength: 1000
});
// Each tool has a name, description, and can be invoked
console.log(searchTool.name); // "tavily_search_results_json"
console.log(searchTool.description); // "A search engine. Useful for..."
const searchResults = await searchTool.invoke('LangChain latest version 2025');
console.log(searchResults);
The tool interface
Every LangChain tool follows this interface:
Tool:
name: string — Unique identifier (the LLM uses this to call the tool)
description: string — Natural language description (the LLM reads this to decide WHEN to use it)
schema: ZodSchema — Input schema (what arguments the tool accepts)
invoke(input): output — The actual function that runs
The description is critical — together with the name and input schema, it is all the LLM sees when choosing a tool. A vague description leads to wrong tool selection; a precise description leads to correct tool selection.
2. Creating Custom Tools
Using DynamicTool (simplest approach)
import { DynamicTool } from '@langchain/core/tools';
const weatherTool = new DynamicTool({
name: 'get_weather',
description: 'Get the current weather for a given city. Input should be a city name like "London" or "New York".',
func: async (cityName) => {
// In production, call a real weather API
const response = await fetch(
`https://api.weatherapi.com/v1/current.json?key=${process.env.WEATHER_API_KEY}&q=${encodeURIComponent(cityName)}`
);
const data = await response.json();
return JSON.stringify({
city: data.location.name,
temperature: data.current.temp_c,
condition: data.current.condition.text
});
}
});
// Test the tool directly
const weather = await weatherTool.invoke('London');
console.log(weather);
// {"city":"London","temperature":15,"condition":"Partly cloudy"}
Using DynamicStructuredTool (with typed input)
import { DynamicStructuredTool } from '@langchain/core/tools';
import { z } from 'zod';
const databaseQueryTool = new DynamicStructuredTool({
name: 'query_database',
description: 'Query the product database. Use this to look up product information by name or category.',
schema: z.object({
query: z.string().describe('The search query for products'),
category: z.string().optional().describe('Optional category filter: electronics, clothing, food'),
limit: z.number().default(5).describe('Maximum number of results to return')
}),
func: async ({ query, category, limit }) => {
// In production, query your actual database
const results = await db.products.search({
text: query,
category: category,
limit: limit
});
return JSON.stringify(results);
}
});
Using the tool() function (modern approach)
import { tool } from '@langchain/core/tools';
import { z } from 'zod';
const calculateDiscount = tool(
async ({ originalPrice, discountPercent }) => {
const discount = originalPrice * (discountPercent / 100);
const finalPrice = originalPrice - discount;
return JSON.stringify({
originalPrice,
discountPercent,
discountAmount: discount.toFixed(2),
finalPrice: finalPrice.toFixed(2)
});
},
{
name: 'calculate_discount',
description: 'Calculate the final price after applying a percentage discount.',
schema: z.object({
originalPrice: z.number().describe('The original price in dollars'),
discountPercent: z.number().min(0).max(100).describe('The discount percentage (0-100)')
})
}
);
const result = await calculateDiscount.invoke({
originalPrice: 99.99,
discountPercent: 20
});
console.log(result);
// {"originalPrice":99.99,"discountPercent":20,"discountAmount":"20.00","finalPrice":"79.99"}
Tool design best practices
| Practice | Why |
|---|---|
| Descriptive names | search_product_catalog > search — the LLM needs context |
| Detailed descriptions | Include what the tool does, when to use it, and what input format is expected |
| Structured schemas | Use Zod schemas with .describe() on each field — the LLM sees these descriptions |
| Return strings | Tools should return serialized strings (JSON) — LLMs process text |
| Handle errors gracefully | Return error messages as strings rather than throwing — the agent can recover from a string error |
| Keep tools focused | One tool per action — don't make a tool that does 5 different things |
// BAD: Vague description
const tool1 = new DynamicTool({
name: 'api',
description: 'Calls an API',
func: async (input) => { /* ... */ }
});
// GOOD: Precise description
const tool2 = new DynamicTool({
name: 'get_order_status',
description: 'Look up the current status of a customer order. Input must be an order ID like "ORD-12345". Returns the order status (pending, shipped, delivered, cancelled) and estimated delivery date.',
func: async (orderId) => {
try {
const order = await orderService.getStatus(orderId);
return JSON.stringify(order);
} catch (error) {
return `Error: Could not find order ${orderId}. Please verify the order ID format (e.g., ORD-12345).`;
}
}
});
3. Memory Modules
LLMs are stateless — each API call is independent. Without memory, every conversation turn starts from scratch. LangChain's memory modules solve this by storing and injecting conversation history automatically.
The memory problem
Turn 1: User: "My name is Alex." → AI: "Nice to meet you, Alex!"
Turn 2: User: "What is my name?" → AI: "I don't know your name." (no memory!)
Why? Each API call is independent. The model doesn't "remember" turn 1.
The solution: explicitly pass conversation history with each request.
Memory modules automate this.
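It is worth seeing this bookkeeping done by hand once. The sketch below (using a hypothetical `buildMessages` helper, not a LangChain API) shows what every memory module automates: accumulate the transcript and prepend it to each request.

```typescript
type Msg = { role: 'system' | 'user' | 'assistant'; content: string };

// Hypothetical helper: assemble the full message list sent on every API call.
function buildMessages(history: Msg[], userInput: string): Msg[] {
  return [
    { role: 'system', content: 'You are a helpful assistant.' },
    ...history,
    { role: 'user', content: userInput }
  ];
}

const history: Msg[] = [];

// Turn 1: history is empty, so only system + user messages are sent.
let msgs = buildMessages(history, 'My name is Alex.');
// ...send msgs to the model, then record both sides of the exchange...
history.push({ role: 'user', content: 'My name is Alex.' });
history.push({ role: 'assistant', content: 'Nice to meet you, Alex!' });

// Turn 2: the model now sees turn 1, so "What is my name?" is answerable.
msgs = buildMessages(history, 'What is my name?');
console.log(msgs.length); // 4: system, turn-1 user, turn-1 assistant, new user
```

Doing this by hand means manually loading, injecting, and saving history on every turn; the memory modules below handle each of those steps for you.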
BufferMemory — store everything
The simplest memory: stores every message in the conversation verbatim.
import { BufferMemory } from 'langchain/memory';
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
const memory = new BufferMemory({
returnMessages: true, // Return as Message objects (not a string)
memoryKey: 'history' // Key used in the prompt template
});
const prompt = ChatPromptTemplate.fromMessages([
['system', 'You are a friendly assistant. Keep track of the conversation.'],
new MessagesPlaceholder('history'),
['user', '{input}']
]);
const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0.7 });
const parser = new StringOutputParser();
// Build a chain with memory
async function chat(input) {
// Load memory variables
const memoryVariables = await memory.loadMemoryVariables({});
// Run the chain
const chain = prompt.pipe(model).pipe(parser);
const response = await chain.invoke({
input: input,
history: memoryVariables.history
});
// Save the new exchange to memory
await memory.saveContext(
{ input: input },
{ output: response }
);
return response;
}
// Multi-turn conversation
console.log(await chat('My name is Alex and I work at Acme Corp.'));
// "Nice to meet you, Alex! How's everything at Acme Corp?"
console.log(await chat('What is my name?'));
// "Your name is Alex!"
console.log(await chat('Where do I work?'));
// "You work at Acme Corp!"
Tradeoff: BufferMemory stores every message. After 50 turns, the history consumes thousands of tokens. In long conversations, this will exceed the context window.
ConversationSummaryMemory — compress old history
Instead of storing every message, this memory keeps a running summary of the conversation. It uses an LLM to summarize older turns, keeping the history compact.
import { ConversationSummaryMemory } from 'langchain/memory';
import { ChatOpenAI } from '@langchain/openai';
const memory = new ConversationSummaryMemory({
llm: new ChatOpenAI({ modelName: 'gpt-4o-mini', temperature: 0 }),
returnMessages: true,
memoryKey: 'history'
});
// After many turns, instead of storing every message, memory contains:
// "The user's name is Alex. They work at Acme Corp as a frontend developer.
// They asked about React performance optimization. The assistant recommended
// useMemo and React.memo for expensive computations..."
// This might be 100 tokens instead of 5,000 tokens of raw conversation history.
Tradeoff: Summary memory costs an extra LLM call per turn (to update the summary) and loses exact details. The user said "I love React 19" but the summary might only say "The user likes React."
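The update loop can be sketched without an LLM by stubbing the summarizer. In the real class, the `summarize` step below is an LLM call that folds each new exchange into the running summary; the function names here are illustrative, not LangChain APIs.

```typescript
type Turn = { input: string; output: string };

// In ConversationSummaryMemory this step is an LLM call; here it is stubbed
// so the update loop itself is visible.
function updateSummary(
  summary: string,
  turn: Turn,
  summarize: (prev: string, t: Turn) => string
): string {
  return summarize(summary, turn);
}

// Stub summarizer: appends a one-line note per turn instead of calling a model.
const stubSummarize = (prev: string, t: Turn) =>
  `${prev} User said: "${t.input}".`.trim();

let summary = '';
summary = updateSummary(summary, { input: 'My name is Alex', output: 'Hi Alex!' }, stubSummarize);
summary = updateSummary(summary, { input: 'I work at Acme', output: 'Nice!' }, stubSummarize);
console.log(summary);
// 'User said: "My name is Alex". User said: "I work at Acme".'
```

Note that the summary stays a single compact string no matter how many turns accumulate, which is exactly the token-saving property described above.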
BufferWindowMemory — sliding window
Keeps only the last N exchanges. Simple and predictable.
import { BufferWindowMemory } from 'langchain/memory';
const memory = new BufferWindowMemory({
k: 5, // Keep last 5 exchanges (10 messages: 5 user + 5 assistant)
returnMessages: true,
memoryKey: 'history'
});
// Turn 1-5: all stored
// Turn 6: turn 1 is dropped, turns 2-6 stored
// Turn 7: turns 1-2 dropped, turns 3-7 stored
// etc.
Tradeoff: Hard cutoff — the model completely forgets anything beyond the window. No gradual degradation.
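The window itself is just array slicing. A sketch of the eviction rule (illustrative, not the library's internals):

```typescript
// Keep only the last k exchanges; each exchange is 2 messages (user + assistant).
function windowed<T>(messages: T[], k: number): T[] {
  return messages.slice(-k * 2);
}

const transcript = ['u1', 'a1', 'u2', 'a2', 'u3', 'a3', 'u4', 'a4'];
console.log(windowed(transcript, 2));
// ['u3', 'a3', 'u4', 'a4']: turns 1 and 2 are gone entirely
```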
VectorStoreMemory — semantic retrieval of past conversations
Stores conversation turns in a vector database and retrieves the most relevant past exchanges for the current query. This is the most sophisticated approach.
import { VectorStoreRetrieverMemory } from 'langchain/memory';
import { MemoryVectorStore } from 'langchain/vectorstores/memory';
import { OpenAIEmbeddings } from '@langchain/openai';
const vectorStore = new MemoryVectorStore(new OpenAIEmbeddings());
const memory = new VectorStoreRetrieverMemory({
vectorStoreRetriever: vectorStore.asRetriever(3), // Retrieve top 3 relevant memories
memoryKey: 'history'
});
// Save some context
await memory.saveContext(
{ input: 'My favorite programming language is TypeScript' },
{ output: 'TypeScript is a great choice! The type system helps catch bugs early.' }
);
await memory.saveContext(
{ input: 'I had pizza for lunch' },
{ output: 'Pizza is always a solid choice!' }
);
await memory.saveContext(
{ input: 'I am building a React dashboard' },
{ output: 'React dashboards benefit from good state management...' }
);
// Query: "What language should I use for my project?"
// VectorStoreMemory retrieves the TypeScript and React exchanges
// (semantically relevant) but NOT the pizza exchange (irrelevant)
const relevantMemories = await memory.loadMemoryVariables({
input: 'What language should I use for my project?'
});
Tradeoff: Extra latency for embedding + retrieval per turn. May miss contextually important but semantically distant information.
Memory comparison table
| Memory Type | Token Usage | Recall Accuracy | Extra Cost | Best For |
|---|---|---|---|---|
| BufferMemory | Grows linearly | Perfect (everything stored) | None | Short conversations (< 20 turns) |
| BufferWindowMemory | Fixed (k * 2 messages) | Recent only | None | Chatbots with long sessions |
| ConversationSummaryMemory | Grows slowly | Approximate | 1 LLM call per turn | Long conversations needing gist |
| VectorStoreMemory | Fixed retrieval size | Semantic (may miss things) | Embedding per turn | Conversations spanning many topics |
4. Adding Memory to Chains
Here is a complete chatbot implementation with memory:
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
import { BufferMemory } from 'langchain/memory';
import { RunnableWithMessageHistory } from '@langchain/core/runnables';
import { ChatMessageHistory } from 'langchain/stores/message/in_memory';
// Store for multiple conversation sessions
const messageHistories = {};
function getMessageHistory(sessionId) {
if (!messageHistories[sessionId]) {
messageHistories[sessionId] = new ChatMessageHistory();
}
return messageHistories[sessionId];
}
const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0.7 });
const parser = new StringOutputParser();
const prompt = ChatPromptTemplate.fromMessages([
['system', 'You are a helpful assistant. You remember everything the user tells you.'],
new MessagesPlaceholder('history'),
['user', '{input}']
]);
const chain = prompt.pipe(model).pipe(parser);
// Wrap the chain with message history management
const chainWithHistory = new RunnableWithMessageHistory({
runnable: chain,
getMessageHistory: (sessionId) => getMessageHistory(sessionId),
inputMessagesKey: 'input',
historyMessagesKey: 'history'
});
// Conversation with session tracking
const config = { configurable: { sessionId: 'user-123' } };
const r1 = await chainWithHistory.invoke(
{ input: 'Hi! My name is Jordan and I am learning TypeScript.' },
config
);
console.log(r1);
// "Hello Jordan! That's great that you're learning TypeScript..."
const r2 = await chainWithHistory.invoke(
{ input: 'What am I learning?' },
config
);
console.log(r2);
// "You're learning TypeScript!"
// Different session — separate memory
const config2 = { configurable: { sessionId: 'user-456' } };
const r3 = await chainWithHistory.invoke(
{ input: 'What is my name?' },
config2
);
console.log(r3);
// "I don't know your name yet! What should I call you?"
5. Persistent Memory with Databases
In-memory storage is lost when the process restarts. For production, you need persistent storage.
Using Redis for message history
import { RedisChatMessageHistory } from '@langchain/redis';
function getMessageHistory(sessionId) {
return new RedisChatMessageHistory({
sessionId: sessionId,
url: process.env.REDIS_URL || 'redis://localhost:6379'
});
}
// Now conversations persist across server restarts
const chainWithHistory = new RunnableWithMessageHistory({
runnable: chain,
getMessageHistory: getMessageHistory,
inputMessagesKey: 'input',
historyMessagesKey: 'history'
});
Using a SQL database
// Conceptual example — store messages in PostgreSQL
import { BaseListChatMessageHistory } from '@langchain/core/chat_history';
import { HumanMessage, AIMessage } from '@langchain/core/messages';
class PostgresChatHistory extends BaseListChatMessageHistory {
constructor(sessionId, pool) {
super();
this.sessionId = sessionId;
this.pool = pool;
}
async getMessages() {
const result = await this.pool.query(
'SELECT role, content FROM messages WHERE session_id = $1 ORDER BY created_at',
[this.sessionId]
);
return result.rows.map(row =>
row.role === 'human'
? new HumanMessage(row.content)
: new AIMessage(row.content)
);
}
async addMessage(message) {
const role = message._getType() === 'human' ? 'human' : 'ai';
await this.pool.query(
'INSERT INTO messages (session_id, role, content, created_at) VALUES ($1, $2, $3, NOW())',
[this.sessionId, role, message.content]
);
}
async clear() {
await this.pool.query(
'DELETE FROM messages WHERE session_id = $1',
[this.sessionId]
);
}
}
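The class above assumes a messages table exists. A schema along these lines would support it — the column names match the queries in the class, while the exact types and index are assumptions:

```sql
-- Hypothetical schema backing PostgresChatHistory; column names match the
-- queries in the class above, the rest (types, index) are assumptions.
CREATE TABLE messages (
  id         SERIAL PRIMARY KEY,
  session_id TEXT        NOT NULL,
  role       TEXT        NOT NULL CHECK (role IN ('human', 'ai')),
  content    TEXT        NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Reads are always "all messages for one session, in order", so index for that.
CREATE INDEX idx_messages_session ON messages (session_id, created_at);
```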
Production memory architecture
+----------------------------------------------------------+
| CLIENT REQUEST |
| { sessionId: "user-123", input: "What did I say?" } |
+------------------+---------------------------------------+
|
v
+----------------------------------------------------------+
| MEMORY LAYER |
| |
| 1. Load history from persistent store (Redis/Postgres) |
| 2. If history too long: |
| a. Summarize old messages (ConversationSummaryMemory) |
| b. OR keep last N messages (WindowMemory) |
| c. OR retrieve relevant messages (VectorStoreMemory) |
| 3. Inject history into prompt template |
+------------------+---------------------------------------+
|
v
+----------------------------------------------------------+
| CHAIN: prompt -> model -> parser |
+------------------+---------------------------------------+
|
v
+----------------------------------------------------------+
| SAVE: Store new user message + AI response |
| back to persistent store |
+----------------------------------------------------------+
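Step 2 of the diagram, deciding what to do when the loaded history is too long, can be sketched as a pure function. The roughly-4-characters-per-token estimate and the fall-back-to-window strategy are assumptions; a production system would use the model's tokenizer and might summarize instead:

```typescript
type Msg = { role: string; content: string };

// Rough token estimate (~4 characters per token); a real system would use
// the model's tokenizer.
function estimateTokens(messages: Msg[]): number {
  return Math.ceil(messages.reduce((n, m) => n + m.content.length, 0) / 4);
}

// If the history fits the budget, keep it all; otherwise fall back to a
// window of the most recent messages. (Summarization or vector retrieval
// could be substituted at the same decision point.)
function fitHistory(messages: Msg[], budget: number, keepLast = 6): Msg[] {
  if (estimateTokens(messages) <= budget) return messages;
  return messages.slice(-keepLast);
}

const history: Msg[] = Array.from({ length: 40 }, (_, i) => ({
  role: i % 2 ? 'ai' : 'human',
  content: `message number ${i} with some filler text to take up space`
}));
console.log(fitHistory(history, 100).length); // 6: over budget, trimmed to window
console.log(fitHistory(history.slice(0, 2), 100).length); // 2: fits, kept whole
```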
6. Combining Tools and Memory
The real power emerges when you combine tools and memory — an agent that can take actions AND remember previous interactions:
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate, MessagesPlaceholder } from '@langchain/core/prompts';
import { DynamicStructuredTool } from '@langchain/core/tools';
import { z } from 'zod';
import { AgentExecutor, createOpenAIToolsAgent } from 'langchain/agents';
import { BufferMemory } from 'langchain/memory';
// Define tools
const lookupTool = new DynamicStructuredTool({
name: 'lookup_user_orders',
description: 'Look up all orders for a user by their email address.',
schema: z.object({
email: z.string().email().describe('The user email address')
}),
func: async ({ email }) => {
// Simulated database lookup
const orders = {
'alex@example.com': [
{ id: 'ORD-001', status: 'shipped', total: 59.99 },
{ id: 'ORD-002', status: 'delivered', total: 124.50 }
]
};
return JSON.stringify(orders[email] || []);
}
});
const refundTool = new DynamicStructuredTool({
name: 'process_refund',
description: 'Process a refund for an order. Only use when the user explicitly requests a refund.',
schema: z.object({
orderId: z.string().describe('The order ID to refund, e.g. ORD-001'),
reason: z.string().describe('The reason for the refund')
}),
func: async ({ orderId, reason }) => {
return JSON.stringify({
success: true,
refundId: `REF-${Date.now()}`,
message: `Refund initiated for order ${orderId}. Reason: ${reason}. Processing time: 3-5 business days.`
});
}
});
// Setup
const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0 });
const tools = [lookupTool, refundTool];
const prompt = ChatPromptTemplate.fromMessages([
['system', `You are a helpful customer service agent. Be polite and helpful.
When looking up orders, ask for the user's email if not provided.
Only process refunds when the user explicitly asks for one.`],
new MessagesPlaceholder('history'),
['user', '{input}'],
new MessagesPlaceholder('agent_scratchpad')
]);
const agent = await createOpenAIToolsAgent({ llm: model, tools, prompt });
const memory = new BufferMemory({
returnMessages: true,
memoryKey: 'history'
});
const executor = new AgentExecutor({
agent,
tools,
memory,
verbose: true // Log agent reasoning
});
// Multi-turn conversation with tools and memory
const r1 = await executor.invoke({
input: 'Hi, my email is alex@example.com. Can you check my orders?'
});
console.log(r1.output);
// "I found 2 orders: ORD-001 (shipped, $59.99) and ORD-002 (delivered, $124.50)."
const r2 = await executor.invoke({
input: 'Can I get a refund on the first one?'
});
console.log(r2.output);
// Agent remembers ORD-001 from the previous turn
// "I've processed a refund for ORD-001. Refund REF-... initiated. 3-5 business days."
const r3 = await executor.invoke({
input: 'Thanks! What was my total spending across both orders?'
});
console.log(r3.output);
// Agent remembers the order details: "$59.99 + $124.50 = $184.49 total."
7. Key Takeaways
- Tools extend LLM capabilities beyond text generation — search, calculate, query databases, call APIs. The tool's description is what the LLM reads to decide when to use it, so make it precise.
- Custom tools are easy to create — use DynamicTool for simple string input, DynamicStructuredTool or tool() for typed input with Zod schemas.
- Memory makes conversations continuous — without it, every API call is stateless. Choose the right memory type based on conversation length and requirements.
- BufferMemory stores everything (simple but token-expensive), WindowMemory keeps last N turns (predictable but forgetful), SummaryMemory compresses history (compact but approximate), VectorStoreMemory retrieves semantically (smart but adds latency).
- Persistent memory requires external storage — Redis, PostgreSQL, or any database. In-memory storage is lost on restart.
- Tools + memory together enable powerful agents that can take actions and maintain context across a multi-turn conversation.
Explain-It Challenge
- A product manager asks "why does the chatbot forget what I said 5 minutes ago?" Explain the problem and propose a solution using LangChain memory.
- You are building a customer support bot that needs to look up orders, check inventory, and process refunds. Design the tool definitions (name, description, schema) for each tool.
- Compare BufferMemory and ConversationSummaryMemory for a chatbot that handles 100+ turn conversations. Which would you choose and why?
Navigation: <- 4.17.b Chains and Prompt Templates | 4.17.d — Working with Agents ->