Episode 4 — Generative AI Engineering / 4.17 — LangChain Practical

4.17 — LangChain Practical: Quick Revision

Compact cheat sheet. Print-friendly.

How to use this material (instructions)

  1. Skim before labs or interviews.
  2. Drill gaps -- reopen README.md then 4.17.a...4.17.e.
  3. Practice -- 4.17-Exercise-Questions.md.
  4. Polish answers -- 4.17-Interview-Questions.md.

Core vocabulary

Term                   One-liner
LangChain              Open-source framework for building LLM-powered apps using composable components
Chain                  Fixed pipeline of components: prompt -> model -> parser (steps defined at dev time)
Prompt template        Reusable prompt with {variables} filled at runtime; validates required inputs
ChatPromptTemplate     Prompt template producing a messages array (system + user + history); used 90% of the time
MessagesPlaceholder    Slot in a template for injecting conversation history or agent scratchpad
Output parser          Extracts structured data from model response (StringOutputParser, JsonOutputParser)
LCEL                   LangChain Expression Language -- pipe-based composition of Runnables
Runnable               Any component implementing .invoke(), .stream(), .batch(), .pipe()
Tool                   Function an LLM can decide to call; has name, description, schema, and implementation
Memory                 Module that stores and injects conversation history so chains remember past turns
Agent                  LLM-driven loop that dynamically decides which tools to call at runtime
AgentExecutor          Runtime managing the think-act-observe loop, tool execution, and error handling
agent_scratchpad       MessagesPlaceholder accumulating tool calls + results across agent iterations
RunnableSequence       Runs steps in order; output of step N becomes input of step N+1
RunnableParallel       Runs multiple chains simultaneously on the same input; returns a dict of results
RunnableLambda         Wraps a plain function as a Runnable so it can be piped
RunnablePassthrough    Passes input through unchanged; used to preserve original input alongside processed output
RunnableBranch         Routes input to different chains based on conditions
LangSmith              Observability platform (not a library); traces every chain step, latency, tokens, cost
LangGraph              Framework for stateful, multi-step agent workflows as directed graphs
withFallbacks          Tries primary model/chain; if it fails, automatically tries alternatives
withStructuredOutput   Binds a Zod schema to a model so it returns validated structured data

LangChain architecture overview

+----------------------------------------------------------------------+
|  YOUR APPLICATION                                                     |
|                                                                       |
|  User Input -> PromptTemplate -> Model -> OutputParser -> Result      |
|                                                                       |
|  +----------------+  +-----------------+  +------------------+        |
|  |  TOOLS         |  |  MEMORY         |  |  AGENTS          |        |
|  |  - Search      |  |  - Buffer       |  |  - OpenAI Tools  |        |
|  |  - Calculator  |  |  - Window       |  |  - ReAct         |        |
|  |  - Custom      |  |  - Summary      |  |  - AgentExecutor |        |
|  |  - DynamicTool |  |  - VectorStore  |  |                  |        |
|  +----------------+  +-----------------+  +------------------+        |
+----------------------------------------------------------------------+
|  LCEL:  prompt.pipe(model).pipe(parser)                               |
|  Everything is a Runnable -> invoke / stream / batch / pipe           |
+----------------------------------------------------------------------+
|  ECOSYSTEM                                                            |
|  langchain (core)  |  langsmith (observability)  |  langgraph (graphs)|
+----------------------------------------------------------------------+

Chains and prompt templates patterns

BASIC THREE-STAGE PIPELINE (the fundamental pattern):

  ChatPromptTemplate  ->  ChatOpenAI  ->  StringOutputParser
  {variables}             messages         AIMessage -> string

// Minimal chain
const chain = prompt.pipe(model).pipe(parser);
const result = await chain.invoke({ topic: 'closures' });

Prompt template types

Type                                Produces                When to use
PromptTemplate                      Single string           Rarely -- legacy or simple completions
ChatPromptTemplate.fromMessages()   Messages array          Almost always -- modern chat APIs
MessagesPlaceholder                 Injected message list   Conversation history or agent scratchpad

Output parser types

Parser                       Output             Use case
StringOutputParser           string             Plain text responses
JsonOutputParser             object             JSON extraction (no schema validation)
.withStructuredOutput(Zod)   Validated object   Typed, validated structured data

Sequential chain pattern

const chain = RunnableSequence.from([
  // Map step: run summarizeChain for `text`, carry `language` through unchanged
  { text: summarizeChain, language: (input) => input.language },
  translateChain  // receives { text, language }
]);

Tools and memory types

Tool anatomy

Tool:
  name:        string       "get_order_status"
  description: string       LLM reads this to decide WHEN to use it (CRITICAL)
  schema:      ZodSchema    Input validation (LLM sees field descriptions)
  func:        async fn     The actual implementation

Three ways to create tools

DynamicTool               -- simple string input
DynamicStructuredTool     -- typed input with Zod schema
tool()                    -- modern functional approach with Zod schema

Tool design rules

1. Descriptive names:      search_product_catalog > search
2. Detailed descriptions:  include WHEN to use, WHAT input format
3. Zod schemas:            .describe() on every field
4. Return strings:         always JSON.stringify() results
5. Handle errors:          return error string, don't throw
6. One tool per action:    don't bundle multiple actions
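
The rules above can be sketched in plain TypeScript. This is NOT the real LangChain tool() helper; the Zod schema is replaced by a plain descriptor object, and the order database is made up, so the example runs standalone:

```typescript
// Dependency-free sketch of the tool anatomy and design rules above.
interface ToolSketch {
  name: string;                            // rule 1: descriptive verb_noun
  description: string;                     // rule 2: tells the LLM WHEN to use it
  schema: Record<string, string>;          // field -> description (stand-in for Zod)
  func: (input: Record<string, unknown>) => Promise<string>;
}

// Fake order database (illustrative data only).
const orders: Record<string, { status: string }> = {
  "A-1001": { status: "shipped" },
};

const getOrderStatus: ToolSketch = {
  name: "get_order_status",
  description:
    "Look up the shipping status of an order. Use when the user asks " +
    "where an order is. Input: { orderId: string like 'A-1001' }.",
  schema: { orderId: "The order ID, e.g. 'A-1001'" },
  func: async (input) => {
    try {
      const order = orders[String(input.orderId)];
      if (!order) return JSON.stringify({ error: `Unknown order ${input.orderId}` });
      return JSON.stringify(order);        // rule 4: always return a string
    } catch (err) {
      return `Tool error: ${String(err)}`; // rule 5: report errors, never throw
    }
  },
};

console.log(await getOrderStatus.func({ orderId: "A-1001" }));  // -> {"status":"shipped"}
```

Note that the miss case also returns a string: the agent can read the error and decide what to do next instead of crashing.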

Memory comparison

Memory               Storage            Token usage       Recall        Extra cost        Best for
BufferMemory         Every message      Grows linearly    Perfect       None              Short chats (< 20 turns)
BufferWindowMemory   Last N exchanges   Fixed             Recent only   None              Long sessions, recent context
SummaryMemory        Running summary    Grows slowly      Approximate   1 LLM call/turn   Long chats needing gist
VectorStoreMemory    Vector DB          Fixed retrieval   Semantic      Embedding/turn    Multi-topic conversations

Memory decision flowchart

Conversation < 20 turns?
  YES -> BufferMemory

Need exact recent recall?
  YES -> BufferWindowMemory (k = 5-15)

Need gist of entire conversation?
  YES -> ConversationSummaryMemory

Need recall of specific topics from long history?
  YES -> VectorStoreMemory

Production?
  -> Back memory with a persistent store (Redis / PostgreSQL)
  -> Use session IDs to isolate users
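
The windowed option is the easiest to picture. A dependency-free sketch of BufferWindowMemory's behaviour (not the real LangChain class): keep only the last k exchanges so token usage stays fixed.

```typescript
// Sketch of window memory: evict the oldest exchange once capacity k is hit.
type Exchange = { human: string; ai: string };

class WindowMemorySketch {
  private turns: Exchange[] = [];
  constructor(private k: number) {}        // k = how many exchanges to keep

  save(human: string, ai: string): void {
    this.turns.push({ human, ai });
    if (this.turns.length > this.k) this.turns.shift();  // drop oldest
  }

  // What would be injected into the prompt's MessagesPlaceholder('history').
  load(): Exchange[] {
    return [...this.turns];
  }
}

const memory = new WindowMemorySketch(2);
memory.save("Hi", "Hello!");
memory.save("What is LCEL?", "A composition syntax.");
memory.save("And agents?", "LLM-driven tool loops.");
console.log(memory.load().length);  // -> 2: the first exchange was evicted
```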

Agent loop flow

1. THINK    LLM receives question + tool descriptions + agent_scratchpad
            Decides: call a tool, or produce final answer

2. ACT      AgentExecutor runs the tool with LLM-specified arguments

3. OBSERVE  Tool result appended to agent_scratchpad

4. REPEAT   Back to step 1 with updated scratchpad

5. FINISH   LLM produces final answer (or maxIterations hit)

Iteration flow (example):

  scratchpad: []
  -> LLM: "I need to search"  -> search("Tokyo population")
  
  scratchpad: [search call + result]
  -> LLM: "Now I need to calculate"  -> calculator("14000000 / 13")
  
  scratchpad: [search call + result, calc call + result]
  -> LLM: "I have enough info"  -> FINAL ANSWER
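
The same iteration flow can be run as code with a scripted "LLM". Everything here (plannerScript, the tool outputs, the wording of the final answer) is made up for illustration; a real AgentExecutor would call a model at the THINK step instead of reading a script.

```typescript
// Mock think-act-observe loop matching the example above.
type Step = { tool: string; input: string } | { final: string };

const tools: Record<string, (input: string) => string> = {
  search: () => JSON.stringify({ population: 14_000_000 }),  // fake result
  calculator: (input) => String(eval(input)),                // demo only!
};

// Scripted decisions standing in for the LLM's "THINK" step.
const plannerScript: Step[] = [
  { tool: "search", input: "Tokyo population" },
  { tool: "calculator", input: "14000000 / 13" },
  { final: "About 1.08M people per ward." },
];

function runAgent(maxIterations = 5): string {
  const scratchpad: string[] = [];                  // agent_scratchpad
  for (let i = 0; i < maxIterations; i++) {
    const decision = plannerScript[i];              // THINK (scripted here)
    if ("final" in decision) return decision.final; // FINISH
    const observation = tools[decision.tool](decision.input);               // ACT
    scratchpad.push(`${decision.tool}(${decision.input}) -> ${observation}`); // OBSERVE
  }
  return "Stopped: maxIterations reached";          // safety valve
}

console.log(runAgent());  // -> About 1.08M people per ward.
```

Note the safety valve: with maxIterations set too low the loop exits with a best-effort message instead of spinning forever, which is exactly why the checklist below insists on it.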

Agent types

Type              Provider      Tool calling                                Reliability   Use when
OpenAI Tools      OpenAI only   Native API tool_calls                       Very high     Production with OpenAI
ReAct             Any model     Text parsing (Thought/Action/Observation)   Medium-high   Multi-provider
Structured Chat   Any model     JSON in text                                Medium-high   Multi-input tools

Agent safety checklist

maxIterations:        5-8 (prevent infinite loops)
handleParsingErrors:  true (send parse errors back to model)
earlyStoppingMethod:  'generate' (best-effort answer on limit)
Timeout:              30s wrapper (AbortController)
Tool errors:          return string, never throw
verbose:              true in dev, callbacks in prod

LCEL pipe syntax

Core Runnable protocol

interface Runnable<Input, Output> {
  invoke(input: Input): Promise<Output>;          // single call
  stream(input: Input): AsyncGenerator<Output>;   // token-by-token
  batch(inputs: Input[]): Promise<Output[]>;      // parallel processing
  pipe<Next>(next: Runnable<Output, Next>): Runnable<Input, Next>;  // connect to next step
}
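
A minimal implementation sketch of this protocol (stream() omitted, and not the real @langchain/core class) shows why .pipe() composes -- each step's output feeds the next step's input, and batch() comes for free:

```typescript
// Mini Runnable: just enough to demonstrate invoke / batch / pipe.
class MiniRunnable<In, Out> {
  constructor(private fn: (input: In) => Promise<Out>) {}

  invoke(input: In): Promise<Out> {
    return this.fn(input);
  }

  batch(inputs: In[]): Promise<Out[]> {
    return Promise.all(inputs.map((i) => this.invoke(i)));  // parallel
  }

  // Composition: output of this step becomes input of the next.
  pipe<Next>(next: MiniRunnable<Out, Next>): MiniRunnable<In, Next> {
    return new MiniRunnable(async (input) => next.invoke(await this.invoke(input)));
  }
}

// Hypothetical stand-ins for prompt -> model -> parser.
const prompt = new MiniRunnable(async (x: { topic: string }) => `Explain ${x.topic}`);
const model  = new MiniRunnable(async (p: string) => `${p}: <answer>`);
const parser = new MiniRunnable(async (m: string) => m.trim());

const chain = prompt.pipe(model).pipe(parser);
console.log(await chain.invoke({ topic: "closures" }));  // -> Explain closures: <answer>
```

Because every component speaks the same four-method protocol, any Runnable can be piped into any other -- that is the whole trick behind LCEL.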

LCEL building blocks

Block                            Purpose                           Example
.pipe()                          Connect A -> B sequentially       prompt.pipe(model).pipe(parser)
RunnableSequence.from([...])     Explicit sequential steps         Steps with inline transforms
RunnableParallel.from({...})     Run branches simultaneously       { sentiment: chainA, topic: chainB }
RunnableLambda.from(fn)          Wrap plain function as Runnable   Preprocessing, postprocessing
RunnablePassthrough              Forward input unchanged           Keep original alongside processed
RunnablePassthrough.assign({})   Add new keys to input             { wordCount: (x) => x.text.split(' ').length }
RunnableBranch                   Route by condition                Code chain vs math chain vs general
.withFallbacks({...})            Try alternatives on failure       GPT-4o -> GPT-4o-mini -> Claude
.withRetry({...})                Retry on transient errors         { stopAfterAttempt: 3 }

Common LCEL patterns

// 1. Basic chain
const chain = prompt.pipe(model).pipe(parser);

// 2. Parallel analysis
const analysis = RunnableParallel.from({
  sentiment: sentimentChain,
  topic: topicChain,
  language: languageChain
});

// 3. Sequential with transforms
const pipeline = RunnableSequence.from([
  (input) => ({ ...input, text: input.text.trim() }),   // preprocess
  prompt.pipe(model).pipe(parser),                       // LLM call
  (output) => ({ result: output, timestamp: Date.now() }) // postprocess
]);

// 4. Fallback chain
const reliable = primaryModel.withFallbacks({
  fallbacks: [cheaperModel, differentProvider]
});

// 5. Routing
const router = RunnableBranch.from([
  [(x) => x.type === 'code', codeChain],
  [(x) => x.type === 'math', mathChain],
  generalChain  // default: last element, no condition
]);

// 6. Streaming (automatic on any LCEL chain)
const stream = await chain.stream({ input: 'Hello' });
for await (const chunk of stream) { process.stdout.write(chunk); }
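
Pattern 4 is worth demystifying. A dependency-free sketch of what .withFallbacks() does under the hood (not the real API; flakyPrimary and fallbackModel are hypothetical models): try each candidate in order and return the first success.

```typescript
// Fallback logic: first candidate that resolves wins; rethrow if all fail.
async function withFallbacksSketch<In, Out>(
  candidates: Array<(input: In) => Promise<Out>>,
  input: In,
): Promise<Out> {
  let lastError: unknown;
  for (const candidate of candidates) {
    try {
      return await candidate(input);   // first success wins
    } catch (err) {
      lastError = err;                 // remember, try the next candidate
    }
  }
  throw lastError;                     // every candidate failed
}

// Hypothetical models: the primary always fails, the fallback answers.
const flakyPrimary = async (_: string): Promise<string> => {
  throw new Error("rate limited");
};
const fallbackModel = async (q: string): Promise<string> => `fallback: ${q}`;

console.log(await withFallbacksSketch([flakyPrimary, fallbackModel], "hi"));
// -> fallback: hi
```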

When to use LangChain vs raw SDK calls

USE LANGCHAIN:
  - RAG pipelines (loaders + splitters + vector stores + retrieval)
  - Agent systems (tool calling, reasoning loops)
  - Multi-provider support (swap OpenAI <-> Anthropic in one line)
  - Rapid prototyping (working demo in hours)
  - LangSmith observability needed
  - Team projects with shared abstractions

USE RAW SDK:
  - Single API call, one prompt, one model
  - Latency-critical paths (no abstraction overhead)
  - Minimal dependencies (serverless, edge functions)
  - Full control over streaming/retry/error handling
  - Learning purposes (understand what LangChain abstracts)

HYBRID (common in production):
  - LangChain for complex parts (RAG, agents)
  - Raw SDK for simple, latency-critical parts (classification, quick completions)

Common gotchas

Using legacy LLMChain / SequentialChain
  Why:  deprecated; no streaming or batching
  Fix:  use LCEL pipe syntax

Vague tool descriptions
  Why:  agent picks the wrong tool
  Fix:  write precise descriptions with examples and input format

BufferMemory in long conversations
  Why:  exceeds the context window
  Fix:  switch to WindowMemory, SummaryMemory, or VectorStoreMemory

Throwing errors inside tools
  Why:  crashes the agent loop
  Fix:  return the error as a string so the agent can recover

Forgetting MessagesPlaceholder('history')
  Why:  memory is injected but never reaches the prompt
  Fix:  always include the placeholder in your template

Forgetting MessagesPlaceholder('agent_scratchpad')
  Why:  agent cannot see previous tool results
  Fix:  required in every agent prompt template

No maxIterations on AgentExecutor
  Why:  infinite loops burn tokens and time
  Fix:  always set maxIterations: 5-8

Expecting streaming from JsonOutputParser
  Why:  JSON must be complete to parse
  Fix:  stream with StringOutputParser, parse after

Not enabling handleParsingErrors
  Why:  one bad model output kills the agent
  Fix:  set handleParsingErrors: true

In-memory storage in production
  Why:  lost on server restart
  Fix:  use Redis, PostgreSQL, or another persistent store

Ignoring LCEL .batch()
  Why:  processes N items sequentially
  Fix:  use chain.batch(items, { maxConcurrency: 5 })

Mixing up @langchain/core vs langchain imports
  Why:  wrong import path, subtle bugs
  Fix:  core interfaces from @langchain/core; chains/agents from langchain
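
The .batch() gotcha is easy to see in a hand-rolled sketch (a stand-in for chain.batch(items, { maxConcurrency }), not the real implementation): a small worker pool processes N inputs maxConcurrency at a time instead of one by one, while preserving input order.

```typescript
// Bounded-concurrency batch: maxConcurrency workers pull from a shared index.
async function batchSketch<In, Out>(
  run: (input: In) => Promise<Out>,
  inputs: In[],
  maxConcurrency: number,
): Promise<Out[]> {
  const results: Out[] = new Array(inputs.length);
  let next = 0;
  // Each worker claims the next unprocessed index until inputs run out.
  // (Safe without locks: `next++` runs synchronously between awaits.)
  async function worker(): Promise<void> {
    while (next < inputs.length) {
      const i = next++;
      results[i] = await run(inputs[i]);
    }
  }
  await Promise.all(Array.from({ length: maxConcurrency }, worker));
  return results;  // same order as inputs
}

const double = async (n: number) => n * 2;
console.log(await batchSketch(double, [1, 2, 3, 4, 5], 2));  // -> [2, 4, 6, 8, 10]
```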

Package map

@langchain/core        Core interfaces: prompts, parsers, runnables, messages
langchain              Framework: chains, agents, memory, document loaders
@langchain/openai      OpenAI models (ChatOpenAI)
@langchain/anthropic   Anthropic models (ChatAnthropic)
@langchain/community   Community tools, vector stores, integrations
LangSmith              Observability platform (env vars, no npm install)
LangGraph              Complex stateful agent workflows (separate package)

Version history (one-glance)

v0.0.x (2023)      LLMChain, SequentialChain             DEPRECATED
v0.1.x (2023-24)   LCEL introduced, legacy still works   TRANSITIONAL
v0.2.x (2024)      LCEL primary, legacy deprecated       STABLE
v0.3.x (2024-25)   Modular packages, cleaner imports     LATEST

KEY RULE: if you see LLMChain or SequentialChain in a tutorial, it's legacy.
          Always use:  prompt.pipe(model).pipe(parser)

Quick mental model

LangChain = composable building blocks for LLM apps

  Prompt Template  -- format the input
  Model            -- call the LLM
  Output Parser    -- extract structured output
  Tool             -- give the LLM actions (search, calculate, API calls)
  Memory           -- make conversations stateful
  Agent            -- let the LLM decide which tools to call
  LCEL             -- pipe everything together: invoke / stream / batch for free

Decision:
  Fixed steps?       -> Chain   (prompt | model | parser)
  Dynamic steps?     -> Agent   (think-act-observe loop)
  Simple API call?   -> Raw SDK (no framework needed)
  Complex workflow?  -> LangGraph (directed graph of agents)

End of 4.17 quick revision.