Episode 4 — Generative AI Engineering / 4.17 — LangChain Practical

4.17 — LangChain Practical: Quick Revision

Compact cheat sheet. Print-friendly.

How to use this material (instructions)

  1. Skim before labs or interviews.
  2. Drill gaps -- reopen README.md then 4.17.a...4.17.e.
  3. Practice -- 4.17-Exercise-Questions.md.
  4. Polish answers -- 4.17-Interview-Questions.md.

Core vocabulary

Term                   One-liner
LangChain              Open-source framework for building LLM-powered apps using composable components
Chain                  Fixed pipeline of components: prompt -> model -> parser (steps defined at dev time)
Prompt template        Reusable prompt with {variables} filled at runtime; validates required inputs
ChatPromptTemplate     Prompt template producing a messages array (system + user + history); used 90% of the time
MessagesPlaceholder    Slot in a template for injecting conversation history or agent scratchpad
Output parser          Extracts structured data from model response (StringOutputParser, JsonOutputParser)
LCEL                   LangChain Expression Language -- pipe-based composition of Runnables
Runnable               Any component implementing .invoke(), .stream(), .batch(), .pipe()
Tool                   Function an LLM can decide to call; has name, description, schema, and implementation
Memory                 Module that stores and injects conversation history so chains remember past turns
Agent                  LLM-driven loop that dynamically decides which tools to call at runtime
AgentExecutor          Runtime managing the think-act-observe loop, tool execution, and error handling
agent_scratchpad       MessagesPlaceholder accumulating tool calls + results across agent iterations
RunnableSequence       Runs steps in order; output of step N becomes input of step N+1
RunnableParallel       Runs multiple chains simultaneously on the same input; returns a dict of results
RunnableLambda         Wraps a plain function as a Runnable so it can be piped
RunnablePassthrough    Passes input through unchanged; used to preserve original input alongside processed output
RunnableBranch         Routes input to different chains based on conditions
LangSmith              Observability platform (not a library); traces every chain step, latency, tokens, cost
LangGraph              Framework for stateful, multi-step agent workflows as directed graphs
withFallbacks          Tries primary model/chain; if it fails, automatically tries alternatives
withStructuredOutput   Binds a Zod schema to a model so it returns validated structured data

LangChain architecture overview

+----------------------------------------------------------------------+
|  YOUR APPLICATION                                                     |
|                                                                       |
|  User Input -> PromptTemplate -> Model -> OutputParser -> Result      |
|                                                                       |
|  +----------------+  +-----------------+  +------------------+        |
|  |  TOOLS         |  |  MEMORY         |  |  AGENTS          |        |
|  |  - Search      |  |  - Buffer       |  |  - OpenAI Tools  |        |
|  |  - Calculator  |  |  - Window       |  |  - ReAct         |        |
|  |  - Custom      |  |  - Summary      |  |  - AgentExecutor |        |
|  |  - DynamicTool |  |  - VectorStore  |  |                  |        |
|  +----------------+  +-----------------+  +------------------+        |
+----------------------------------------------------------------------+
|  LCEL:  prompt.pipe(model).pipe(parser)                               |
|  Everything is a Runnable -> invoke / stream / batch / pipe           |
+----------------------------------------------------------------------+
|  ECOSYSTEM                                                            |
|  langchain (core)  |  langsmith (observability)  |  langgraph (graphs)|
+----------------------------------------------------------------------+

Chains and prompt templates patterns

BASIC THREE-STAGE PIPELINE (the fundamental pattern):

  ChatPromptTemplate  ->  ChatOpenAI  ->  StringOutputParser
  {variables}             messages         AIMessage -> string

// Minimal chain
const chain = prompt.pipe(model).pipe(parser);
const result = await chain.invoke({ topic: 'closures' });

Prompt template types

Type                                Produces                When to use
PromptTemplate                      Single string           Rarely -- legacy or simple completions
ChatPromptTemplate.fromMessages()   Messages array          Almost always -- modern chat APIs
MessagesPlaceholder                 Injected message list   Conversation history or agent scratchpad

Output parser types

Parser                       Output             Use case
StringOutputParser           string             Plain text responses
JsonOutputParser             object             JSON extraction (no schema validation)
.withStructuredOutput(Zod)   Validated object   Typed, validated structured data

Sequential chain pattern

const chain = RunnableSequence.from([
  // Map step: run summarizeChain for `text`, carry `language` through unchanged
  { text: summarizeChain, language: (input) => input.language },
  translateChain  // receives { text, language }
]);

Tools and memory types

Tool anatomy

Tool:
  name:        string       "get_order_status"
  description: string       LLM reads this to decide WHEN to use it (CRITICAL)
  schema:      ZodSchema    Input validation (LLM sees field descriptions)
  func:        async fn     The actual implementation

Three ways to create tools

DynamicTool               -- simple string input
DynamicStructuredTool     -- typed input with Zod schema
tool()                    -- modern functional approach with Zod schema

Tool design rules

1. Descriptive names:      search_product_catalog > search
2. Detailed descriptions:  include WHEN to use, WHAT input format
3. Zod schemas:            .describe() on every field
4. Return strings:         always JSON.stringify() results
5. Handle errors:          return error string, don't throw
6. One tool per action:    don't bundle multiple actions
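
The rules above can be sketched in plain TypeScript. This is NOT the real LangChain tool() helper; the Zod schema is replaced by a plain descriptor object, and the order database is made up, so the example runs standalone:

```typescript
// Dependency-free sketch of the tool anatomy and design rules above.
interface ToolSketch {
  name: string;                            // rule 1: descriptive verb_noun
  description: string;                     // rule 2: tells the LLM WHEN to use it
  schema: Record<string, string>;          // field -> description (stand-in for Zod)
  func: (input: Record<string, unknown>) => Promise<string>;
}

// Fake order database (illustrative data only).
const orders: Record<string, { status: string }> = {
  "A-1001": { status: "shipped" },
};

const getOrderStatus: ToolSketch = {
  name: "get_order_status",
  description:
    "Look up the shipping status of an order. Use when the user asks " +
    "where an order is. Input: { orderId: string like 'A-1001' }.",
  schema: { orderId: "The order ID, e.g. 'A-1001'" },
  func: async (input) => {
    try {
      const order = orders[String(input.orderId)];
      if (!order) return JSON.stringify({ error: `Unknown order ${input.orderId}` });
      return JSON.stringify(order);        // rule 4: always return a string
    } catch (err) {
      return `Tool error: ${String(err)}`; // rule 5: report errors, never throw
    }
  },
};

console.log(await getOrderStatus.func({ orderId: "A-1001" }));  // -> {"status":"shipped"}
```

Note that the miss case also returns a string: the agent can read the error and decide what to do next instead of crashing.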

Memory comparison

Memory               Storage            Token usage       Recall        Extra cost        Best for
BufferMemory         Every message      Grows linearly    Perfect       None              Short chats (< 20 turns)
BufferWindowMemory   Last N exchanges   Fixed             Recent only   None              Long sessions, recent context
SummaryMemory        Running summary    Grows slowly      Approximate   1 LLM call/turn   Long chats needing gist
VectorStoreMemory    Vector DB          Fixed retrieval   Semantic      Embedding/turn    Multi-topic conversations

Memory decision flowchart

Conversation < 20 turns?
  YES -> BufferMemory

Need exact recent recall?
  YES -> BufferWindowMemory (k = 5-15)

Need gist of entire conversation?
  YES -> ConversationSummaryMemory

Need recall of specific topics from long history?
  YES -> VectorStoreMemory

Production?
  -> Back memory with a persistent store (Redis / PostgreSQL)
  -> Use session IDs to isolate users
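
The windowed option is the easiest to picture. A dependency-free sketch of BufferWindowMemory's behaviour (not the real LangChain class): keep only the last k exchanges so token usage stays fixed.

```typescript
// Sketch of window memory: evict the oldest exchange once capacity k is hit.
type Exchange = { human: string; ai: string };

class WindowMemorySketch {
  private turns: Exchange[] = [];
  constructor(private k: number) {}        // k = how many exchanges to keep

  save(human: string, ai: string): void {
    this.turns.push({ human, ai });
    if (this.turns.length > this.k) this.turns.shift();  // drop oldest
  }

  // What would be injected into the prompt's MessagesPlaceholder('history').
  load(): Exchange[] {
    return [...this.turns];
  }
}

const memory = new WindowMemorySketch(2);
memory.save("Hi", "Hello!");
memory.save("What is LCEL?", "A composition syntax.");
memory.save("And agents?", "LLM-driven tool loops.");
console.log(memory.load().length);  // -> 2: the first exchange was evicted
```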

Agent loop flow

1. THINK    LLM receives question + tool descriptions + agent_scratchpad
            Decides: call a tool, or produce final answer

2. ACT      AgentExecutor runs the tool with LLM-specified arguments

3. OBSERVE  Tool result appended to agent_scratchpad

4. REPEAT   Back to step 1 with updated scratchpad

5. FINISH   LLM produces final answer (or maxIterations hit)

Iteration flow (example):

  scratchpad: []
  -> LLM: "I need to search"  -> search("Tokyo population")
  
  scratchpad: [search call + result]
  -> LLM: "Now I need to calculate"  -> calculator("14000000 / 13")
  
  scratchpad: [search call + result, calc call + result]
  -> LLM: "I have enough info"  -> FINAL ANSWER
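
The same iteration flow can be run as code with a scripted "LLM". Everything here (plannerScript, the tool outputs, the wording of the final answer) is made up for illustration; a real AgentExecutor would call a model at the THINK step instead of reading a script.

```typescript
// Mock think-act-observe loop matching the example above.
type Step = { tool: string; input: string } | { final: string };

const tools: Record<string, (input: string) => string> = {
  search: () => JSON.stringify({ population: 14_000_000 }),  // fake result
  calculator: (input) => String(eval(input)),                // demo only!
};

// Scripted decisions standing in for the LLM's "THINK" step.
const plannerScript: Step[] = [
  { tool: "search", input: "Tokyo population" },
  { tool: "calculator", input: "14000000 / 13" },
  { final: "About 1.08M people per ward." },
];

function runAgent(maxIterations = 5): string {
  const scratchpad: string[] = [];                  // agent_scratchpad
  for (let i = 0; i < maxIterations; i++) {
    const decision = plannerScript[i];              // THINK (scripted here)
    if ("final" in decision) return decision.final; // FINISH
    const observation = tools[decision.tool](decision.input);               // ACT
    scratchpad.push(`${decision.tool}(${decision.input}) -> ${observation}`); // OBSERVE
  }
  return "Stopped: maxIterations reached";          // safety valve
}

console.log(runAgent());  // -> About 1.08M people per ward.
```

Note the safety valve: with maxIterations set too low the loop exits with a best-effort message instead of spinning forever, which is exactly why the checklist below insists on it.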

Agent types

Type              Provider      Tool calling                                Reliability   Use when
OpenAI Tools      OpenAI only   Native API tool_calls                       Very high     Production with OpenAI
ReAct             Any model     Text parsing (Thought/Action/Observation)   Medium-high   Multi-provider
Structured Chat   Any model     JSON in text                                Medium-high   Multi-input tools

Agent safety checklist

maxIterations:        5-8 (prevent infinite loops)
handleParsingErrors:  true (send parse errors back to model)
earlyStoppingMethod:  'generate' (best-effort answer on limit)
Timeout:              30s wrapper (AbortController)
Tool errors:          return string, never throw
verbose:              true in dev, callbacks in prod

LCEL pipe syntax

Core Runnable protocol

interface Runnable<Input, Output> {
  invoke(input: Input): Promise<Output>;          // single call
  stream(input: Input): AsyncGenerator<Output>;   // token-by-token
  batch(inputs: Input[]): Promise<Output[]>;      // parallel processing
  pipe<Next>(next: Runnable<Output, Next>): Runnable<Input, Next>;  // connect to next step
}
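
A minimal implementation sketch of this protocol (stream() omitted, and not the real @langchain/core class) shows why .pipe() composes -- each step's output feeds the next step's input, and batch() comes for free:

```typescript
// Mini Runnable: just enough to demonstrate invoke / batch / pipe.
class MiniRunnable<In, Out> {
  constructor(private fn: (input: In) => Promise<Out>) {}

  invoke(input: In): Promise<Out> {
    return this.fn(input);
  }

  batch(inputs: In[]): Promise<Out[]> {
    return Promise.all(inputs.map((i) => this.invoke(i)));  // parallel
  }

  // Composition: output of this step becomes input of the next.
  pipe<Next>(next: MiniRunnable<Out, Next>): MiniRunnable<In, Next> {
    return new MiniRunnable(async (input) => next.invoke(await this.invoke(input)));
  }
}

// Hypothetical stand-ins for prompt -> model -> parser.
const prompt = new MiniRunnable(async (x: { topic: string }) => `Explain ${x.topic}`);
const model  = new MiniRunnable(async (p: string) => `${p}: <answer>`);
const parser = new MiniRunnable(async (m: string) => m.trim());

const chain = prompt.pipe(model).pipe(parser);
console.log(await chain.invoke({ topic: "closures" }));  // -> Explain closures: <answer>
```

Because every component speaks the same four-method protocol, any Runnable can be piped into any other -- that is the whole trick behind LCEL.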

LCEL building blocks

Block                            Purpose                           Example
.pipe()                          Connect A -> B sequentially       prompt.pipe(model).pipe(parser)
RunnableSequence.from([...])     Explicit sequential steps         Steps with inline transforms
RunnableParallel.from({...})     Run branches simultaneously       { sentiment: chainA, topic: chainB }
RunnableLambda.from(fn)          Wrap plain function as Runnable   Preprocessing, postprocessing
RunnablePassthrough              Forward input unchanged           Keep original alongside processed
RunnablePassthrough.assign({})   Add new keys to input             { wordCount: (x) => x.text.split(' ').length }
RunnableBranch                   Route by condition                Code chain vs math chain vs general
.withFallbacks({...})            Try alternatives on failure       GPT-4o -> GPT-4o-mini -> Claude
.withRetry({...})                Retry on transient errors         { stopAfterAttempt: 3 }

Common LCEL patterns

// 1. Basic chain
const chain = prompt.pipe(model).pipe(parser);

// 2. Parallel analysis
const analysis = RunnableParallel.from({
  sentiment: sentimentChain,
  topic: topicChain,
  language: languageChain
});

// 3. Sequential with transforms
const pipeline = RunnableSequence.from([
  (input) => ({ ...input, text: input.text.trim() }),   // preprocess
  prompt.pipe(model).pipe(parser),                       // LLM call
  (output) => ({ result: output, timestamp: Date.now() }) // postprocess
]);

// 4. Fallback chain
const reliable = primaryModel.withFallbacks({
  fallbacks: [cheaperModel, differentProvider]
});

// 5. Routing
const router = RunnableBranch.from([
  [(x) => x.type === 'code', codeChain],
  [(x) => x.type === 'math', mathChain],
  generalChain  // default: last element, no condition
]);

// 6. Streaming (automatic on any LCEL chain)
const stream = await chain.stream({ input: 'Hello' });
for await (const chunk of stream) { process.stdout.write(chunk); }
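
Pattern 4 is worth demystifying. A dependency-free sketch of what .withFallbacks() does under the hood (not the real API; flakyPrimary and fallbackModel are hypothetical models): try each candidate in order and return the first success.

```typescript
// Fallback logic: first candidate that resolves wins; rethrow if all fail.
async function withFallbacksSketch<In, Out>(
  candidates: Array<(input: In) => Promise<Out>>,
  input: In,
): Promise<Out> {
  let lastError: unknown;
  for (const candidate of candidates) {
    try {
      return await candidate(input);   // first success wins
    } catch (err) {
      lastError = err;                 // remember, try the next candidate
    }
  }
  throw lastError;                     // every candidate failed
}

// Hypothetical models: the primary always fails, the fallback answers.
const flakyPrimary = async (_: string): Promise<string> => {
  throw new Error("rate limited");
};
const fallbackModel = async (q: string): Promise<string> => `fallback: ${q}`;

console.log(await withFallbacksSketch([flakyPrimary, fallbackModel], "hi"));
// -> fallback: hi
```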

When to use LangChain vs raw SDK calls

USE LANGCHAIN:
  - RAG pipelines (loaders + splitters + vector stores + retrieval)
  - Agent systems (tool calling, reasoning loops)
  - Multi-provider support (swap OpenAI <-> Anthropic in one line)
  - Rapid prototyping (working demo in hours)
  - LangSmith observability needed
  - Team projects with shared abstractions

USE RAW SDK:
  - Single API call, one prompt, one model
  - Latency-critical paths (no abstraction overhead)
  - Minimal dependencies (serverless, edge functions)
  - Full control over streaming/retry/error handling
  - Learning purposes (understand what LangChain abstracts)

HYBRID (common in production):
  - LangChain for complex parts (RAG, agents)
  - Raw SDK for simple, latency-critical parts (classification, quick completions)

Common gotchas

Using legacy LLMChain / SequentialChain
  Why:  deprecated; no streaming or batching
  Fix:  use LCEL pipe syntax

Vague tool descriptions
  Why:  agent picks the wrong tool
  Fix:  write precise descriptions with examples and input format

BufferMemory in long conversations
  Why:  exceeds the context window
  Fix:  switch to WindowMemory, SummaryMemory, or VectorStoreMemory

Throwing errors inside tools
  Why:  crashes the agent loop
  Fix:  return the error as a string so the agent can recover

Forgetting MessagesPlaceholder('history')
  Why:  memory is injected but never reaches the prompt
  Fix:  always include the placeholder in your template

Forgetting MessagesPlaceholder('agent_scratchpad')
  Why:  agent cannot see previous tool results
  Fix:  required in every agent prompt template

No maxIterations on AgentExecutor
  Why:  infinite loops burn tokens and time
  Fix:  always set maxIterations: 5-8

Expecting streaming from JsonOutputParser
  Why:  JSON must be complete to parse
  Fix:  stream with StringOutputParser, parse after

Not enabling handleParsingErrors
  Why:  one bad model output kills the agent
  Fix:  set handleParsingErrors: true

In-memory storage in production
  Why:  lost on server restart
  Fix:  use Redis, PostgreSQL, or another persistent store

Ignoring LCEL .batch()
  Why:  processes N items sequentially
  Fix:  use chain.batch(items, { maxConcurrency: 5 })

Mixing up @langchain/core vs langchain imports
  Why:  wrong import path, subtle bugs
  Fix:  core interfaces from @langchain/core; chains/agents from langchain
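
The .batch() gotcha is easy to see in a hand-rolled sketch (a stand-in for chain.batch(items, { maxConcurrency }), not the real implementation): a small worker pool processes N inputs maxConcurrency at a time instead of one by one, while preserving input order.

```typescript
// Bounded-concurrency batch: maxConcurrency workers pull from a shared index.
async function batchSketch<In, Out>(
  run: (input: In) => Promise<Out>,
  inputs: In[],
  maxConcurrency: number,
): Promise<Out[]> {
  const results: Out[] = new Array(inputs.length);
  let next = 0;
  // Each worker claims the next unprocessed index until inputs run out.
  // (Safe without locks: `next++` runs synchronously between awaits.)
  async function worker(): Promise<void> {
    while (next < inputs.length) {
      const i = next++;
      results[i] = await run(inputs[i]);
    }
  }
  await Promise.all(Array.from({ length: maxConcurrency }, worker));
  return results;  // same order as inputs
}

const double = async (n: number) => n * 2;
console.log(await batchSketch(double, [1, 2, 3, 4, 5], 2));  // -> [2, 4, 6, 8, 10]
```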

Package map

@langchain/core        Core interfaces: prompts, parsers, runnables, messages
langchain              Framework: chains, agents, memory, document loaders
@langchain/openai      OpenAI models (ChatOpenAI)
@langchain/anthropic   Anthropic models (ChatAnthropic)
@langchain/community   Community tools, vector stores, integrations
LangSmith              Observability platform (env vars, no npm install)
LangGraph              Complex stateful agent workflows (separate package)

Version history (one-glance)

v0.0.x (2023)      LLMChain, SequentialChain             DEPRECATED
v0.1.x (2023-24)   LCEL introduced, legacy still works   TRANSITIONAL
v0.2.x (2024)      LCEL primary, legacy deprecated       STABLE
v0.3.x (2024-25)   Modular packages, cleaner imports     LATEST

KEY RULE: if you see LLMChain or SequentialChain in a tutorial, it's legacy.
          Always use:  prompt.pipe(model).pipe(parser)

Quick mental model

LangChain = composable building blocks for LLM apps

  Prompt Template  -- format the input
  Model            -- call the LLM
  Output Parser    -- extract structured output
  Tool             -- give the LLM actions (search, calculate, API calls)
  Memory           -- make conversations stateful
  Agent            -- let the LLM decide which tools to call
  LCEL             -- pipe everything together: invoke / stream / batch for free

Decision:
  Fixed steps?       -> Chain   (prompt | model | parser)
  Dynamic steps?     -> Agent   (think-act-observe loop)
  Simple API call?   -> Raw SDK (no framework needed)
  Complex workflow?  -> LangGraph (directed graph of agents)

End of 4.17 quick revision.