Episode 4 — Generative AI Engineering / 4.17 — LangChain Practical
4.17.e — LCEL Overview
In one sentence: LCEL (LangChain Expression Language) is the declarative composition layer that lets you pipe Runnables together with `prompt | model | parser` syntax — giving you streaming, batching, parallel execution, branching, and fallbacks for free — and it is the reason LangChain deprecated the legacy chain classes.
Navigation: <- 4.17.d Working with Agents | 4.17 Overview ->
1. What Is LCEL?
LCEL (LangChain Expression Language) is LangChain's composition system. It defines how components connect together. Every component in LangChain implements the Runnable interface, and LCEL provides the operators and utilities to wire Runnables into complex pipelines.
The core idea
Any Runnable can be piped to any other Runnable.
The output of one becomes the input of the next.
prompt.pipe(model).pipe(parser)
This creates a new Runnable that:
1. Takes the prompt's input type (a dictionary of variables)
2. Produces the parser's output type (a string, JSON object, etc.)
3. Automatically supports .invoke(), .stream(), .batch()
Before LCEL (legacy chains)
// LEGACY — class-based, rigid, hard to compose
import { LLMChain, SequentialChain } from 'langchain/chains';
const chain1 = new LLMChain({ llm: model, prompt: prompt1, outputKey: 'summary' });
const chain2 = new LLMChain({ llm: model, prompt: prompt2, outputKey: 'translation' });
const sequential = new SequentialChain({
chains: [chain1, chain2],
inputVariables: ['text'],
outputVariables: ['translation']
});
After LCEL (modern approach)
// MODERN — pipe-based, flexible, composable
import { RunnableSequence } from '@langchain/core/runnables';
const summarize = prompt1.pipe(model).pipe(parser);
const translate = prompt2.pipe(model).pipe(parser);
const pipeline = RunnableSequence.from([
{ summary: summarize },
(input) => ({ text: input.summary, language: 'French' }),
translate
]);
2. The Pipe Operator
The .pipe() method is the fundamental building block. It connects two Runnables in sequence: the output of the first flows into the input of the second.
Basic piping
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
const prompt = ChatPromptTemplate.fromMessages([
['system', 'You are a helpful assistant.'],
['user', '{input}']
]);
const model = new ChatOpenAI({ modelName: 'gpt-4o' });
const parser = new StringOutputParser();
// Pipe: prompt -> model -> parser
const chain = prompt.pipe(model).pipe(parser);
// What happens at each stage:
// 1. prompt.invoke({input: "Hello"})
// → [SystemMessage("You are a helpful assistant."), HumanMessage("Hello")]
//
// 2. model.invoke(messages)
// → AIMessage({ content: "Hi! How can I help?" })
//
// 3. parser.invoke(aiMessage)
// → "Hi! How can I help?"
const result = await chain.invoke({ input: 'Hello' });
// "Hi! How can I help?"
Every pipe creates a new Runnable
const step1 = prompt.pipe(model); // Runnable: {variables} -> AIMessage
const step2 = step1.pipe(parser); // Runnable: {variables} -> string
const step3 = step2.pipe(someTransform); // Runnable: {variables} -> transformed output
// step2 is independent of step3
const result2 = await step2.invoke({ input: 'Hello' }); // Still works
The Runnable interface
Every component in LCEL implements these methods:
// The Runnable Protocol
interface Runnable<Input, Output> {
invoke(input: Input): Promise<Output>; // Single input -> single output
stream(input: Input): AsyncGenerator<Output>; // Single input -> streamed output
batch(inputs: Input[]): Promise<Output[]>; // Multiple inputs -> multiple outputs
pipe(next: Runnable): Runnable; // Connect to next Runnable
}
// Because every pipe creates a Runnable, the CHAIN itself supports all these methods:
const chain = prompt.pipe(model).pipe(parser);
await chain.invoke({ input: 'Hello' }); // Single call
await chain.stream({ input: 'Hello' }); // Streaming
await chain.batch([{ input: 'A' }, { input: 'B' }]); // Batch
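LangChain's `.batch()` also accepts call options for concurrency control (a `maxConcurrency` cap in the JS library). To make the mechanism concrete, here is a toy, self-contained sketch of concurrency-limited batching; `batchWithLimit` and `double` are illustrative names, not LangChain internals:

```typescript
// Toy sketch of concurrency-limited batching (illustrative, NOT LangChain's implementation).
// Processes every input, but never runs more than `maxConcurrency` calls at once.
async function batchWithLimit<I, O>(
  inputs: I[],
  invoke: (input: I) => Promise<O>,
  maxConcurrency: number
): Promise<O[]> {
  const results: O[] = new Array(inputs.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index until inputs run out
  async function worker(): Promise<void> {
    while (next < inputs.length) {
      const i = next++;
      results[i] = await invoke(inputs[i]);
    }
  }

  const workers = Array.from(
    { length: Math.min(maxConcurrency, inputs.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results; // same order as inputs, like chain.batch()
}

// Usage: a fake "model call" that doubles a number after a short delay
const double = (n: number) =>
  new Promise<number>((resolve) => setTimeout(() => resolve(n * 2), 10));

batchWithLimit([1, 2, 3, 4, 5], double, 2).then(console.log);
// [2, 4, 6, 8, 10]
```

The key property is that results come back in input order even though execution order can interleave — the same guarantee `chain.batch()` gives you.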
3. RunnableSequence
RunnableSequence is the explicit version of piping. It takes an array of steps and runs them in order. This is useful when you need to build chains dynamically or insert transformation functions between components.
import { RunnableSequence } from '@langchain/core/runnables';
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0 });
const parser = new StringOutputParser();
const chain = RunnableSequence.from([
// Step 1: Format the prompt
ChatPromptTemplate.fromMessages([
['system', 'Extract all proper nouns from the text.'],
['user', '{text}']
]),
// Step 2: Call the model
model,
// Step 3: Parse to string
parser,
// Step 4: Transform the output (plain function)
(output) => ({
nouns: output.split(',').map(n => n.trim()),
count: output.split(',').length
})
]);
const result = await chain.invoke({
text: 'Barack Obama met with Angela Merkel in Berlin to discuss NATO.'
});
console.log(result);
// { nouns: ["Barack Obama", "Angela Merkel", "Berlin", "NATO"], count: 4 }
Inline functions in sequences
Plain functions are automatically wrapped as RunnableLambda inside a RunnableSequence:
const chain = RunnableSequence.from([
// Function: preprocess input
(input) => ({ ...input, text: input.text.toLowerCase().trim() }),
// Prompt template
ChatPromptTemplate.fromMessages([
['system', 'Classify the sentiment of this text as positive, negative, or neutral.'],
['user', '{text}']
]),
// Model
model,
// Parser
parser,
// Function: postprocess output
(output) => ({
sentiment: output.trim().toLowerCase(),
timestamp: new Date().toISOString()
})
]);
4. RunnableParallel
RunnableParallel runs multiple Runnables simultaneously on the same input and returns all results as a dictionary. This is extremely useful when you need to extract multiple pieces of information from the same input.
import { RunnableParallel } from '@langchain/core/runnables';
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0 });
const parser = new StringOutputParser();
// Three different analyses run in parallel
const sentimentChain = ChatPromptTemplate.fromMessages([
['system', 'Classify sentiment as positive, negative, or neutral. Reply with one word.'],
['user', '{text}']
]).pipe(model).pipe(parser);
const topicChain = ChatPromptTemplate.fromMessages([
['system', 'Extract the main topic in 3 words or fewer.'],
['user', '{text}']
]).pipe(model).pipe(parser);
const languageChain = ChatPromptTemplate.fromMessages([
['system', 'Detect the language. Reply with the language name only.'],
['user', '{text}']
]).pipe(model).pipe(parser);
// Run all three in parallel
const analysisChain = RunnableParallel.from({
sentiment: sentimentChain,
topic: topicChain,
language: languageChain
});
const result = await analysisChain.invoke({
text: 'LangChain makes it really easy to build AI applications. I love it!'
});
console.log(result);
// {
// sentiment: "positive",
// topic: "AI development tools",
// language: "English"
// }
Parallel + Sequential combined
import { RunnableSequence, RunnableParallel } from '@langchain/core/runnables';
const pipeline = RunnableSequence.from([
// Step 1: Run analyses in parallel
RunnableParallel.from({
sentiment: sentimentChain,
topic: topicChain,
language: languageChain,
originalText: (input) => input.text // Pass through
}),
// Step 2: Use the parallel results to generate a report
(results) => ({
report: `Analysis Report:
- Text: "${results.originalText.substring(0, 50)}..."
- Sentiment: ${results.sentiment}
- Topic: ${results.topic}
- Language: ${results.language}`
})
]);
const report = await pipeline.invoke({
text: 'LangChain makes building AI applications really straightforward.'
});
console.log(report);
// {
// report: "Analysis Report:
// - Text: \"LangChain makes building AI applications really...\"
// - Sentiment: positive
// - Topic: AI development
// - Language: English"
// }
Using object syntax (shorthand for RunnableParallel)
When you pass an object as a step in RunnableSequence.from(), it is automatically treated as a RunnableParallel:
const chain = RunnableSequence.from([
// This object is automatically a RunnableParallel
{
summary: summarizeChain,
keywords: keywordChain,
wordCount: (input) => input.text.split(' ').length
},
// Next step receives { summary, keywords, wordCount }
formatResultChain
]);
5. RunnableLambda
RunnableLambda wraps a plain function as a Runnable, giving it the full Runnable interface (invoke, stream, batch, pipe).
import { RunnableLambda } from '@langchain/core/runnables';
// Wrap a function as a Runnable
const upperCase = RunnableLambda.from((input) => input.toUpperCase());
const addTimestamp = RunnableLambda.from((input) => ({
text: input,
processedAt: new Date().toISOString()
}));
// Now it's pipeable
const chain = prompt.pipe(model).pipe(parser).pipe(upperCase).pipe(addTimestamp);
const result = await chain.invoke({ input: 'hello' });
// { text: "HI! HOW CAN I HELP YOU?", processedAt: "2025-04-11T10:30:00.000Z" }
Async functions
const fetchData = RunnableLambda.from(async (input) => {
const response = await fetch(`https://api.example.com/data/${input.id}`);
const data = await response.json();
return { ...input, data };
});
const chain = fetchData.pipe(prompt).pipe(model).pipe(parser);
Error handling in lambdas
const safeParse = RunnableLambda.from((input) => {
try {
return JSON.parse(input);
} catch (error) {
return { error: `Failed to parse JSON: ${error.message}`, raw: input };
}
});
6. Streaming with LCEL
One of LCEL's biggest advantages: streaming works automatically through the entire chain. When you call .stream(), every component that supports streaming passes chunks through without waiting for the full output.
Basic streaming
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
const chain = ChatPromptTemplate.fromMessages([
['system', 'You are a storyteller.'],
['user', 'Tell a short story about {topic}.']
]).pipe(
new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0.8 })
).pipe(
new StringOutputParser()
);
// Stream token-by-token
const stream = await chain.stream({ topic: 'a robot learning to cook' });
for await (const chunk of stream) {
process.stdout.write(chunk);
// Prints: "Once" "upon" "a" "time" "," "in" "a" "small" ...
}
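The "passes chunks through without waiting" behavior can be sketched with plain async generators — a toy illustration of why a per-chunk step (like a string parser) does not break streaming; the function names here are made up for the example:

```typescript
// Toy sketch: why streaming flows through a chain without buffering.
// A fake "model" that yields tokens one at a time, as a real model would.
async function* fakeModelStream(tokens: string[]) {
  for (const t of tokens) {
    yield t;
  }
}

// A downstream "parser" step that transforms each chunk and forwards it
// immediately — it never waits for the full output.
async function* passThroughUpper(source: AsyncIterable<string>) {
  for await (const chunk of source) {
    yield chunk.toUpperCase();
  }
}

(async () => {
  const stream = passThroughUpper(fakeModelStream(['Once ', 'upon ', 'a ', 'time']));
  let full = '';
  for await (const chunk of stream) {
    full += chunk; // each chunk is available as soon as the "model" emits it
  }
  console.log(full); // "ONCE UPON A TIME"
})();
```

A step that needs the *entire* output (say, parsing a complete JSON document) cannot forward chunks this way, which is why some output parsers reduce a streamed chain to a single final chunk.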
Streaming with HTTP (Express example)
import express from 'express';
const app = express();
app.post('/api/chat', async (req, res) => {
const { message } = req.body;
// Set headers for Server-Sent Events
res.setHeader('Content-Type', 'text/event-stream');
res.setHeader('Cache-Control', 'no-cache');
res.setHeader('Connection', 'keep-alive');
const stream = await chain.stream({ input: message });
for await (const chunk of stream) {
res.write(`data: ${JSON.stringify({ token: chunk })}\n\n`);
}
res.write('data: [DONE]\n\n');
res.end();
});
Streaming through complex chains
import { RunnableSequence, RunnableLambda } from '@langchain/core/runnables';
// Streaming works through the entire sequence
const chain = RunnableSequence.from([
ChatPromptTemplate.fromMessages([
['system', 'You are a translator.'],
['user', 'Translate to {language}: {text}']
]),
new ChatOpenAI({ modelName: 'gpt-4o', streaming: true }),
new StringOutputParser()
]);
// The stream flows through: prompt (instant) -> model (streams tokens) -> parser (passes through)
const stream = await chain.stream({
language: 'Spanish',
text: 'Hello, how are you?'
});
let fullOutput = '';
for await (const chunk of stream) {
fullOutput += chunk;
process.stdout.write(chunk);
}
// fullOutput: "Hola, ¿cómo estás?"
Stream events (detailed streaming)
For more granular control, streamEvents gives you events for every step in the chain:
const chain = prompt.pipe(model).pipe(parser);
const eventStream = chain.streamEvents(
{ input: 'Tell me about JavaScript' },
{ version: 'v2' }
);
for await (const event of eventStream) {
if (event.event === 'on_chat_model_stream') {
// Token chunk from the chat model
process.stdout.write(event.data.chunk.content || '');
} else if (event.event === 'on_chain_start') {
console.log(`\n[Chain started: ${event.name}]`);
} else if (event.event === 'on_chain_end') {
console.log(`\n[Chain ended: ${event.name}]`);
}
}
7. Branching and Routing
LCEL supports dynamic routing — sending input to different chains based on conditions.
RunnableBranch
import { RunnableBranch, RunnableSequence } from '@langchain/core/runnables';
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0 });
const parser = new StringOutputParser();
// Different chains for different input types
const codeChain = ChatPromptTemplate.fromMessages([
['system', 'You are a code expert. Explain the code step by step.'],
['user', '{input}']
]).pipe(model).pipe(parser);
const mathChain = ChatPromptTemplate.fromMessages([
['system', 'You are a math tutor. Solve the problem step by step.'],
['user', '{input}']
]).pipe(model).pipe(parser);
const generalChain = ChatPromptTemplate.fromMessages([
['system', 'You are a helpful assistant.'],
['user', '{input}']
]).pipe(model).pipe(parser);
// Route based on input classification
const classifyChain = ChatPromptTemplate.fromMessages([
['system', 'Classify the input as "code", "math", or "general". Reply with one word only.'],
['user', '{input}']
]).pipe(model).pipe(parser);
// Build the router
const router = RunnableSequence.from([
{
category: classifyChain,
input: (input) => input.input
},
RunnableBranch.from([
[
(x) => x.category.trim().toLowerCase() === 'code',
(x) => codeChain.invoke({ input: x.input })
],
[
(x) => x.category.trim().toLowerCase() === 'math',
(x) => mathChain.invoke({ input: x.input })
],
// Default branch
(x) => generalChain.invoke({ input: x.input })
])
]);
// Test routing
const r1 = await router.invoke({ input: 'What does Array.map() do in JavaScript?' });
// Routes to codeChain
const r2 = await router.invoke({ input: 'What is the derivative of x^2 + 3x?' });
// Routes to mathChain
const r3 = await router.invoke({ input: 'What is the capital of France?' });
// Routes to generalChain
Custom routing with RunnableLambda
import { RunnableLambda } from '@langchain/core/runnables';
const routeByLanguage = RunnableLambda.from(async (input) => {
const languageMap = {
javascript: jsExpertChain,
python: pythonExpertChain,
rust: rustExpertChain
};
const language = input.language?.toLowerCase() || 'javascript';
const chain = languageMap[language] || generalChain;
return chain.invoke({ input: input.question });
});
const result = await routeByLanguage.invoke({
language: 'python',
question: 'How do decorators work?'
});
8. Fallbacks
LCEL supports fallbacks — if one Runnable fails, automatically try another. This is essential for production reliability.
import { ChatOpenAI } from '@langchain/openai';
import { ChatAnthropic } from '@langchain/anthropic';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
const parser = new StringOutputParser();
// Primary model
const primaryModel = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0 });
// Fallback model (different provider!)
const fallbackModel = new ChatAnthropic({ modelName: 'claude-sonnet-4-20250514', temperature: 0 });
// Create model with fallback
const reliableModel = primaryModel.withFallbacks({
fallbacks: [fallbackModel]
});
// If OpenAI is down, automatically falls back to Claude
const chain = ChatPromptTemplate.fromMessages([
['system', 'You are a helpful assistant.'],
['user', '{input}']
]).pipe(reliableModel).pipe(parser);
const result = await chain.invoke({ input: 'Hello!' });
// Tries OpenAI first. If it fails (rate limit, outage, error),
// automatically retries with Claude.
Multi-level fallbacks
const model = new ChatOpenAI({ modelName: 'gpt-4o' })
.withFallbacks({
fallbacks: [
new ChatOpenAI({ modelName: 'gpt-4o-mini' }), // Try cheaper model
new ChatAnthropic({ modelName: 'claude-sonnet-4-20250514' }), // Try different provider
]
});
// Fallback chain: gpt-4o -> gpt-4o-mini -> claude-sonnet
Chain-level fallbacks
const primaryChain = prompt1.pipe(model1).pipe(parser);
const fallbackChain = prompt2.pipe(model2).pipe(parser);
const reliableChain = primaryChain.withFallbacks({
fallbacks: [fallbackChain]
});
9. Why LCEL Replaced Legacy Chain Classes
| Feature | Legacy Chains | LCEL |
|---|---|---|
| Composition | Fixed set of chain types (LLMChain, SequentialChain, etc.) | Any Runnable composes with any other Runnable |
| Streaming | Not supported or requires custom implementation | Built-in on every chain via .stream() |
| Batching | Manual loops with Promise.all | Built-in via .batch() with concurrency control |
| Parallel execution | Not natively supported | RunnableParallel runs branches simultaneously |
| Branching | Requires custom code | RunnableBranch built-in |
| Fallbacks | Manual try/catch | .withFallbacks() built-in |
| Type flow | Loose (dict in, dict out) | Input/output types flow through the chain |
| Observability | Limited | Every step visible in LangSmith traces |
| Testability | Test the whole chain | Test each component independently |
The key insight
Legacy chains were classes — each chain type was a specific class with specific behavior. If your use case didn't fit a predefined chain type, you were stuck.
LCEL is a protocol — any component that implements the Runnable interface can be piped to any other component. This means you can build any pipeline shape without being limited to predefined patterns.
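The protocol idea can be shown with a tiny self-contained sketch — a `MiniRunnable` class invented for this example (not the real LangChain implementation): once any component exposes `invoke()`, composition, batching, and piping fall out of the protocol itself.

```typescript
// Toy illustration of the Runnable "protocol" idea (NOT the real LangChain classes).
type Fn<I, O> = (input: I) => O;

class MiniRunnable<I, O> {
  constructor(private fn: Fn<I, O>) {}

  invoke(input: I): O {
    return this.fn(input);
  }

  // batch() comes for free once invoke() exists
  batch(inputs: I[]): O[] {
    return inputs.map((i) => this.invoke(i));
  }

  // pipe() returns a NEW MiniRunnable, so chains compose indefinitely
  pipe<Next>(next: MiniRunnable<O, Next>): MiniRunnable<I, Next> {
    return new MiniRunnable((input: I) => next.invoke(this.invoke(input)));
  }
}

// Any "component" fits, as long as it speaks the protocol:
const template = new MiniRunnable((vars: { name: string }) => `Hello, ${vars.name}!`);
const shout = new MiniRunnable((s: string) => s.toUpperCase());
const count = new MiniRunnable((s: string) => ({ text: s, length: s.length }));

const chain = template.pipe(shout).pipe(count);
console.log(chain.invoke({ name: 'Ada' }));
// { text: 'HELLO, ADA!', length: 11 }
```

Nothing about `template`, `shout`, or `count` is a "chain class" — they simply implement the interface, which is exactly why LCEL can compose prompts, models, parsers, retrievers, and plain functions interchangeably.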
10. Common LCEL Patterns
Pattern 1: Passthrough (keep original input)
import { RunnablePassthrough } from '@langchain/core/runnables';
// Include original input alongside model output
const chain = RunnableSequence.from([
{
original: new RunnablePassthrough(), // Pass input through unchanged
processed: prompt.pipe(model).pipe(parser) // Process with LLM
},
(result) => ({
input: result.original.text,
output: result.processed,
processedAt: new Date().toISOString()
})
]);
Pattern 2: Assign (add new keys to input)
import { RunnablePassthrough } from '@langchain/core/runnables';
// RunnablePassthrough.assign() adds new keys while keeping existing ones
const chain = RunnablePassthrough.assign({
// Add a 'wordCount' key to the input
wordCount: (input) => input.text.split(' ').length,
// Add a 'summary' key from LLM
summary: summarizeChain
}).pipe(
// Next step receives { text: "...", wordCount: 42, summary: "..." }
formatChain
);
Pattern 3: Dynamic chain selection
const modelMap = {
fast: new ChatOpenAI({ modelName: 'gpt-4o-mini' }),
quality: new ChatOpenAI({ modelName: 'gpt-4o' }),
creative: new ChatOpenAI({ modelName: 'gpt-4o', temperature: 1.2 })
};
const dynamicChain = RunnableLambda.from(async (input) => {
const model = modelMap[input.mode] || modelMap.fast;
const chain = prompt.pipe(model).pipe(parser);
return chain.invoke({ input: input.text });
});
await dynamicChain.invoke({ mode: 'quality', text: 'Explain quantum computing' });
await dynamicChain.invoke({ mode: 'fast', text: 'What is 2+2?' });
Pattern 4: Retry with exponential backoff
import { ChatOpenAI } from '@langchain/openai';
const model = new ChatOpenAI({
modelName: 'gpt-4o',
maxRetries: 3 // Built-in retry on transient errors
});
// For custom retry logic on the chain level
const chain = prompt.pipe(model).pipe(parser).withRetry({
stopAfterAttempt: 3
});
11. Key Takeaways
- LCEL is the composition layer — it defines how components connect via the `.pipe()` operator and the Runnable protocol.
- Every Runnable supports invoke, stream, batch — building a chain with LCEL gives you all three execution modes automatically.
- RunnableSequence runs steps in order, RunnableParallel runs steps simultaneously, RunnableLambda wraps plain functions.
- Streaming works through the entire chain — call `.stream()` on any LCEL chain and tokens flow through from the model to the consumer without buffering.
- RunnableBranch enables dynamic routing — send input to different chains based on conditions (content type, language, complexity).
- Fallbacks provide production reliability — `.withFallbacks()` automatically tries alternative models or chains when the primary fails.
- LCEL replaced legacy chains because it is composable (any Runnable works), streamable, batchable, and observable by default — no special handling needed.
- Use the object shorthand for parallel execution — passing `{ key1: chain1, key2: chain2 }` as a step in a sequence automatically runs the chains in parallel.
Explain-It Challenge
- Rewrite this legacy code using LCEL:
  const chain = new LLMChain({ llm: model, prompt: prompt });
  const result = await chain.call({ input: 'Hello' });
  Then explain what you gain (streaming, batching, composability).
- Design an LCEL pipeline that takes a user's question, classifies it (code/math/general), routes it to the appropriate specialized chain, and returns the result. Show the full pipeline structure.
- Your production app uses GPT-4o but occasionally hits rate limits. Design a fallback strategy using LCEL that tries GPT-4o, then GPT-4o-mini, then Claude, logging which model was ultimately used.
Navigation: <- 4.17.d Working with Agents | 4.17 Overview ->