Episode 4 — Generative AI Engineering / 4.17 — LangChain Practical

4.17.e — LCEL Overview

In one sentence: LCEL (LangChain Expression Language) is the declarative composition layer that lets you pipe Runnables together with prompt.pipe(model).pipe(parser) syntax, giving you streaming, batching, parallel execution, branching, and fallbacks for free; it is also the reason LangChain deprecated the legacy chain classes.

Navigation: <- 4.17.d Working with Agents | 4.17 Overview ->


1. What Is LCEL?

LCEL (LangChain Expression Language) is LangChain's composition system: it defines how components connect. Every component in LangChain implements the Runnable interface, and LCEL provides the operators and utilities to wire Runnables into complex pipelines.

The core idea

Any Runnable can be piped to any other Runnable.
The output of one becomes the input of the next.

prompt.pipe(model).pipe(parser)

This creates a new Runnable that:
  1. Takes the prompt's input type (a dictionary of variables)
  2. Produces the parser's output type (a string, JSON object, etc.)
  3. Automatically supports .invoke(), .stream(), .batch()

Before LCEL (legacy chains)

// LEGACY — class-based, rigid, hard to compose
import { LLMChain } from 'langchain/chains';
import { SequentialChain } from 'langchain/chains';

const chain1 = new LLMChain({ llm: model, prompt: prompt1, outputKey: 'summary' });
const chain2 = new LLMChain({ llm: model, prompt: prompt2, outputKey: 'translation' });

const sequential = new SequentialChain({
  chains: [chain1, chain2],
  inputVariables: ['text'],
  outputVariables: ['translation']
});

After LCEL (modern approach)

// MODERN — pipe-based, flexible, composable
import { RunnableSequence } from '@langchain/core/runnables';

const summarize = prompt1.pipe(model).pipe(parser);
const translate = prompt2.pipe(model).pipe(parser);

const pipeline = RunnableSequence.from([
  { summary: summarize },
  (input) => ({ text: input.summary, language: 'French' }),
  translate
]);

2. The Pipe Operator

The .pipe() method is the fundamental building block. It connects two Runnables in sequence: the output of the first flows into the input of the second.

Basic piping

import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

const prompt = ChatPromptTemplate.fromMessages([
  ['system', 'You are a helpful assistant.'],
  ['user', '{input}']
]);

const model = new ChatOpenAI({ modelName: 'gpt-4o' });
const parser = new StringOutputParser();

// Pipe: prompt -> model -> parser
const chain = prompt.pipe(model).pipe(parser);

// What happens at each stage:
// 1. prompt.invoke({input: "Hello"}) 
//    → [SystemMessage("You are a helpful assistant."), HumanMessage("Hello")]
//
// 2. model.invoke(messages) 
//    → AIMessage({ content: "Hi! How can I help?" })
//
// 3. parser.invoke(aiMessage) 
//    → "Hi! How can I help?"

const result = await chain.invoke({ input: 'Hello' });
// "Hi! How can I help?"

Every pipe creates a new Runnable

const step1 = prompt.pipe(model);         // Runnable: {variables} -> AIMessage
const step2 = step1.pipe(parser);          // Runnable: {variables} -> string
const step3 = step2.pipe(someTransform);   // Runnable: {variables} -> transformed output

// step2 is independent of step3
const result2 = await step2.invoke({ input: 'Hello' }); // Still works

The Runnable interface

Every component in LCEL implements these methods:

// The Runnable protocol (simplified; real signatures also accept an optional config argument)
interface Runnable<Input, Output> {
  invoke(input: Input): Promise<Output>;                 // Single input -> single output
  stream(input: Input): Promise<AsyncIterable<Output>>;  // Single input -> streamed output
  batch(inputs: Input[]): Promise<Output[]>;             // Multiple inputs -> multiple outputs
  pipe(next: Runnable): Runnable;                        // Connect to next Runnable
}

// Because every pipe creates a Runnable, the CHAIN itself supports all these methods:
const chain = prompt.pipe(model).pipe(parser);

await chain.invoke({ input: 'Hello' });                    // Single call
await chain.stream({ input: 'Hello' });                    // Streaming
await chain.batch([{ input: 'A' }, { input: 'B' }]);      // Batch

3. RunnableSequence

RunnableSequence is the explicit version of piping. It takes an array of steps and runs them in order. This is useful when you need to build chains dynamically or insert transformation functions between components.

import { RunnableSequence } from '@langchain/core/runnables';
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0 });
const parser = new StringOutputParser();

const chain = RunnableSequence.from([
  // Step 1: Format the prompt
  ChatPromptTemplate.fromMessages([
    ['system', 'Extract all proper nouns from the text.'],
    ['user', '{text}']
  ]),
  // Step 2: Call the model
  model,
  // Step 3: Parse to string
  parser,
  // Step 4: Transform the output (plain function)
  (output) => {
    const nouns = output.split(',').map(n => n.trim());
    return { nouns, count: nouns.length };
  }
]);

const result = await chain.invoke({
  text: 'Barack Obama met with Angela Merkel in Berlin to discuss NATO.'
});

console.log(result);
// { nouns: ["Barack Obama", "Angela Merkel", "Berlin", "NATO"], count: 4 }

Inline functions in sequences

Plain functions are automatically wrapped as RunnableLambda inside a RunnableSequence:

const chain = RunnableSequence.from([
  // Function: preprocess input
  (input) => ({ ...input, text: input.text.toLowerCase().trim() }),

  // Prompt template
  ChatPromptTemplate.fromMessages([
    ['system', 'Classify the sentiment of this text as positive, negative, or neutral.'],
    ['user', '{text}']
  ]),

  // Model
  model,

  // Parser
  parser,

  // Function: postprocess output
  (output) => ({
    sentiment: output.trim().toLowerCase(),
    timestamp: new Date().toISOString()
  })
]);

4. RunnableParallel

RunnableParallel runs multiple Runnables simultaneously on the same input and returns all results as a dictionary. This is extremely useful when you need to extract multiple pieces of information from the same input.

import { RunnableParallel } from '@langchain/core/runnables';
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0 });
const parser = new StringOutputParser();

// Three different analyses run in parallel
const sentimentChain = ChatPromptTemplate.fromMessages([
  ['system', 'Classify sentiment as positive, negative, or neutral. Reply with one word.'],
  ['user', '{text}']
]).pipe(model).pipe(parser);

const topicChain = ChatPromptTemplate.fromMessages([
  ['system', 'Extract the main topic in 3 words or fewer.'],
  ['user', '{text}']
]).pipe(model).pipe(parser);

const languageChain = ChatPromptTemplate.fromMessages([
  ['system', 'Detect the language. Reply with the language name only.'],
  ['user', '{text}']
]).pipe(model).pipe(parser);

// Run all three in parallel
const analysisChain = RunnableParallel.from({
  sentiment: sentimentChain,
  topic: topicChain,
  language: languageChain
});

const result = await analysisChain.invoke({
  text: 'LangChain makes it really easy to build AI applications. I love it!'
});

console.log(result);
// {
//   sentiment: "positive",
//   topic: "AI development tools",
//   language: "English"
// }

Parallel + Sequential combined

import { RunnableSequence, RunnableParallel } from '@langchain/core/runnables';

const pipeline = RunnableSequence.from([
  // Step 1: Run analyses in parallel
  RunnableParallel.from({
    sentiment: sentimentChain,
    topic: topicChain,
    language: languageChain,
    originalText: (input) => input.text  // Pass through
  }),

  // Step 2: Use the parallel results to generate a report
  (results) => ({
    report: `Analysis Report:
- Text: "${results.originalText.substring(0, 50)}..."
- Sentiment: ${results.sentiment}
- Topic: ${results.topic}
- Language: ${results.language}`
  })
]);

const report = await pipeline.invoke({
  text: 'LangChain makes building AI applications really straightforward.'
});

console.log(report);
// {
//   report: "Analysis Report:
//     - Text: \"LangChain makes building AI applications really...\"
//     - Sentiment: positive
//     - Topic: AI development
//     - Language: English"
// }

Using object syntax (shorthand for RunnableParallel)

When you pass an object as a step in RunnableSequence.from(), it is automatically treated as a RunnableParallel:

const chain = RunnableSequence.from([
  // This object is automatically a RunnableParallel
  {
    summary: summarizeChain,
    keywords: keywordChain,
    wordCount: (input) => input.text.split(' ').length
  },
  // Next step receives { summary, keywords, wordCount }
  formatResultChain
]);

5. RunnableLambda

RunnableLambda wraps a plain function as a Runnable, giving it the full Runnable interface (invoke, stream, batch, pipe).

import { RunnableLambda } from '@langchain/core/runnables';

// Wrap a function as a Runnable
const upperCase = RunnableLambda.from((input) => input.toUpperCase());
const addTimestamp = RunnableLambda.from((input) => ({
  text: input,
  processedAt: new Date().toISOString()
}));

// Now it's pipeable
const chain = prompt.pipe(model).pipe(parser).pipe(upperCase).pipe(addTimestamp);

const result = await chain.invoke({ input: 'hello' });
// { text: "HI! HOW CAN I HELP YOU?", processedAt: "2025-04-11T10:30:00.000Z" }

Async functions

const fetchData = RunnableLambda.from(async (input) => {
  const response = await fetch(`https://api.example.com/data/${input.id}`);
  const data = await response.json();
  return { ...input, data };
});

const chain = fetchData.pipe(prompt).pipe(model).pipe(parser);

Error handling in lambdas

const safeParse = RunnableLambda.from((input) => {
  try {
    return JSON.parse(input);
  } catch (error) {
    return { error: `Failed to parse JSON: ${error.message}`, raw: input };
  }
});

6. Streaming with LCEL

One of LCEL's biggest advantages: streaming works automatically through the entire chain. When you call .stream(), every component that supports streaming passes chunks through without waiting for the full output.

Basic streaming

import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

const chain = ChatPromptTemplate.fromMessages([
  ['system', 'You are a storyteller.'],
  ['user', 'Tell a short story about {topic}.']
]).pipe(
  new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0.8 })
).pipe(
  new StringOutputParser()
);

// Stream token-by-token
const stream = await chain.stream({ topic: 'a robot learning to cook' });

for await (const chunk of stream) {
  process.stdout.write(chunk);
  // Prints: "Once" "upon" "a" "time" "," "in" "a" "small" ...
}

Streaming with HTTP (Express example)

import express from 'express';

const app = express();
app.use(express.json());  // Needed so req.body is populated from JSON requests

app.post('/api/chat', async (req, res) => {
  const { message } = req.body;

  // Set headers for Server-Sent Events
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');

  const stream = await chain.stream({ input: message });

  for await (const chunk of stream) {
    res.write(`data: ${JSON.stringify({ token: chunk })}\n\n`);
  }

  res.write('data: [DONE]\n\n');
  res.end();
});

Streaming through complex chains

import { RunnableSequence, RunnableLambda } from '@langchain/core/runnables';

// Streaming works through the entire sequence
const chain = RunnableSequence.from([
  ChatPromptTemplate.fromMessages([
    ['system', 'You are a translator.'],
    ['user', 'Translate to {language}: {text}']
  ]),
  new ChatOpenAI({ modelName: 'gpt-4o', streaming: true }),
  new StringOutputParser()
]);

// The stream flows through: prompt (instant) -> model (streams tokens) -> parser (passes through)
const stream = await chain.stream({
  language: 'Spanish',
  text: 'Hello, how are you?'
});

let fullOutput = '';
for await (const chunk of stream) {
  fullOutput += chunk;
  process.stdout.write(chunk);
}
// fullOutput: "Hola, ¿cómo estás?"

Stream events (detailed streaming)

For more granular control, streamEvents gives you events for every step in the chain:

const chain = prompt.pipe(model).pipe(parser);

const eventStream = chain.streamEvents(
  { input: 'Tell me about JavaScript' },
  { version: 'v2' }
);

for await (const event of eventStream) {
  if (event.event === 'on_chat_model_stream') {
    // Token chunk from the chat model
    process.stdout.write(event.data.chunk.content || '');
  } else if (event.event === 'on_chain_start') {
    console.log(`\n[Chain started: ${event.name}]`);
  } else if (event.event === 'on_chain_end') {
    console.log(`\n[Chain ended: ${event.name}]`);
  }
}

7. Branching and Routing

LCEL supports dynamic routing — sending input to different chains based on conditions.

RunnableBranch

import { RunnableBranch, RunnableSequence } from '@langchain/core/runnables';
import { ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

const model = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0 });
const parser = new StringOutputParser();

// Different chains for different input types
const codeChain = ChatPromptTemplate.fromMessages([
  ['system', 'You are a code expert. Explain the code step by step.'],
  ['user', '{input}']
]).pipe(model).pipe(parser);

const mathChain = ChatPromptTemplate.fromMessages([
  ['system', 'You are a math tutor. Solve the problem step by step.'],
  ['user', '{input}']
]).pipe(model).pipe(parser);

const generalChain = ChatPromptTemplate.fromMessages([
  ['system', 'You are a helpful assistant.'],
  ['user', '{input}']
]).pipe(model).pipe(parser);

// Route based on input classification
const classifyChain = ChatPromptTemplate.fromMessages([
  ['system', 'Classify the input as "code", "math", or "general". Reply with one word only.'],
  ['user', '{input}']
]).pipe(model).pipe(parser);

// Build the router
const router = RunnableSequence.from([
  {
    category: classifyChain,
    input: (input) => input.input
  },
  RunnableBranch.from([
    [
      (x) => x.category.trim().toLowerCase() === 'code',
      (x) => codeChain.invoke({ input: x.input })
    ],
    [
      (x) => x.category.trim().toLowerCase() === 'math',
      (x) => mathChain.invoke({ input: x.input })
    ],
    // Default branch
    (x) => generalChain.invoke({ input: x.input })
  ])
]);

// Test routing
const r1 = await router.invoke({ input: 'What does Array.map() do in JavaScript?' });
// Routes to codeChain

const r2 = await router.invoke({ input: 'What is the derivative of x^2 + 3x?' });
// Routes to mathChain

const r3 = await router.invoke({ input: 'What is the capital of France?' });
// Routes to generalChain

Custom routing with RunnableLambda

import { RunnableLambda } from '@langchain/core/runnables';

const routeByLanguage = RunnableLambda.from(async (input) => {
  const languageMap = {
    javascript: jsExpertChain,
    python: pythonExpertChain,
    rust: rustExpertChain
  };

  const language = input.language?.toLowerCase() || 'javascript';
  const chain = languageMap[language] || generalChain;

  return chain.invoke({ input: input.question });
});

const result = await routeByLanguage.invoke({
  language: 'python',
  question: 'How do decorators work?'
});

8. Fallbacks

LCEL supports fallbacks — if one Runnable fails, automatically try another. This is essential for production reliability.

import { ChatOpenAI } from '@langchain/openai';
import { ChatAnthropic } from '@langchain/anthropic';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

const parser = new StringOutputParser();

// Primary model
const primaryModel = new ChatOpenAI({ modelName: 'gpt-4o', temperature: 0 });

// Fallback model (different provider!)
const fallbackModel = new ChatAnthropic({ modelName: 'claude-sonnet-4-20250514', temperature: 0 });

// Create model with fallback
const reliableModel = primaryModel.withFallbacks({
  fallbacks: [fallbackModel]
});

// If OpenAI is down, automatically falls back to Claude
const chain = ChatPromptTemplate.fromMessages([
  ['system', 'You are a helpful assistant.'],
  ['user', '{input}']
]).pipe(reliableModel).pipe(parser);

const result = await chain.invoke({ input: 'Hello!' });
// Tries OpenAI first. If it fails (rate limit, outage, error),
// automatically retries with Claude.

Multi-level fallbacks

const model = new ChatOpenAI({ modelName: 'gpt-4o' })
  .withFallbacks({
    fallbacks: [
      new ChatOpenAI({ modelName: 'gpt-4o-mini' }),        // Try cheaper model
      new ChatAnthropic({ modelName: 'claude-sonnet-4-20250514' }), // Try different provider
    ]
  });

// Fallback chain: gpt-4o -> gpt-4o-mini -> claude-sonnet

Chain-level fallbacks

const primaryChain = prompt1.pipe(model1).pipe(parser);
const fallbackChain = prompt2.pipe(model2).pipe(parser);

const reliableChain = primaryChain.withFallbacks({
  fallbacks: [fallbackChain]
});

9. Why LCEL Replaced Legacy Chain Classes

| Feature | Legacy Chains | LCEL |
| --- | --- | --- |
| Composition | Fixed set of chain types (LLMChain, SequentialChain, etc.) | Any Runnable composes with any other Runnable |
| Streaming | Not supported, or requires custom implementation | Built-in on every chain via .stream() |
| Batching | Manual loops with Promise.all | Built-in via .batch() with concurrency control |
| Parallel execution | Not natively supported | RunnableParallel runs branches simultaneously |
| Branching | Requires custom code | RunnableBranch built-in |
| Fallbacks | Manual try/catch | .withFallbacks() built-in |
| Type flow | Loose (dict in, dict out) | Input/output types flow through the chain |
| Observability | Limited | Every step visible in LangSmith traces |
| Testability | Test the whole chain | Test each component independently |

The key insight

Legacy chains were classes — each chain type was a specific class with specific behavior. If your use case didn't fit a predefined chain type, you were stuck.

LCEL is a protocol — any component that implements the Runnable interface can be piped to any other component. This means you can build any pipeline shape without being limited to predefined patterns.
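The protocol idea is small enough to sketch in a few lines of plain JavaScript. This is purely illustrative, not the real @langchain/core implementation: anything that exposes invoke() and pipe() composes with anything else that does.

```javascript
// A 'runnable' is just an object with invoke() and pipe()
const runnable = (fn) => ({
  invoke: async (input) => fn(input),
  pipe(next) {
    // Composition yields another runnable, so pipes keep chaining
    return runnable(async (input) => next.invoke(await fn(input)));
  }
});

const double = runnable((x) => x * 2);
const addOne = runnable((x) => x + 1);
const chain = double.pipe(addOne).pipe(double);

chain.invoke(5).then((result) => {
  console.log(result); // (5 * 2 + 1) * 2 = 22
});
```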


10. Common LCEL Patterns

Pattern 1: Passthrough (keep original input)

import { RunnablePassthrough } from '@langchain/core/runnables';

// Include original input alongside model output
const chain = RunnableSequence.from([
  {
    original: new RunnablePassthrough(),                // Pass input through unchanged
    processed: prompt.pipe(model).pipe(parser)          // Process with LLM
  },
  (result) => ({
    input: result.original.text,
    output: result.processed,
    processedAt: new Date().toISOString()
  })
]);

Pattern 2: Assign (add new keys to input)

import { RunnablePassthrough } from '@langchain/core/runnables';

// RunnablePassthrough.assign() adds new keys while keeping existing ones
const chain = RunnablePassthrough.assign({
  // Add a 'wordCount' key to the input
  wordCount: (input) => input.text.split(' ').length,
  // Add a 'summary' key from LLM
  summary: summarizeChain
}).pipe(
  // Next step receives { text: "...", wordCount: 42, summary: "..." }
  formatChain
);

Pattern 3: Dynamic chain selection

const modelMap = {
  fast: new ChatOpenAI({ modelName: 'gpt-4o-mini' }),
  quality: new ChatOpenAI({ modelName: 'gpt-4o' }),
  creative: new ChatOpenAI({ modelName: 'gpt-4o', temperature: 1.2 })
};

const dynamicChain = RunnableLambda.from(async (input) => {
  const model = modelMap[input.mode] || modelMap.fast;
  const chain = prompt.pipe(model).pipe(parser);
  return chain.invoke({ input: input.text });
});

await dynamicChain.invoke({ mode: 'quality', text: 'Explain quantum computing' });
await dynamicChain.invoke({ mode: 'fast', text: 'What is 2+2?' });

Pattern 4: Retry with exponential backoff

import { ChatOpenAI } from '@langchain/openai';

const model = new ChatOpenAI({
  modelName: 'gpt-4o',
  maxRetries: 3  // Built-in retry on transient errors
});

// For custom retry logic on the chain level
const chain = prompt.pipe(model).pipe(parser).withRetry({
  stopAfterAttempt: 3
});

11. Key Takeaways

  1. LCEL is the composition layer — it defines how components connect via the .pipe() operator and the Runnable protocol.
  2. Every Runnable supports invoke, stream, batch — building a chain with LCEL gives you all three execution modes automatically.
  3. RunnableSequence runs steps in order, RunnableParallel runs steps simultaneously, RunnableLambda wraps plain functions.
  4. Streaming works through the entire chain — call .stream() on any LCEL chain and tokens flow through from the model to the consumer without buffering.
  5. RunnableBranch enables dynamic routing — send input to different chains based on conditions (content type, language, complexity).
  6. Fallbacks provide production reliability: .withFallbacks() automatically tries alternative models or chains when the primary fails.
  7. LCEL replaced legacy chains because it is composable (any Runnable works), streamable, batchable, and observable by default — no special handling needed.
  8. Use the object shorthand for parallel execution — passing { key1: chain1, key2: chain2 } as a step in a sequence automatically runs the chains in parallel.

Explain-It Challenge

  1. Rewrite this legacy code using LCEL:

    const chain = new LLMChain({ llm: model, prompt: prompt });
    const result = await chain.call({ input: 'Hello' });
    

    Then explain what you gain (streaming, batching, composability).

  2. Design an LCEL pipeline that takes a user's question, classifies it (code/math/general), routes it to the appropriate specialized chain, and returns the result. Show the full pipeline structure.

  3. Your production app uses GPT-4o but occasionally hits rate limits. Design a fallback strategy using LCEL that tries GPT-4o, then GPT-4o-mini, then Claude, logging which model was ultimately used.


Navigation: <- 4.17.d Working with Agents | 4.17 Overview ->