
4.16.d — Router Agents

In one sentence: The Router pattern uses a single routing agent to classify user intent and dispatch each request to the most appropriate specialized agent or handler — enabling multi-capability AI systems where different types of queries are handled by experts, with fallback routing for anything unrecognized.



1. What Is the Router Pattern?

The Router pattern places a lightweight routing agent at the front of your system. This agent:

  1. Analyzes the user's message to determine intent (what kind of request is this?)
  2. Classifies the intent into a known category
  3. Dispatches the request to a specialized agent or handler
  4. Handles fallbacks when no specialist matches

This is the AI equivalent of a telephone switchboard operator — the Router doesn't answer the question itself; it connects you to the right person.

┌────────────────────────────────────────────────────────────────┐
│                     ROUTER PATTERN                             │
│                                                                │
│  User Message ──────────────────────────────────────────►      │
│       │                                                        │
│       ▼                                                        │
│  ┌──────────────────────────────────┐                          │
│  │        ROUTER AGENT              │                          │
│  │                                  │                          │
│  │  "Classify this message.         │                          │
│  │   What type of request is it?"   │                          │
│  │                                  │                          │
│  │  Intent: code_help               │                          │
│  └──────────┬───────────────────────┘                          │
│             │                                                  │
│             ├── intent: "code_help"                             │
│             │   ┌──────────────┐                               │
│             ├──►│  Code Agent   │ "You are an expert coder..." │
│             │   └──────────────┘                               │
│             │                                                  │
│             ├── intent: "data_analysis"                         │
│             │   ┌──────────────┐                               │
│             ├──►│  Data Agent   │ "You analyze datasets..."    │
│             │   └──────────────┘                               │
│             │                                                  │
│             ├── intent: "creative_writing"                      │
│             │   ┌──────────────┐                               │
│             ├──►│  Writing Agent│ "You are a creative writer." │
│             │   └──────────────┘                               │
│             │                                                  │
│             ├── intent: "math"                                  │
│             │   ┌──────────────┐                               │
│             ├──►│  Math Agent   │ "You are a math solver..."   │
│             │   └──────────────┘                               │
│             │                                                  │
│             └── intent: "unknown"                               │
│                 ┌──────────────┐                               │
│                 │ Fallback     │ "You are a general assistant" │
│                 └──────────────┘                               │
└────────────────────────────────────────────────────────────────┘

2. When to Use the Router Pattern

| Use This Pattern When                                  | Do NOT Use When                          |
|--------------------------------------------------------|------------------------------------------|
| Your system handles multiple distinct task types       | You only handle one type of task         |
| Different tasks need different system prompts          | A single system prompt covers all tasks  |
| Different tasks need different tools                   | All tasks use the same tool set          |
| Different tasks need different models or temperatures  | Same model config works for everything   |
| You want specialized quality per task type             | General quality is acceptable            |
| You need to log and monitor by task category           | You don't need per-category analytics    |

Real-world examples

  • Customer support bot — routes to billing, technical support, sales, or general FAQ agents
  • Development assistant — routes to code generation, debugging, documentation, or architecture agents
  • Enterprise search — routes to structured data queries, document search, or web search
  • Multi-modal AI — routes to text, image, audio, or video processing pipelines
  • API gateway — routes requests to different backend services based on intent

3. Intent Classification as Routing

The core of the Router is intent classification: determining what the user wants. There are three common ways to do it: with an LLM, with keyword matching, or with a hybrid of the two.

Approach 1: LLM-based classification

The most flexible approach. An LLM reads the message and classifies it.

const ROUTER_SYSTEM_PROMPT = `You are a Request Router. Your ONLY job is to classify the user's message into one of the following categories:

CATEGORIES:
- code_help: Questions about writing, debugging, or understanding code
- data_analysis: Requests to analyze data, create charts, compute statistics
- creative_writing: Requests for creative content (stories, poems, marketing copy)
- math: Mathematical calculations, equations, proofs
- search: Requests that need web search or current information
- summarize: Requests to summarize documents, articles, or text
- general: General knowledge questions or conversations

RULES:
1. Return ONLY a JSON object with the classification.
2. If the message could fit multiple categories, pick the PRIMARY intent.
3. If truly ambiguous, return "general".
4. Be decisive — do not hedge.

OUTPUT FORMAT:
{
  "intent": "<category>",
  "confidence": <0.0-1.0>,
  "reasoning": "Brief explanation of why this category"
}`;
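
The prompt above asks for JSON, but a model can still return malformed text, a category that is not in the list, or a confidence outside 0.0-1.0. The following is a minimal sketch of a validation step that could sit between the model reply and the dispatch logic. `parseClassification` and `KNOWN_INTENTS` are hypothetical names introduced here for illustration; they are not part of the original code.

```javascript
// Hypothetical helper (not from the original lesson): validates the router's raw JSON reply.
const KNOWN_INTENTS = new Set([
  'code_help', 'data_analysis', 'creative_writing',
  'math', 'search', 'summarize', 'general',
]);

function parseClassification(rawText) {
  const fallback = { intent: 'general', confidence: 0, reasoning: 'unparseable router output' };

  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch {
    return fallback; // the reply was not valid JSON
  }
  if (parsed === null || typeof parsed !== 'object') return fallback;

  // Unknown or missing category: degrade to "general" instead of failing downstream.
  const intent = KNOWN_INTENTS.has(parsed.intent) ? parsed.intent : 'general';

  // Clamp confidence into [0, 1]; anything that is not a finite number becomes 0.
  const rawConfidence = Number(parsed.confidence);
  const confidence = Number.isFinite(rawConfidence)
    ? Math.min(Math.max(rawConfidence, 0), 1)
    : 0;

  return { intent, confidence, reasoning: String(parsed.reasoning ?? '') };
}

console.log(parseClassification('{"intent":"sql_help","confidence":1.7}').intent); // general
```

Returning a zero-confidence `general` result on bad input means the fallback tiers described later in this section handle the failure, rather than an exception.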

Approach 2: Keyword-based routing (fast, no LLM call)

For high-volume systems where latency matters, use simple keyword matching first and fall back to LLM classification only when keywords are ambiguous.

function keywordRoute(message) {
  const lowerMessage = message.toLowerCase();

  const patterns = [
    {
      intent: 'code_help',
      keywords: ['code', 'function', 'bug', 'error', 'debug', 'javascript', 'python', 'api', 'class', 'import', 'npm', 'git'],
    },
    {
      intent: 'data_analysis',
      keywords: ['data', 'csv', 'chart', 'analyze', 'statistics', 'average', 'trend', 'dataset', 'graph', 'plot'],
    },
    {
      intent: 'creative_writing',
      keywords: ['write a story', 'poem', 'creative', 'blog post', 'marketing', 'tagline', 'slogan', 'fiction'],
    },
    {
      intent: 'math',
      keywords: ['calculate', 'equation', 'solve', 'integral', 'derivative', 'probability', 'formula'],
    },
    {
      intent: 'summarize',
      keywords: ['summarize', 'summary', 'tldr', 'key points', 'brief', 'condense'],
    },
  ];

  for (const pattern of patterns) {
    const matchCount = pattern.keywords.filter((kw) => lowerMessage.includes(kw)).length;
    if (matchCount >= 2) {
      return { intent: pattern.intent, confidence: Math.min(0.6 + matchCount * 0.1, 0.95), method: 'keyword' };
    }
  }

  return { intent: 'general', confidence: 0.5, method: 'keyword_fallback' };
}
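
One caveat with `includes`: it matches substrings, so `'git'` matches "digits" and `'api'` matches "rapidly". A possible refinement, sketched below under the assumption that every keyword starts and ends with a letter or digit, is to match on word boundaries instead. `escapeRegExp` and `countWholeWordMatches` are hypothetical helpers, not part of the original code.

```javascript
// Hypothetical helpers (not from the original lesson).

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Count keywords that appear as whole words or whole phrases in the message.
function countWholeWordMatches(message, keywords) {
  const lowerMessage = message.toLowerCase();
  return keywords.filter((kw) => {
    const pattern = new RegExp(`\\b${escapeRegExp(kw)}\\b`);
    return pattern.test(lowerMessage);
  }).length;
}

const codeKeywords = ['code', 'bug', 'api', 'git', 'class'];
console.log(countWholeWordMatches('Add the digits rapidly', codeKeywords));     // 0
console.log(countWholeWordMatches('Fix the bug in my git hook', codeKeywords)); // 2
```

With plain `includes`, the first message would score two matches and be routed to `code_help`. The trade-off is that whole-word matching no longer catches inflected forms such as "bugs", so the keyword lists may need to include them explicitly.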

Approach 3: Hybrid routing (keyword first, LLM fallback)

async function hybridRoute(message) {
  // Try fast keyword routing first
  const keywordResult = keywordRoute(message);

  // If confidence is high enough, use keyword result
  if (keywordResult.confidence >= 0.8) {
    console.log(`  Routed by keywords: ${keywordResult.intent} (${keywordResult.confidence})`);
    return keywordResult;
  }

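  // Otherwise, fall back to LLM classification.
  // llmRoute is not defined in this section: it stands for any LLM-based classifier
  // that returns { intent, confidence }, such as classifyIntent in Section 6.
  console.log('  Keyword routing uncertain, using LLM classification...');
  return await llmRoute(message);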
}

4. Specialized Agents (Handlers)

Each route points to a specialized agent with its own system prompt, tools, and configuration.

const SPECIALIZED_AGENTS = {
  code_help: {
    systemPrompt: `You are an expert software engineer. Help the user with coding tasks.
- Write clean, well-commented code
- Explain your reasoning
- Follow best practices for the language in question
- Include error handling
- Suggest tests when appropriate`,
    model: 'gpt-4o',
    temperature: 0.2,
    tools: ['run_code', 'search_docs', 'lint_code'],
  },

  data_analysis: {
    systemPrompt: `You are a data analyst. Help the user analyze data and extract insights.
- Ask clarifying questions about the data if needed
- Compute relevant statistics
- Create clear visualizations (describe them if you cannot render)
- Explain findings in plain language
- Note any limitations or caveats in the analysis`,
    model: 'gpt-4o',
    temperature: 0,
    tools: ['load_csv', 'calculate_stats', 'generate_chart'],
  },

  creative_writing: {
    systemPrompt: `You are a creative writer. Produce engaging, original content.
- Match the requested tone and style
- Use vivid language and varied sentence structure
- Be creative but stay on topic
- Adapt to the target audience`,
    model: 'gpt-4o',
    temperature: 0.9,
    tools: [],
  },

  math: {
    systemPrompt: `You are a math expert. Solve problems step by step.
- Show all work clearly
- Explain each step
- Use proper mathematical notation
- Verify your answer
- Mention alternative approaches if they exist`,
    model: 'gpt-4o',
    temperature: 0,
    tools: ['calculator', 'wolfram_alpha'],
  },

  search: {
    systemPrompt: `You are a search and research assistant. Find current, accurate information.
- Search multiple sources
- Cite your sources
- Distinguish between facts and opinions
- Note when information may be outdated`,
    model: 'gpt-4o',
    temperature: 0.3,
    tools: ['web_search', 'news_search'],
  },

  summarize: {
    systemPrompt: `You are a summarization expert. Produce clear, accurate summaries.
- Capture the most important points
- Preserve the original meaning
- Use bullet points for clarity
- Note anything omitted that might be important
- Adjust length to the user's request`,
    model: 'gpt-4o',
    temperature: 0.1,
    tools: [],
  },

  general: {
    systemPrompt: `You are a helpful general assistant. Answer the user's question to the best of your ability. If you are unsure, say so.`,
    model: 'gpt-4o',
    temperature: 0.5,
    tools: [],
  },
};

Notice how each agent has its own system prompt, temperature, and tool list. The code agent uses temperature 0.2 for precise code, the data and math agents use 0 for repeatable answers, and the creative writing agent uses 0.9 for expressive prose. All agents here share one model, but the `model` field lets you change that per agent. This specialization is the key advantage of the Router pattern.
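
Because every agent is plain configuration, a single generic function can turn any entry into a chat-completion request. The sketch below illustrates the idea; `buildChatRequest` is a hypothetical helper, and the `mathAgent` object is a trimmed-down copy of the `math` entry above. The `tools` field holds names only, so this sketch leaves it out of the request; mapping names to full tool definitions is a separate step.

```javascript
// Hypothetical helper (not from the original lesson): converts an agent config
// plus a user message into the argument object for a chat-completion call.
function buildChatRequest(agentConfig, userMessage) {
  return {
    model: agentConfig.model,
    temperature: agentConfig.temperature,
    messages: [
      { role: 'system', content: agentConfig.systemPrompt },
      { role: 'user', content: userMessage },
    ],
  };
}

// Trimmed-down config in the same shape as the SPECIALIZED_AGENTS entries.
const mathAgent = {
  systemPrompt: 'You are a math expert. Solve problems step by step.',
  model: 'gpt-4o',
  temperature: 0,
  tools: ['calculator', 'wolfram_alpha'],
};

const request = buildChatRequest(mathAgent, 'What is 12 * 7?');
console.log(request.temperature);      // 0
console.log(request.messages[0].role); // system
```

The resulting object has the shape that `openai.chat.completions.create` accepts, so one dispatcher could serve every agent instead of one hand-written function per intent.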


5. Fallback Routing

Fallback handling is critical. When the Router cannot confidently classify a request, it should degrade gracefully.

Three-tier fallback strategy

async function routeWithFallback(message) {
  const classification = await hybridRoute(message);

  // Look up the specialist once; it is undefined if the classifier returned an unregistered intent
  const specialist = SPECIALIZED_AGENTS[classification.intent];

  // Tier 1: High confidence, route to specialist
  if (specialist && classification.confidence >= 0.8) {
    console.log(`Tier 1: Routing to ${classification.intent} (confidence: ${classification.confidence})`);
    return specialist;
  }

  // Tier 2: Medium confidence, route to specialist but add a disclaimer
  if (specialist && classification.confidence >= 0.5) {
    console.log(`Tier 2: Routing to ${classification.intent} with disclaimer (confidence: ${classification.confidence})`);
    // Shallow copy so the shared SPECIALIZED_AGENTS entry is not mutated
    const agent = { ...specialist };
    agent.systemPrompt += '\n\nNote: The routing was uncertain. If this request seems outside your specialty, say so and suggest the user rephrase.';
    return agent;
  }

  // Tier 3: Low confidence or unknown intent, use general agent
  console.log(`Tier 3: Fallback to general agent (confidence: ${classification.confidence})`);
  return SPECIALIZED_AGENTS.general;
}

Asking for clarification

When confidence is very low, the Router can ask the user for clarification instead of guessing:

async function routeOrClarify(message) {
  const classification = await hybridRoute(message);

  if (classification.confidence < 0.4) {
    return {
      action: 'clarify',
      response: `I want to help, but I'm not sure what type of assistance you need. Could you clarify? I can help with:
- **Code** — writing, debugging, reviewing code
- **Data analysis** — analyzing datasets, creating charts
- **Creative writing** — stories, blog posts, marketing copy
- **Math** — calculations, equations, proofs
- **Search** — finding current information
- **Summarization** — condensing long text

Which of these best fits your request?`,
    };
  }

  return {
    action: 'route',
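    // Fall back to the general agent if the intent has no registered specialist
    agent: SPECIALIZED_AGENTS[classification.intent] ?? SPECIALIZED_AGENTS.general,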
    classification,
  };
}

6. Full Implementation: Router System

import OpenAI from 'openai';

const openai = new OpenAI();

// ─────────────────────────────────────────────────────────
// ROUTER AGENT
// ─────────────────────────────────────────────────────────

const ROUTER_PROMPT = `You are a Request Router. Classify the user's message into exactly one category.

CATEGORIES:
- code_help: Writing, debugging, or understanding code
- data_analysis: Analyzing data, charts, statistics
- creative_writing: Creative content (stories, poems, marketing)
- math: Mathematical calculations or proofs
- search: Questions needing current/external information
- summarize: Summarizing text or documents
- general: Anything else

Return JSON: { "intent": "...", "confidence": 0.0-1.0, "reasoning": "..." }`;

async function classifyIntent(message) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini', // Use a smaller, faster model for routing
    temperature: 0,
    response_format: { type: 'json_object' },
    messages: [
      { role: 'system', content: ROUTER_PROMPT },
      { role: 'user', content: message },
    ],
  });

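  try {
    return JSON.parse(response.choices[0].message.content);
  } catch {
    // Malformed JSON from the model: degrade to the general handler instead of throwing
    return { intent: 'general', confidence: 0, reasoning: 'Router returned invalid JSON' };
  }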
}

// ─────────────────────────────────────────────────────────
// SPECIALIZED HANDLERS
// ─────────────────────────────────────────────────────────

async function handleCodeHelp(message) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0.2,
    messages: [
      {
        role: 'system',
        content: 'You are an expert software engineer. Write clean, well-commented code. Explain your reasoning. Include error handling and suggest tests.',
      },
      { role: 'user', content: message },
    ],
  });
  return response.choices[0].message.content;
}

async function handleDataAnalysis(message) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: 'You are a data analyst. Compute statistics, explain findings clearly, note limitations. Use tables and structured output.',
      },
      { role: 'user', content: message },
    ],
  });
  return response.choices[0].message.content;
}

async function handleCreativeWriting(message) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0.9,
    messages: [
      {
        role: 'system',
        content: 'You are a creative writer. Produce engaging, original content with vivid language and varied structure.',
      },
      { role: 'user', content: message },
    ],
  });
  return response.choices[0].message.content;
}

async function handleMath(message) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: 'You are a math expert. Solve problems step by step. Show all work. Verify your answer.',
      },
      { role: 'user', content: message },
    ],
  });
  return response.choices[0].message.content;
}

async function handleSearch(message) {
  // In production, this would call a search API first
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0.3,
    messages: [
      {
        role: 'system',
        content: 'You are a research assistant. Provide accurate, well-sourced information. Clearly note when information may not be current.',
      },
      { role: 'user', content: message },
    ],
  });
  return response.choices[0].message.content;
}

async function handleSummarize(message) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0.1,
    messages: [
      {
        role: 'system',
        content: 'You are a summarization expert. Produce clear, accurate summaries using bullet points. Capture the most important information.',
      },
      { role: 'user', content: message },
    ],
  });
  return response.choices[0].message.content;
}

async function handleGeneral(message) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0.5,
    messages: [
      {
        role: 'system',
        content: 'You are a helpful general assistant. Answer clearly and honestly. If unsure, say so.',
      },
      { role: 'user', content: message },
    ],
  });
  return response.choices[0].message.content;
}

// ─────────────────────────────────────────────────────────
// HANDLER REGISTRY
// ─────────────────────────────────────────────────────────

const handlers = {
  code_help: handleCodeHelp,
  data_analysis: handleDataAnalysis,
  creative_writing: handleCreativeWriting,
  math: handleMath,
  search: handleSearch,
  summarize: handleSummarize,
  general: handleGeneral,
};

// ─────────────────────────────────────────────────────────
// MAIN ROUTER
// ─────────────────────────────────────────────────────────

async function routeRequest(userMessage) {
  console.log('========================================');
  console.log('  ROUTER SYSTEM');
  console.log('========================================');
  console.log(`User: "${userMessage}"\n`);

  // Step 1: Classify intent
  console.log('--- Classification ---');
  const classification = await classifyIntent(userMessage);
  console.log(`  Intent: ${classification.intent}`);
  console.log(`  Confidence: ${classification.confidence}`);
  console.log(`  Reasoning: ${classification.reasoning}`);

  // Step 2: Select handler (fall back to general if the intent has no handler)
  const handlerName = Object.hasOwn(handlers, classification.intent) ? classification.intent : 'general';
  const handler = handlers[handlerName];

  // Step 3: Execute
  console.log(`\n--- Dispatching to ${handlerName} handler ---`);
  const result = await handler(userMessage);

  // Step 4: Return with metadata
  return {
    response: result,
    metadata: {
      intent: classification.intent,
      confidence: classification.confidence,
      handler: handlerName, // the handler that actually ran, which may be the fallback
      model: classification.intent === 'creative_writing' ? 'gpt-4o (temp 0.9)' : 'gpt-4o',
    },
  };
}

// ─────────────────────────────────────────────────────────
// EXAMPLES
// ─────────────────────────────────────────────────────────

// Test with different request types
const testMessages = [
  'Write a function that sorts an array of objects by a nested property',
  'What is the average temperature on Mars?',
  'Write me a haiku about JavaScript',
  'Calculate the compound interest on $10,000 at 5% over 10 years',
  'Summarize the key points of the Agile Manifesto',
  'Analyze this sales data: Jan $100k, Feb $120k, Mar $95k, Apr $150k',
];

for (const message of testMessages) {
  const result = await routeRequest(message);
  console.log(`\nResponse (first 200 chars): ${result.response.slice(0, 200)}...`);
  console.log(`Metadata: ${JSON.stringify(result.metadata)}`);
  console.log('\n' + '='.repeat(60) + '\n');
}

Expected output (abbreviated; exact confidence values, reasoning text, and responses will vary between runs)

========================================
  ROUTER SYSTEM
========================================
User: "Write a function that sorts an array of objects by a nested property"

--- Classification ---
  Intent: code_help
  Confidence: 0.97
  Reasoning: User is asking for help writing a specific function — a coding task

--- Dispatching to code_help handler ---

Response (first 200 chars): Here's a function that sorts an array of objects by a nested property...
Metadata: {"intent":"code_help","confidence":0.97,"handler":"code_help","model":"gpt-4o"}

============================================================

User: "Write me a haiku about JavaScript"

--- Classification ---
  Intent: creative_writing
  Confidence: 0.95
  Reasoning: User is requesting a creative piece (haiku) — a creative writing task

--- Dispatching to creative_writing handler ---
...

7. Architecture Diagram: Multi-Agent Router System

┌─────────────────────────────────────────────────────────────────────────────┐
│                    PRODUCTION ROUTER ARCHITECTURE                            │
│                                                                             │
│  ┌─────────────┐                                                            │
│  │  User Input  │                                                           │
│  └──────┬──────┘                                                            │
│         │                                                                   │
│         ▼                                                                   │
│  ┌──────────────────┐                                                       │
│  │  PRE-PROCESSING   │  Sanitize input, check rate limits,                  │
│  │                    │  detect language, check content policy               │
│  └────────┬─────────┘                                                       │
│           │                                                                 │
│           ▼                                                                 │
│  ┌──────────────────┐    ┌──────────────┐                                   │
│  │  KEYWORD ROUTER   │───►│ High confidence│──► Skip LLM router             │
│  │  (fast, cheap)    │    │ match?       │                                   │
│  └────────┬─────────┘    └──────┬───────┘                                   │
│           │ low confidence      │ no                                         │
│           ▼                     ▼                                            │
│  ┌──────────────────┐                                                       │
│  │  LLM ROUTER      │  gpt-4o-mini (fast, cheap model)                     │
│  │  (smart, slower)  │  temperature: 0                                      │
│  └────────┬─────────┘                                                       │
│           │                                                                 │
│           ▼                                                                 │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                      HANDLER REGISTRY                                │   │
│  │                                                                      │   │
│  │  ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐       │   │
│  │  │ Code Agent │ │ Data Agent │ │ Write Agent│ │ Math Agent │ ...   │   │
│  │  │ temp: 0.2  │ │ temp: 0    │ │ temp: 0.9  │ │ temp: 0    │       │   │
│  │  │ tools: [.. │ │ tools: [.. │ │ tools: []  │ │ tools: [.. │       │   │
│  │  └─────┬──────┘ └─────┬──────┘ └─────┬──────┘ └─────┬──────┘       │   │
│  │        │              │              │              │                │   │
│  │        ▼              ▼              ▼              ▼                │   │
│  │  ┌──────────────────────────────────────────────────────────────┐   │   │
│  │  │                     RESPONSE                                 │   │   │
│  │  └──────────────────────────────────────────────────────────────┘   │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│           │                                                                 │
│           ▼                                                                 │
│  ┌──────────────────┐                                                       │
│  │  POST-PROCESSING  │  Format output, add citations,                       │
│  │                    │  log metrics, check safety                           │
│  └────────┬─────────┘                                                       │
│           │                                                                 │
│           ▼                                                                 │
│  ┌──────────────────┐                                                       │
│  │  User Response    │  + metadata: { intent, confidence, handler, latency } │
│  └──────────────────┘                                                       │
└─────────────────────────────────────────────────────────────────────────────┘

8. Comparison with Function Calling (4.7)

The Router pattern may seem similar to function calling (4.7), but they solve different problems.

| Aspect        | Function Calling (4.7)                       | Router Pattern (4.16.d)                              |
|---------------|----------------------------------------------|------------------------------------------------------|
| What decides  | The LLM decides which tool to call           | The Router classifies intent, then dispatches        |
| Granularity   | Individual tool calls (get_weather, search)  | Entire conversation flows (code agent, math agent)   |
| System prompt | One system prompt for all tools              | Different system prompt per handler                  |
| Temperature   | One temperature setting                      | Different temperature per handler                    |
| Model         | One model for all tools                      | Can use different models per handler                 |
| Scope         | Tool use within a single conversation        | Routing to entirely different agent configurations   |
| When to use   | A single agent needs multiple tools          | Different request types need different agent configs |

They complement each other

Router Pattern                          Function Calling
      │                                       │
      ▼                                       ▼
┌──────────┐                           ┌──────────────┐
│ Router   │ classifies intent         │ Code Agent   │ uses tool calling
│          │─── "code_help" ──────────►│              │ to call run_code(),
│          │                           │              │ lint_code(), etc.
└──────────┘                           └──────────────┘

The Router decides WHICH agent handles the request.
Function calling decides WHICH tools the chosen agent uses.
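
The two levels can be sketched in code. In the example below, the intent (level 1, decided by the Router) selects an agent config, and that agent's tool definitions are attached to the request so the model can choose among them (level 2, function calling). `TOOL_DEFINITIONS`, `AGENT_CONFIGS`, and `buildAgentRequest` are hypothetical names, and the tool schemas are invented for illustration; only the overall `tools` entry shape (`type: 'function'` with a `function` object holding `name`, `description`, and JSON Schema `parameters`) follows the OpenAI Chat Completions format.

```javascript
// Hypothetical tool catalog: full definitions keyed by the short names used in agent configs.
const TOOL_DEFINITIONS = {
  run_code: {
    type: 'function',
    function: {
      name: 'run_code',
      description: 'Execute a code snippet and return its output',
      parameters: {
        type: 'object',
        properties: { source: { type: 'string', description: 'Code to run' } },
        required: ['source'],
      },
    },
  },
  lint_code: {
    type: 'function',
    function: {
      name: 'lint_code',
      description: 'Report style and correctness problems in a code snippet',
      parameters: {
        type: 'object',
        properties: { source: { type: 'string', description: 'Code to lint' } },
        required: ['source'],
      },
    },
  },
};

// Hypothetical, trimmed-down agent configs.
const AGENT_CONFIGS = {
  code_help: { systemPrompt: 'You are an expert software engineer.', tools: ['run_code', 'lint_code'] },
  creative_writing: { systemPrompt: 'You are a creative writer.', tools: [] },
};

function buildAgentRequest(intent, userMessage) {
  // Level 1 (Router): the classified intent picks the agent.
  const agent = AGENT_CONFIGS[intent];
  if (!agent) throw new Error(`No agent registered for intent "${intent}"`);

  const request = {
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: agent.systemPrompt },
      { role: 'user', content: userMessage },
    ],
  };

  // Level 2 (function calling): attach this agent's tools; the model picks which to call.
  const tools = agent.tools.map((name) => TOOL_DEFINITIONS[name]);
  if (tools.length > 0) request.tools = tools; // leave the field out when the agent has no tools
  return request;
}

const codeRequest = buildAgentRequest('code_help', 'Why does this loop never end?');
console.log(codeRequest.tools.map((tool) => tool.function.name)); // [ 'run_code', 'lint_code' ]
```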

9. Advanced: Dynamic Handler Registration

In a production system, you want to add new handlers without modifying the Router logic.

class RouterSystem {
  constructor() {
    this.handlers = new Map();
  }

  // Register a new handler dynamically
  registerHandler(intent, config) {
    this.handlers.set(intent, config);
    console.log(`Registered handler: ${intent}`);
  }

  // Build the router prompt dynamically from registered handlers.
  // Reading from the Map keeps one line per intent even if a handler is
  // re-registered, and "general" is listed exactly once as the catch-all.
  buildRouterPrompt() {
    const categoryLines = [...this.handlers.entries()]
      .filter(([intent]) => intent !== 'general')
      .map(([intent, config]) => `- ${intent}: ${config.description}`);

    return `You are a Request Router. Classify the user's message.

CATEGORIES:
${categoryLines.join('\n')}
- general: Anything that does not match the above categories

Return JSON: { "intent": "...", "confidence": 0.0-1.0, "reasoning": "..." }`;
  }

  async route(message) {
    const prompt = this.buildRouterPrompt();

    const response = await openai.chat.completions.create({
      model: 'gpt-4o-mini',
      temperature: 0,
      response_format: { type: 'json_object' },
      messages: [
        { role: 'system', content: prompt },
        { role: 'user', content: message },
      ],
    });

    const classification = JSON.parse(response.choices[0].message.content);
    const handler = this.handlers.get(classification.intent);

    if (!handler) {
      console.log(`No handler for intent "${classification.intent}", using general fallback`);
      return this.handlers.get('general')?.handle(message) || 'I can help with that. Could you provide more details?';
    }

    return handler.handle(message);
  }
}

// Usage
const router = new RouterSystem();

router.registerHandler('code_help', {
  description: 'Writing, debugging, or understanding code',
  handle: async (msg) => handleCodeHelp(msg),
});

router.registerHandler('data_analysis', {
  description: 'Analyzing data, creating charts, computing statistics',
  handle: async (msg) => handleDataAnalysis(msg),
});

router.registerHandler('creative_writing', {
  description: 'Creative content like stories, poems, or marketing copy',
  handle: async (msg) => handleCreativeWriting(msg),
});

router.registerHandler('general', {
  description: 'General questions and conversations',
  handle: async (msg) => handleGeneral(msg),
});

// New handler added later — no changes to Router logic
router.registerHandler('sql_help', {
  description: 'SQL query writing, optimization, and database schema design',
  handle: async (msg) => {
    const response = await openai.chat.completions.create({
      model: 'gpt-4o',
      temperature: 0,
      messages: [
        { role: 'system', content: 'You are a SQL expert. Write efficient queries, explain execution plans, optimize performance.' },
        { role: 'user', content: msg },
      ],
    });
    return response.choices[0].message.content;
  },
});

const result = await router.route('Write a SQL query to find the top 10 customers by revenue');
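
A router that calls the LLM directly is hard to unit-test. One option, sketched below, is to pass the classifier in as a constructor argument so tests can substitute a stub. `InjectableRouter` and `stubClassify` are hypothetical names for illustration, and the stub's regex rule stands in for the real model call.

```javascript
// Sketch (not from the original lesson): a router whose classifier is injected,
// so tests can run without calling an LLM.
class InjectableRouter {
  constructor(classify) {
    this.classify = classify; // async (message, intents) => ({ intent, confidence })
    this.handlers = new Map();
  }

  registerHandler(intent, config) {
    this.handlers.set(intent, config);
  }

  async route(message) {
    const { intent } = await this.classify(message, [...this.handlers.keys()]);
    // Use the matching handler, or the general handler if the intent is not registered.
    const handler = this.handlers.get(intent) ?? this.handlers.get('general');
    if (!handler) throw new Error(`No handler for "${intent}" and no general fallback`);
    return handler.handle(message);
  }
}

// Stub classifier standing in for the LLM call.
const stubClassify = async (message) => ({
  intent: /\bsql\b/i.test(message) ? 'sql_help' : 'unregistered_intent',
  confidence: 1,
});

const testRouter = new InjectableRouter(stubClassify);
testRouter.registerHandler('sql_help', { handle: async (msg) => `sql:${msg}` });
testRouter.registerHandler('general', { handle: async (msg) => `general:${msg}` });

testRouter.route('Write a SQL query').then(console.log); // sql:Write a SQL query
testRouter.route('Hello there').then(console.log);       // general:Hello there
```

In production the same class would receive a classifier that wraps the chat-completion call, so the dispatch and fallback logic is exercised identically in both settings.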

10. Monitoring and Analytics

The Router pattern gives you a natural place to add per-intent metrics:

async function routeWithMetrics(message) {
  const startTime = Date.now();

  // Classify
  const classification = await classifyIntent(message);
  const classificationTime = Date.now() - startTime;

  // Execute handler
  const handlerStartTime = Date.now();
  const handler = handlers[classification.intent] || handlers.general;
  const result = await handler(message);
  const handlerTime = Date.now() - handlerStartTime;

  // Log metrics
  const metrics = {
    timestamp: new Date().toISOString(),
    intent: classification.intent,
    confidence: classification.confidence,
    classificationTimeMs: classificationTime,
    handlerTimeMs: handlerTime,
    totalTimeMs: Date.now() - startTime,
    inputLength: message.length,
    outputLength: result.length,
  };

  console.log('Metrics:', JSON.stringify(metrics));

  // In production: send to monitoring (DataDog, CloudWatch, etc.)
  // await sendToMonitoring(metrics);

  return { result, metrics };
}

This gives you dashboards showing:

  • Intent distribution — what percentage of requests are code, data, creative, etc.
  • Confidence distribution — are many requests falling to the general fallback?
  • Latency by intent — which handlers are slow?
  • Volume by intent — which handlers need scaling?
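
The dashboard figures listed above can be derived from the logged records with a small aggregation. The sketch below assumes records shaped like the `metrics` object logged by `routeWithMetrics`; `summarizeByIntent` is a hypothetical helper and the sample records are made-up values for illustration.

```javascript
// Hypothetical aggregation (not from the original lesson) over metric records.
function summarizeByIntent(records) {
  const totals = new Map();
  for (const record of records) {
    const entry = totals.get(record.intent) ?? { count: 0, totalMs: 0, confidenceSum: 0 };
    entry.count += 1;
    entry.totalMs += record.totalTimeMs;
    entry.confidenceSum += record.confidence;
    totals.set(record.intent, entry);
  }

  const summary = {};
  for (const [intent, entry] of totals) {
    summary[intent] = {
      count: entry.count,                               // volume by intent
      share: entry.count / records.length,              // intent distribution
      avgTotalMs: entry.totalMs / entry.count,          // latency by intent
      avgConfidence: entry.confidenceSum / entry.count, // confidence by intent
    };
  }
  return summary;
}

// Made-up sample records for illustration only.
const sampleRecords = [
  { intent: 'code_help', confidence: 1.0, totalTimeMs: 1200 },
  { intent: 'code_help', confidence: 0.5, totalTimeMs: 800 },
  { intent: 'general', confidence: 0.5, totalTimeMs: 400 },
  { intent: 'math', confidence: 1.0, totalTimeMs: 600 },
];

console.log(summarizeByIntent(sampleRecords).code_help);
// { count: 2, share: 0.5, avgTotalMs: 1000, avgConfidence: 0.75 }
```

A rising `share` for `general` is one signal that a new specialized handler, or better category descriptions in the router prompt, may be worth adding.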

11. Key Takeaways

  1. The Router pattern enables specialization — each handler has its own system prompt, temperature, tools, and optionally model, producing better results than a one-size-fits-all agent.
  2. Intent classification is the core — use LLM classification for flexibility, keyword matching for speed, or a hybrid approach. Low-confidence classifications should fall back to a general handler or ask for clarification.
  3. Use a cheap, fast model for routing — the Router itself does not need GPT-4o. GPT-4o-mini (or similar) handles classification quickly and cheaply, saving the powerful model for the actual task.
  4. Dynamic handler registration makes the system extensible — adding a new capability means registering a new handler, not rewriting the Router.
  5. Router + function calling are complementary — the Router selects the agent; function calling selects the tools within that agent. They operate at different levels.
  6. Built-in metrics — the Router is a natural chokepoint for logging intent distribution, latency, and confidence, giving you per-capability observability.

Explain-It Challenge

  1. You have a chatbot that handles customer support. Currently it uses one system prompt for everything. Explain how the Router pattern would improve response quality, and design the intent categories.
  2. Your Router classifies "help me optimize my SQL query" as code_help instead of data_analysis. Both could work. How do you decide which is correct, and does it matter?
  3. Compare the Router pattern to a giant if/else chain that checks keywords. What are the trade-offs of each approach? When does the LLM-based Router justify its cost?
