Episode 4 — Generative AI Engineering / 4.8 — Streaming Responses

4.8.b — Progressive Rendering

In one sentence: Progressive rendering takes the raw stream of tokens from the LLM API and displays them in real-time in your UI — turning a blank screen into a responsive, "typing" experience using React state management, streaming patterns, and careful handling of markdown and formatting.

Navigation: <- 4.8.a — Streaming Tokens | 4.8.c — Improving UX in AI Applications ->


1. What Is Progressive Rendering?

Progressive rendering means displaying content as it becomes available rather than waiting for the complete response. In the context of AI applications, this means showing each token on screen the moment it arrives from the streaming API.

TRADITIONAL RENDERING:
  [Loading...]  →  [Loading...]  →  [Loading...]  →  [Full response appears at once]
  t=0s            t=3s              t=7s              t=10s


PROGRESSIVE RENDERING:
  [S]  →  [Str]  →  [Streaming]  →  [Streaming is]  →  [Streaming is great!]
  t=0.2s  t=0.3s    t=0.5s          t=0.7s              t=10s

The user perceives the application as fast and responsive because they see output within 200ms, even though the total response time is identical.

Why progressive rendering matters for AI UIs

  Without Progressive Rendering                With Progressive Rendering
  Users stare at a spinner for 5-15 seconds    Users see the first word in ~200ms
  Users don't know if the app is working       The "typing" effect confirms activity
  Long responses feel slow                     Long responses feel like a conversation
  Users can't start reading until done         Users read along as content appears
  High perceived latency                       Low perceived latency
  Users abandon after 3-5 seconds              Users stay engaged throughout

2. The Basic Pattern: Streaming State in React

The fundamental pattern for progressive rendering in React involves three pieces: a state variable for accumulated text, a streaming function that appends to that state, and a component that renders the current state.

Minimal streaming component

import { useState, useCallback } from 'react';

function ChatMessage() {
  const [response, setResponse] = useState('');
  const [isStreaming, setIsStreaming] = useState(false);

  const sendMessage = useCallback(async (userMessage) => {
    setResponse('');       // Clear previous response
    setIsStreaming(true);  // Show streaming indicator

    try {
      const res = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message: userMessage }),
      });

      const reader = res.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n\n');
        buffer = lines.pop();

        for (const line of lines) {
          if (!line.startsWith('data: ')) continue;
          const data = JSON.parse(line.slice(6));

          if (data.type === 'token') {
            // Append each token to the response
            setResponse(prev => prev + data.content);
          }
        }
      }
    } catch (error) {
      console.error('Streaming error:', error);
    } finally {
      setIsStreaming(false);
    }
  }, []);

  return (
    <div>
      <div className="response">
        {response}
        {isStreaming && <span className="cursor">|</span>}
      </div>
      <button onClick={() => sendMessage('Explain streaming in React')}>
        Send
      </button>
    </div>
  );
}

Key details:

  • setResponse(prev => prev + data.content) uses the functional updater to avoid stale closures
  • isStreaming drives the blinking cursor indicator
  • The finally block ensures isStreaming is always reset, even on errors
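The buffering rule in the read loop (split on blank lines, keep the trailing partial frame) can be pulled out into a pure function that is easy to unit-test. `parseSSEFrames` is a name invented for this sketch, not part of any library:

```javascript
// Hypothetical helper: split accumulated SSE text into complete frames,
// returning parsed payloads plus the partial frame to carry forward.
function parseSSEFrames(buffer) {
  const frames = buffer.split('\n\n');
  const rest = frames.pop(); // the last element may be incomplete
  const events = [];
  for (const frame of frames) {
    if (!frame.startsWith('data: ')) continue;
    try {
      events.push(JSON.parse(frame.slice(6)));
    } catch {
      // Skip malformed frames instead of crashing the stream
    }
  }
  return { events, rest };
}
```

Because the function is pure, you can verify that a frame split across two network chunks is reassembled correctly: feed the leftover `rest` back in with the next chunk, and the second half parses once its frame completes.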

3. Custom Hook: useStreamingChat

Extract the streaming logic into a reusable hook that any component can use:

// hooks/useStreamingChat.js
import { useState, useRef, useCallback } from 'react';

export function useStreamingChat(apiUrl = '/api/chat') {
  const [messages, setMessages] = useState([]);
  const [isStreaming, setIsStreaming] = useState(false);
  const [error, setError] = useState(null);
  const abortControllerRef = useRef(null);

  const sendMessage = useCallback(async (userMessage) => {
    // Add user message to conversation
    const userMsg = { role: 'user', content: userMessage };
    setMessages(prev => [...prev, userMsg]);
    setError(null);
    setIsStreaming(true);

    // Create abort controller for cancellation
    abortControllerRef.current = new AbortController();

    // Add placeholder for assistant response
    setMessages(prev => [...prev, { role: 'assistant', content: '' }]);

    try {
      const res = await fetch(apiUrl, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          messages: [...messages, userMsg].map(m => ({
            role: m.role,
            content: m.content,
          })),
        }),
        signal: abortControllerRef.current.signal,
      });

      if (!res.ok) {
        throw new Error(`Server error: ${res.status}`);
      }

      const reader = res.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n\n');
        buffer = lines.pop();

        for (const line of lines) {
          if (!line.startsWith('data: ')) continue;
          const data = JSON.parse(line.slice(6));

          if (data.type === 'token') {
            setMessages(prev => {
              const updated = [...prev];
              const lastMsg = updated[updated.length - 1];
              updated[updated.length - 1] = {
                ...lastMsg,
                content: lastMsg.content + data.content,
              };
              return updated;
            });
          }

          if (data.type === 'error') {
            setError(data.message);
          }
        }
      }
    } catch (err) {
      if (err.name !== 'AbortError') {
        setError(err.message);
        // Remove the empty assistant message on error
        setMessages(prev => {
          if (prev[prev.length - 1]?.content === '') {
            return prev.slice(0, -1);
          }
          return prev;
        });
      }
    } finally {
      setIsStreaming(false);
      abortControllerRef.current = null;
    }
  }, [messages, apiUrl]);

  const cancelStream = useCallback(() => {
    if (abortControllerRef.current) {
      abortControllerRef.current.abort();
      setIsStreaming(false);
    }
  }, []);

  const clearMessages = useCallback(() => {
    setMessages([]);
    setError(null);
  }, []);

  return {
    messages,
    isStreaming,
    error,
    sendMessage,
    cancelStream,
    clearMessages,
  };
}

Using the hook in a component

import { useState } from 'react';
import { useStreamingChat } from './hooks/useStreamingChat';

function ChatApp() {
  const { messages, isStreaming, error, sendMessage, cancelStream, clearMessages } =
    useStreamingChat('/api/chat');
  const [input, setInput] = useState('');

  const handleSubmit = (e) => {
    e.preventDefault();
    if (!input.trim() || isStreaming) return;
    sendMessage(input.trim());
    setInput('');
  };

  return (
    <div className="chat-app">
      <div className="message-list">
        {messages.map((msg, i) => (
          <div key={i} className={`message ${msg.role}`}>
            <strong>{msg.role === 'user' ? 'You' : 'Assistant'}:</strong>
            <div>{msg.content}</div>
            {/* Show cursor on the last assistant message while streaming */}
            {isStreaming && msg.role === 'assistant' && i === messages.length - 1 && (
              <span className="blinking-cursor">|</span>
            )}
          </div>
        ))}
      </div>

      {error && <div className="error">{error}</div>}

      <form onSubmit={handleSubmit}>
        <input
          type="text"
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Type a message..."
          disabled={isStreaming}
        />
        {isStreaming ? (
          <button type="button" onClick={cancelStream}>Stop</button>
        ) : (
          <button type="submit">Send</button>
        )}
      </form>
    </div>
  );
}

4. useEffect for Streaming Side Effects

When streaming changes external state (scrolling, focus, analytics), use useEffect to react to streaming updates:

Auto-scroll to bottom as tokens arrive

import { useEffect, useRef } from 'react';

function ChatMessages({ messages, isStreaming }) {
  const bottomRef = useRef(null);
  const containerRef = useRef(null);

  // Auto-scroll when new content arrives during streaming
  useEffect(() => {
    if (!isStreaming) return;

    // Only auto-scroll if user is already near the bottom
    const container = containerRef.current;
    if (!container) return;

    const isNearBottom =
      container.scrollHeight - container.scrollTop - container.clientHeight < 100;

    if (isNearBottom) {
      bottomRef.current?.scrollIntoView({ behavior: 'smooth' });
    }
  }, [messages, isStreaming]); // Re-run when messages update

  return (
    <div ref={containerRef} className="message-container" style={{ overflowY: 'auto' }}>
      {messages.map((msg, i) => (
        <div key={i} className={`message ${msg.role}`}>
          {msg.content}
        </div>
      ))}
      <div ref={bottomRef} />
    </div>
  );
}

Tracking streaming performance metrics

import { useEffect, useRef } from 'react';

function useStreamingMetrics(isStreaming, messages) {
  const metricsRef = useRef({
    startTime: null,
    firstTokenTime: null,
    tokenCount: 0,
  });

  useEffect(() => {
    if (isStreaming && !metricsRef.current.startTime) {
      metricsRef.current = {
        startTime: Date.now(),
        firstTokenTime: null,
        tokenCount: 0,
      };
    }
  }, [isStreaming]);

  useEffect(() => {
    if (!isStreaming || messages.length === 0) return;

    const lastMessage = messages[messages.length - 1];
    if (lastMessage.role !== 'assistant') return;

    const metrics = metricsRef.current;

    // Record first token time
    if (lastMessage.content.length > 0 && !metrics.firstTokenTime) {
      metrics.firstTokenTime = Date.now();
      console.log(`Time to first token: ${metrics.firstTokenTime - metrics.startTime}ms`);
    }

    // Rough token count (for display purposes)
    metrics.tokenCount = lastMessage.content.split(/\s+/).length;
  }, [messages, isStreaming]);

  useEffect(() => {
    // When streaming stops, log final metrics
    if (!isStreaming && metricsRef.current.startTime) {
      const metrics = metricsRef.current;
      const totalTime = Date.now() - metrics.startTime;
      console.log('Streaming metrics:', {
        timeToFirstToken: metrics.firstTokenTime
          ? metrics.firstTokenTime - metrics.startTime
          : null,
        totalTime,
        approximateTokens: metrics.tokenCount,
        tokensPerSecond: (metrics.tokenCount / (totalTime / 1000)).toFixed(1),
      });
      metricsRef.current = { startTime: null, firstTokenTime: null, tokenCount: 0 };
    }
  }, [isStreaming]);
}

5. Chat-like Progressive Display

Building a proper chat interface requires handling the full conversation lifecycle:

// components/ChatBubble.jsx
function ChatBubble({ message, isLastAssistant, isStreaming }) {
  const isUser = message.role === 'user';

  return (
    <div className={`chat-bubble ${isUser ? 'user-bubble' : 'assistant-bubble'}`}>
      {/* Avatar */}
      <div className="avatar">
        {isUser ? 'You' : 'AI'}
      </div>

      {/* Message content */}
      <div className="bubble-content">
        {message.content || (
          // Empty content during the moment between request and first token
          isLastAssistant && isStreaming && (
            <span className="thinking-dots">
              <span>.</span><span>.</span><span>.</span>
            </span>
          )
        )}
        {/* Blinking cursor while streaming */}
        {isLastAssistant && isStreaming && message.content && (
          <span className="streaming-cursor" aria-hidden="true">|</span>
        )}
      </div>

      {/* Timestamp */}
      <div className="timestamp">
        {new Date(message.timestamp || Date.now()).toLocaleTimeString()}
      </div>
    </div>
  );
}

// components/ChatWindow.jsx
function ChatWindow() {
  const { messages, isStreaming, sendMessage, cancelStream } = useStreamingChat();

  return (
    <div className="chat-window">
      <div className="messages">
        {messages.map((msg, i) => (
          <ChatBubble
            key={i}
            message={msg}
            isLastAssistant={
              msg.role === 'assistant' && i === messages.length - 1
            }
            isStreaming={isStreaming}
          />
        ))}
      </div>

      {/* Streaming status bar */}
      {isStreaming && (
        <div className="status-bar">
          <span className="pulse-dot" />
          <span>AI is responding...</span>
          <button onClick={cancelStream} className="stop-btn">
            Stop generating
          </button>
        </div>
      )}
    </div>
  );
}

CSS for the chat UI streaming effects

/* Blinking cursor effect */
.streaming-cursor {
  display: inline;
  animation: blink 0.8s step-end infinite;
  font-weight: bold;
  color: #3b82f6;
}

@keyframes blink {
  0%, 100% { opacity: 1; }
  50% { opacity: 0; }
}

/* Thinking dots animation */
.thinking-dots span {
  animation: dot-pulse 1.4s infinite;
  opacity: 0;
}
.thinking-dots span:nth-child(1) { animation-delay: 0s; }
.thinking-dots span:nth-child(2) { animation-delay: 0.2s; }
.thinking-dots span:nth-child(3) { animation-delay: 0.4s; }

@keyframes dot-pulse {
  0%, 80%, 100% { opacity: 0; }
  40% { opacity: 1; }
}

/* Pulse dot for status bar */
.pulse-dot {
  width: 8px;
  height: 8px;
  background: #22c55e;
  border-radius: 50%;
  display: inline-block;
  animation: pulse 1.5s ease-in-out infinite;
}

@keyframes pulse {
  0%, 100% { transform: scale(1); opacity: 1; }
  50% { transform: scale(1.5); opacity: 0.5; }
}

/* Chat bubbles */
.chat-bubble {
  display: flex;
  gap: 12px;
  padding: 12px 16px;
  max-width: 80%;
}

.user-bubble {
  align-self: flex-end;
  background: #3b82f6;
  color: white;
  border-radius: 16px 16px 4px 16px;
}

.assistant-bubble {
  align-self: flex-start;
  background: #f3f4f6;
  color: #1f2937;
  border-radius: 16px 16px 16px 4px;
}

/* Stop button */
.stop-btn {
  background: #ef4444;
  color: white;
  border: none;
  border-radius: 6px;
  padding: 4px 12px;
  cursor: pointer;
  font-size: 13px;
}

.stop-btn:hover {
  background: #dc2626;
}

6. Typewriter Effect

A typewriter effect adds characters one at a time with a slight delay, creating a more deliberate "typing" feel than raw streaming (which can be bursty):

import { useState, useEffect, useRef } from 'react';

// Hook that converts bursty streaming into smooth character-by-character display
function useTypewriter(streamedText, charsPerTick = 3, tickMs = 30) {
  const [displayedText, setDisplayedText] = useState('');
  const indexRef = useRef(0);

  useEffect(() => {
    // If streamed text is ahead of displayed text, start the typewriter
    if (indexRef.current >= streamedText.length) return;

    const interval = setInterval(() => {
      const nextIndex = Math.min(
        indexRef.current + charsPerTick,
        streamedText.length
      );

      setDisplayedText(streamedText.slice(0, nextIndex));
      indexRef.current = nextIndex;

      // Catch up: if streamed text is very far ahead, jump ahead
      if (streamedText.length - nextIndex > 200) {
        indexRef.current = streamedText.length - 100;
      }

      // Stop interval when caught up
      if (nextIndex >= streamedText.length) {
        clearInterval(interval);
      }
    }, tickMs);

    return () => clearInterval(interval);
  }, [streamedText, charsPerTick, tickMs]);

  // Reset when text is cleared
  useEffect(() => {
    if (streamedText === '') {
      setDisplayedText('');
      indexRef.current = 0;
    }
  }, [streamedText]);

  const isTyping = indexRef.current < streamedText.length;

  return { displayedText, isTyping };
}

// Usage in a component
function TypewriterMessage({ streamedContent, isStreaming }) {
  const { displayedText, isTyping } = useTypewriter(streamedContent);

  return (
    <div className="typewriter-message">
      {displayedText}
      {(isTyping || isStreaming) && (
        <span className="streaming-cursor">|</span>
      )}
    </div>
  );
}

When to use typewriter vs raw streaming

  Raw streaming (display every token immediately)
    Best for: technical users, code output, fast connections
    Trade-off: can feel "bursty" — tokens arrive in uneven chunks

  Typewriter (smooth character reveal)
    Best for: consumer-facing chat, marketing sites
    Trade-off: adds artificial delay, can fall behind on long responses

  Hybrid (typewriter with catch-up)
    Best for: most production apps
    Trade-off: best balance — smooth when slow, catches up when fast
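The hybrid approach's pacing rule can be expressed as a pure function, independent of React state. `nextTypewriterIndex` is an invented name for this sketch; the tick size and lag threshold are illustrative:

```javascript
// Advance a few characters per tick; if the stream has raced too far
// ahead of the display, jump forward but leave a tail to animate.
function nextTypewriterIndex(current, targetLength, charsPerTick = 3, maxLag = 200) {
  let next = Math.min(current + charsPerTick, targetLength);
  if (targetLength - next > maxLag) {
    next = targetLength - Math.floor(maxLag / 2); // catch up, keep some motion
  }
  return next;
}

nextTypewriterIndex(0, 10);   // → 3   (normal tick)
nextTypewriterIndex(9, 10);   // → 10  (clamped at the end)
nextTypewriterIndex(0, 1000); // → 900 (far behind: jump ahead)
```

Keeping the pacing rule pure means you can tune the feel (tick size, lag threshold) with plain unit tests instead of eyeballing the UI.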

7. Handling Markdown in Streaming Text

LLMs often respond with markdown (headers, bold text, code blocks, lists). Rendering markdown during streaming is tricky because a partially streamed response often parses differently from the finished text, so the display reflows as markers are closed.

The problem

Token 1: "Here are **three"
Token 2: " benefits"
Token 3: "**:\n\n1."
Token 4: " First benefit"

After token 1: "Here are **three"       → Unclosed bold tag
After token 2: "Here are **three benefits" → Still unclosed
After token 3: "Here are **three benefits**:\n\n1." → Now valid!

If you render markdown after every token, the display will flash and reformat as incomplete syntax becomes complete.
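To see the flicker concretely, you can track whether bold markers are balanced after each token. `hasUnclosedBold` is a rough sketch invented here, not a real markdown parser (it ignores escapes and code spans):

```javascript
// Naive check: an odd number of ** markers means an unclosed bold span.
function hasUnclosedBold(text) {
  const markers = (text.match(/\*\*/g) || []).length;
  return markers % 2 !== 0;
}

hasUnclosedBold('Here are **three');              // → true  (renders oddly)
hasUnclosedBold('Here are **three benefits**:');  // → false (renders correctly)
```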

Solution 1: Use a markdown library that handles incomplete input

import ReactMarkdown from 'react-markdown';
import remarkGfm from 'remark-gfm';

function StreamingMarkdown({ content, isStreaming }) {
  return (
    <div className="markdown-content">
      <ReactMarkdown
        remarkPlugins={[remarkGfm]}
        // react-markdown handles incomplete markdown gracefully —
        // unclosed bold/italic just renders as text until closed
      >
        {content}
      </ReactMarkdown>
      {isStreaming && <span className="streaming-cursor">|</span>}
    </div>
  );
}

Solution 2: Debounced markdown rendering

Only re-render the markdown at intervals, not on every token:

import { useState, useEffect, useRef } from 'react';
import ReactMarkdown from 'react-markdown';

function DebouncedMarkdown({ streamedText, isStreaming, debounceMs = 100 }) {
  const [renderedText, setRenderedText] = useState('');
  const timeoutRef = useRef(null);

  useEffect(() => {
    // While streaming, debounce markdown rendering
    if (isStreaming) {
      clearTimeout(timeoutRef.current);
      timeoutRef.current = setTimeout(() => {
        setRenderedText(streamedText);
      }, debounceMs);
    } else {
      // When streaming stops, render immediately
      clearTimeout(timeoutRef.current);
      setRenderedText(streamedText);
    }

    return () => clearTimeout(timeoutRef.current);
  }, [streamedText, isStreaming, debounceMs]);

  return (
    <div className="markdown-content">
      <ReactMarkdown>{renderedText}</ReactMarkdown>
    </div>
  );
}

Solution 3: Split rendering strategy — raw text while streaming, markdown after

import ReactMarkdown from 'react-markdown';
import remarkGfm from 'remark-gfm';

function SmartMarkdown({ content, isStreaming }) {
  if (isStreaming) {
    // While streaming: render as plain text with basic formatting
    return (
      <div className="streaming-text">
        <pre style={{ whiteSpace: 'pre-wrap', fontFamily: 'inherit' }}>
          {content}
          <span className="streaming-cursor">|</span>
        </pre>
      </div>
    );
  }

  // After streaming: render full markdown
  return (
    <div className="markdown-content">
      <ReactMarkdown remarkPlugins={[remarkGfm]}>
        {content}
      </ReactMarkdown>
    </div>
  );
}

Solution 4: Code block detection during streaming

Code blocks are especially problematic because partial code blocks look broken. Detect and handle them:

import ReactMarkdown from 'react-markdown';
import remarkGfm from 'remark-gfm';

function StreamingWithCodeBlocks({ content, isStreaming }) {
  // Check if we're inside an unclosed code block
  const codeBlockCount = (content.match(/```/g) || []).length;
  const isInsideCodeBlock = codeBlockCount % 2 !== 0;

  // If inside an unclosed code block, close it temporarily for rendering
  const renderContent = isInsideCodeBlock
    ? content + '\n```'  // Temporarily close for valid markdown
    : content;

  return (
    <div className="markdown-content">
      <ReactMarkdown remarkPlugins={[remarkGfm]}>
        {renderContent}
      </ReactMarkdown>
      {isStreaming && <span className="streaming-cursor">|</span>}
    </div>
  );
}
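The fence-counting rule above can also live in a standalone helper, which keeps the component simple and makes the rule unit-testable. `closeOpenFences` is an invented name; the fence string is built with `repeat` only so the example displays cleanly:

```javascript
const FENCE = '`'.repeat(3); // the triple-backtick marker

// If the text contains an odd number of fences, append a closing one so
// the markdown renderer sees a valid (temporarily closed) code block.
function closeOpenFences(text) {
  const count = text.split(FENCE).length - 1;
  return count % 2 !== 0 ? text + '\n' + FENCE : text;
}
```

With the helper in place, `renderContent` in the component above becomes `closeOpenFences(content)`.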

8. Next.js Streaming Patterns

Next.js has built-in support for streaming responses, making it ideal for AI applications.

Route Handler with streaming (App Router)

// app/api/chat/route.js
import OpenAI from 'openai';

const openai = new OpenAI();

export async function POST(request) {
  const { messages } = await request.json();

  const stream = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      ...messages,
    ],
    stream: true,
  });

  // Create a ReadableStream that forwards the OpenAI stream
  const encoder = new TextEncoder();

  const readableStream = new ReadableStream({
    async start(controller) {
      try {
        for await (const chunk of stream) {
          const content = chunk.choices[0]?.delta?.content;
          if (content) {
            // Send as SSE format
            const data = `data: ${JSON.stringify({ content })}\n\n`;
            controller.enqueue(encoder.encode(data));
          }
        }
        controller.enqueue(encoder.encode('data: [DONE]\n\n'));
        controller.close();
      } catch (error) {
        controller.error(error);
      }
    },
  });

  return new Response(readableStream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    },
  });
}

Client component consuming the stream

// app/components/Chat.jsx
'use client';

import { useState, useCallback } from 'react';

export default function Chat() {
  const [messages, setMessages] = useState([]);
  const [input, setInput] = useState('');
  const [isStreaming, setIsStreaming] = useState(false);

  const handleSubmit = useCallback(async (e) => {
    e.preventDefault();
    if (!input.trim() || isStreaming) return;

    const userMessage = { role: 'user', content: input.trim() };
    const updatedMessages = [...messages, userMessage];
    setMessages([...updatedMessages, { role: 'assistant', content: '' }]);
    setInput('');
    setIsStreaming(true);

    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ messages: updatedMessages }),
      });

      const reader = response.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n\n');
        buffer = lines.pop();

        for (const line of lines) {
          if (!line.startsWith('data: ')) continue;
          const raw = line.slice(6);
          if (raw === '[DONE]') break;

          const { content } = JSON.parse(raw);
          if (content) {
            setMessages(prev => {
              const updated = [...prev];
              const last = updated[updated.length - 1];
              updated[updated.length - 1] = {
                ...last,
                content: last.content + content,
              };
              return updated;
            });
          }
        }
      }
    } catch (error) {
      console.error('Stream error:', error);
    } finally {
      setIsStreaming(false);
    }
  }, [input, messages, isStreaming]);

  return (
    <div className="chat-container">
      <div className="messages">
        {messages.map((msg, i) => (
          <div key={i} className={`message ${msg.role}`}>
            <strong>{msg.role}:</strong> {msg.content}
            {isStreaming && msg.role === 'assistant' && i === messages.length - 1 && (
              <span className="streaming-cursor">|</span>
            )}
          </div>
        ))}
      </div>

      <form onSubmit={handleSubmit}>
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Ask something..."
          disabled={isStreaming}
        />
        <button type="submit" disabled={isStreaming}>
          {isStreaming ? 'Streaming...' : 'Send'}
        </button>
      </form>
    </div>
  );
}

Using the Vercel AI SDK (streamlined approach)

The Vercel AI SDK abstracts away the streaming boilerplate:

// app/api/chat/route.js — using Vercel AI SDK
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';

export async function POST(request) {
  const { messages } = await request.json();

  const result = streamText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
  });

  return result.toDataStreamResponse();
}

// app/components/Chat.jsx — using useChat hook
'use client';

import { useChat } from 'ai/react';

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading, stop } =
    useChat();

  return (
    <div>
      <div className="messages">
        {messages.map((msg) => (
          <div key={msg.id} className={msg.role}>
            <strong>{msg.role}:</strong> {msg.content}
          </div>
        ))}
      </div>

      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
        {isLoading ? (
          <button type="button" onClick={stop}>Stop</button>
        ) : (
          <button type="submit">Send</button>
        )}
      </form>
    </div>
  );
}

This reduces a full streaming chat implementation to ~30 lines of code.


9. Performance Optimization for Progressive Rendering

Rendering every single token triggers a React re-render. For fast streams (50+ tokens/second), this can cause performance issues.
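A back-of-envelope estimate shows what batching buys. The figures below are illustrative assumptions, not measurements:

```javascript
// Rough estimate of React renders with per-token setState vs. batching
// updates to animation frames. All numbers are hypothetical.
function estimateRenders(totalTokens, tokensPerSecond, fps = 60) {
  const durationSec = totalTokens / tokensPerSecond;
  const perToken = totalTokens;                          // one render per token
  const batched = Math.ceil(Math.min(tokensPerSecond, fps) * durationSec);
  return { perToken, batched };
}

estimateRenders(2000, 80); // → { perToken: 2000, batched: 1500 }
```

At 80 tokens/second the batched path renders at most once per frame, cutting renders by a quarter in this example; the gap widens as token rates climb.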

Batching state updates

import { useRef, useState, useCallback } from 'react';

function useBufferedStream() {
  const [displayText, setDisplayText] = useState('');
  const bufferRef = useRef('');
  const rafRef = useRef(null);

  const appendToken = useCallback((token) => {
    bufferRef.current += token;

    // Use requestAnimationFrame to batch updates to ~60fps
    if (!rafRef.current) {
      rafRef.current = requestAnimationFrame(() => {
        const chunk = bufferRef.current; // capture before clearing: the
        bufferRef.current = '';          // functional updater runs later
        rafRef.current = null;
        setDisplayText(prev => prev + chunk);
      });
    }
  }, []);

  const reset = useCallback(() => {
    setDisplayText('');
    bufferRef.current = '';
    if (rafRef.current) {
      cancelAnimationFrame(rafRef.current);
      rafRef.current = null;
    }
  }, []);

  return { displayText, appendToken, reset };
}

Virtualized message list for long conversations

import { useState, useRef } from 'react';

// For conversations with 100+ messages, only render visible ones
function VirtualizedMessages({ messages }) {
  const containerRef = useRef(null);
  // A scroll handler would call setVisibleRange; omitted here for brevity
  const [visibleRange, setVisibleRange] = useState({ start: 0, end: 20 });

  // Always show the last few messages (where streaming happens)
  const minVisible = Math.max(0, messages.length - 20);
  const start = Math.min(visibleRange.start, minVisible);
  const end = messages.length;

  const visibleMessages = messages.slice(start, end);

  return (
    <div ref={containerRef} className="message-container" style={{ overflowY: 'auto' }}>
      {/* Spacer for messages above the visible range */}
      {start > 0 && (
        <div style={{ height: start * 80 }} /> // Estimate 80px per message
      )}

      {visibleMessages.map((msg, i) => (
        <div key={start + i} className={`message ${msg.role}`}>
          {msg.content}
        </div>
      ))}
    </div>
  );
}

10. Accessibility in Streaming UIs

Streaming text updates create accessibility challenges. Screen readers need to be informed about dynamic content:

function AccessibleStreamingMessage({ content, isStreaming }) {
  return (
    <div
      role="log"
      aria-live="polite"       // Announce new content without interrupting
      aria-atomic="false"      // Only announce NEW additions, not the full text
      aria-busy={isStreaming}  // Indicates content is still loading
      aria-label="AI response"
    >
      {content}
      {isStreaming && (
        <span className="sr-only">
          AI is still generating a response...
        </span>
      )}
    </div>
  );
}

/* CSS for screen-reader-only text */
.sr-only {
  position: absolute;
  width: 1px;
  height: 1px;
  overflow: hidden;
  clip: rect(0, 0, 0, 0);
  white-space: nowrap;
}

11. Key Takeaways

  1. Progressive rendering shows tokens as they arrive — users see content in ~200ms instead of waiting 5-15 seconds.
  2. Use setResponse(prev => prev + token) with React's functional state updater to avoid stale closure bugs.
  3. Extract streaming logic into a custom hook (useStreamingChat) for reuse across components.
  4. Handle markdown carefully — incomplete markdown during streaming can cause flickering; use debounced rendering or temporary tag closing.
  5. The Vercel AI SDK reduces streaming chat to ~30 lines with useChat and streamText.
  6. Batch React state updates with requestAnimationFrame to maintain 60fps during fast token streams.
  7. Don't forget accessibility — use aria-live, aria-busy, and screen-reader hints for dynamic streaming content.

Explain-It Challenge

  1. A designer asks "why does the text flicker when the AI writes bold text?" Explain the incomplete markdown problem and how you would fix it.
  2. Your streaming chat works but becomes sluggish after 50+ messages in the conversation. What causes this and what would you change?
  3. A QA engineer reports that the streaming cursor stays visible after the response finishes. Walk through the state management issue and the fix.

Navigation: <- 4.8.a — Streaming Tokens | 4.8.c — Improving UX in AI Applications ->