Episode 4 — Generative AI Engineering / 4.8 — Streaming Responses
4.8.b — Progressive Rendering
In one sentence: Progressive rendering takes the raw stream of tokens from the LLM API and displays them in real time in your UI — turning a blank screen into a responsive, "typing" experience using React state management, streaming patterns, and careful handling of markdown and formatting.
Navigation: <- 4.8.a — Streaming Tokens | 4.8.c — Improving UX in AI Applications ->
1. What Is Progressive Rendering?
Progressive rendering means displaying content as it becomes available rather than waiting for the complete response. In the context of AI applications, this means showing each token on screen the moment it arrives from the streaming API.
TRADITIONAL RENDERING:
[Loading...] → [Loading...] → [Loading...] → [Full response appears at once]
t=0s t=3s t=7s t=10s
PROGRESSIVE RENDERING:
[S] → [Str] → [Streaming] → [Streaming is] → [Streaming is great!]
t=0.2s t=0.3s t=0.5s t=0.7s t=10s
The user perceives the application as fast and responsive because they see output within 200ms, even though the total response time is identical.
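The difference can be sketched as plain data: traditional rendering produces one display frame at the very end, while progressive rendering produces one frame per token. This is an illustrative sketch only; the two helper names are invented here.

```javascript
// Hypothetical helper: given the tokens of a response, compute the
// sequence of display states a progressively rendered UI would show.
function progressiveFrames(tokens) {
  const frames = [];
  let accumulated = '';
  for (const token of tokens) {
    accumulated += token;
    frames.push(accumulated); // Every intermediate state is visible
  }
  return frames;
}

// Traditional rendering shows nothing until the full text is ready.
function traditionalFrames(tokens) {
  return [tokens.join('')];
}

console.log(progressiveFrames(['S', 'tr', 'eaming', ' is', ' great!']));
// → ['S', 'Str', 'Streaming', 'Streaming is', 'Streaming is great!']
console.log(traditionalFrames(['S', 'tr', 'eaming', ' is', ' great!']));
// → ['Streaming is great!']
```

The total work is the same in both cases; only the number of visible intermediate states differs, which is exactly what changes the perceived latency.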
Why progressive rendering matters for AI UIs
| Without Progressive Rendering | With Progressive Rendering |
|---|---|
| Users stare at a spinner for 5-15 seconds | Users see the first word in ~200ms |
| Users don't know if the app is working | The "typing" effect confirms activity |
| Long responses feel slow | Long responses feel like a conversation |
| Users can't start reading until done | Users read along as content appears |
| High perceived latency | Low perceived latency |
| Users abandon after 3-5 seconds | Users stay engaged throughout |
2. The Basic Pattern: Streaming State in React
The fundamental pattern for progressive rendering in React involves three pieces: a state variable for accumulated text, a streaming function that appends to that state, and a component that renders the current state.
Minimal streaming component
import { useState, useCallback } from 'react';
function ChatMessage() {
const [response, setResponse] = useState('');
const [isStreaming, setIsStreaming] = useState(false);
const sendMessage = useCallback(async (userMessage) => {
setResponse(''); // Clear previous response
setIsStreaming(true); // Show streaming indicator
try {
const res = await fetch('/api/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ message: userMessage }),
});
const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n\n');
buffer = lines.pop();
for (const line of lines) {
if (!line.startsWith('data: ')) continue;
const data = JSON.parse(line.slice(6));
if (data.type === 'token') {
// Append each token to the response
setResponse(prev => prev + data.content);
}
}
}
} catch (error) {
console.error('Streaming error:', error);
} finally {
setIsStreaming(false);
}
}, []);
return (
<div>
<div className="response">
{response}
{isStreaming && <span className="cursor">|</span>}
</div>
<button onClick={() => sendMessage('Explain streaming in React')}>
Send
</button>
</div>
);
}
Key details:
- `setResponse(prev => prev + data.content)` uses the functional updater to avoid stale closures
- `isStreaming` drives the blinking cursor indicator
- The `finally` block ensures `isStreaming` is always reset, even on errors
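The buffer-splitting logic inside the `while` loop is easy to get subtly wrong (a network chunk can end mid-event), so it can help to pull it out into a pure function that is unit-testable in isolation. This is a sketch; `parseSSEBuffer` is a name invented here, matching the `data: {"type":"token",...}` event shape used above.

```javascript
// Hypothetical helper: split an SSE buffer into complete events plus the
// trailing partial event that must be carried over to the next chunk.
function parseSSEBuffer(buffer) {
  const parts = buffer.split('\n\n');
  const remainder = parts.pop(); // Possibly incomplete — keep for next chunk
  const tokens = [];
  for (const part of parts) {
    if (!part.startsWith('data: ')) continue;
    const data = JSON.parse(part.slice(6));
    if (data.type === 'token') tokens.push(data.content);
  }
  return { tokens, remainder };
}

// A chunk boundary can land in the middle of an event; the remainder
// preserves the partial event for the next decode pass.
const { tokens, remainder } = parseSSEBuffer(
  'data: {"type":"token","content":"Hel"}\n\ndata: {"type":"to'
);
// tokens → ['Hel'], remainder → 'data: {"type":"to'
```

The component then only needs to concatenate `remainder` with the next decoded chunk before calling the parser again.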
3. Custom Hook: useStreamingChat
Extract the streaming logic into a reusable hook that any component can use:
// hooks/useStreamingChat.js
import { useState, useRef, useCallback } from 'react';
export function useStreamingChat(apiUrl = '/api/chat') {
const [messages, setMessages] = useState([]);
const [isStreaming, setIsStreaming] = useState(false);
const [error, setError] = useState(null);
const abortControllerRef = useRef(null);
const sendMessage = useCallback(async (userMessage) => {
// Add user message to conversation
const userMsg = { role: 'user', content: userMessage };
setMessages(prev => [...prev, userMsg]);
setError(null);
setIsStreaming(true);
// Create abort controller for cancellation
abortControllerRef.current = new AbortController();
// Add placeholder for assistant response
setMessages(prev => [...prev, { role: 'assistant', content: '' }]);
try {
const res = await fetch(apiUrl, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
messages: [...messages, userMsg].map(m => ({
role: m.role,
content: m.content,
})),
}),
signal: abortControllerRef.current.signal,
});
if (!res.ok) {
throw new Error(`Server error: ${res.status}`);
}
const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n\n');
buffer = lines.pop();
for (const line of lines) {
if (!line.startsWith('data: ')) continue;
const data = JSON.parse(line.slice(6));
if (data.type === 'token') {
setMessages(prev => {
const updated = [...prev];
const lastMsg = updated[updated.length - 1];
updated[updated.length - 1] = {
...lastMsg,
content: lastMsg.content + data.content,
};
return updated;
});
}
if (data.type === 'error') {
setError(data.message);
}
}
}
} catch (err) {
if (err.name !== 'AbortError') {
setError(err.message);
// Remove the empty assistant message on error
setMessages(prev => {
if (prev[prev.length - 1]?.content === '') {
return prev.slice(0, -1);
}
return prev;
});
}
} finally {
setIsStreaming(false);
abortControllerRef.current = null;
}
}, [messages, apiUrl]);
const cancelStream = useCallback(() => {
if (abortControllerRef.current) {
abortControllerRef.current.abort();
setIsStreaming(false);
}
}, []);
const clearMessages = useCallback(() => {
setMessages([]);
setError(null);
}, []);
return {
messages,
isStreaming,
error,
sendMessage,
cancelStream,
clearMessages,
};
}
Using the hook in a component
import { useState } from 'react';
import { useStreamingChat } from './hooks/useStreamingChat';
function ChatApp() {
const { messages, isStreaming, error, sendMessage, cancelStream, clearMessages } =
useStreamingChat('/api/chat');
const [input, setInput] = useState('');
const handleSubmit = (e) => {
e.preventDefault();
if (!input.trim() || isStreaming) return;
sendMessage(input.trim());
setInput('');
};
return (
<div className="chat-app">
<div className="message-list">
{messages.map((msg, i) => (
<div key={i} className={`message ${msg.role}`}>
<strong>{msg.role === 'user' ? 'You' : 'Assistant'}:</strong>
<div>{msg.content}</div>
{/* Show cursor on the last assistant message while streaming */}
{isStreaming && msg.role === 'assistant' && i === messages.length - 1 && (
<span className="blinking-cursor">|</span>
)}
</div>
))}
</div>
{error && <div className="error">{error}</div>}
<form onSubmit={handleSubmit}>
<input
type="text"
value={input}
onChange={(e) => setInput(e.target.value)}
placeholder="Type a message..."
disabled={isStreaming}
/>
{isStreaming ? (
<button type="button" onClick={cancelStream}>Stop</button>
) : (
<button type="submit">Send</button>
)}
</form>
</div>
);
}
4. useEffect for Streaming Side Effects
When streaming changes external state (scrolling, focus, analytics), use useEffect to react to streaming updates:
Auto-scroll to bottom as tokens arrive
import { useEffect, useRef } from 'react';
function ChatMessages({ messages, isStreaming }) {
const bottomRef = useRef(null);
const containerRef = useRef(null);
// Auto-scroll when new content arrives during streaming
useEffect(() => {
if (!isStreaming) return;
// Only auto-scroll if user is already near the bottom
const container = containerRef.current;
if (!container) return;
const isNearBottom =
container.scrollHeight - container.scrollTop - container.clientHeight < 100;
if (isNearBottom) {
bottomRef.current?.scrollIntoView({ behavior: 'smooth' });
}
}, [messages, isStreaming]); // Re-run when messages update
return (
<div ref={containerRef} className="message-container" style={{ overflowY: 'auto' }}>
{messages.map((msg, i) => (
<div key={i} className={`message ${msg.role}`}>
{msg.content}
</div>
))}
<div ref={bottomRef} />
</div>
);
}
Tracking streaming performance metrics
function useStreamingMetrics(isStreaming, messages) {
const metricsRef = useRef({
startTime: null,
firstTokenTime: null,
tokenCount: 0,
});
useEffect(() => {
if (isStreaming && !metricsRef.current.startTime) {
metricsRef.current = {
startTime: Date.now(),
firstTokenTime: null,
tokenCount: 0,
};
}
}, [isStreaming]);
useEffect(() => {
if (!isStreaming || messages.length === 0) return;
const lastMessage = messages[messages.length - 1];
if (lastMessage.role !== 'assistant') return;
const metrics = metricsRef.current;
// Record first token time
if (lastMessage.content.length > 0 && !metrics.firstTokenTime) {
metrics.firstTokenTime = Date.now();
console.log(`Time to first token: ${metrics.firstTokenTime - metrics.startTime}ms`);
}
// Rough token count (for display purposes)
metrics.tokenCount = lastMessage.content.split(/\s+/).length;
}, [messages, isStreaming]);
useEffect(() => {
// When streaming stops, log final metrics
if (!isStreaming && metricsRef.current.startTime) {
const metrics = metricsRef.current;
const totalTime = Date.now() - metrics.startTime;
console.log('Streaming metrics:', {
timeToFirstToken: metrics.firstTokenTime
? metrics.firstTokenTime - metrics.startTime
: null,
totalTime,
approximateTokens: metrics.tokenCount,
tokensPerSecond: (metrics.tokenCount / (totalTime / 1000)).toFixed(1),
});
metricsRef.current = { startTime: null, firstTokenTime: null, tokenCount: 0 };
}
}, [isStreaming]);
}
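The derived numbers the hook logs (time to first token, tokens per second) are simple arithmetic, and factoring them into a pure function keeps the effect bodies small and makes the math testable. A sketch, with the invented name `computeStreamingMetrics`:

```javascript
// Hypothetical helper: derive summary metrics from raw timestamps.
// All times are in milliseconds.
function computeStreamingMetrics({ startTime, firstTokenTime, endTime, tokenCount }) {
  const totalTime = endTime - startTime;
  return {
    // Null when no token ever arrived (e.g. an immediate error)
    timeToFirstToken: firstTokenTime != null ? firstTokenTime - startTime : null,
    totalTime,
    // Guard against division by zero on instant responses
    tokensPerSecond: totalTime > 0 ? +(tokenCount / (totalTime / 1000)).toFixed(1) : null,
  };
}

console.log(computeStreamingMetrics({
  startTime: 1000,
  firstTokenTime: 1250,
  endTime: 6000,
  tokenCount: 150,
}));
// → { timeToFirstToken: 250, totalTime: 5000, tokensPerSecond: 30 }
```

The hook above would call this once in the "streaming stopped" effect instead of computing the values inline.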
5. Chat-like Progressive Display
Building a proper chat interface requires handling the full conversation lifecycle:
// components/ChatBubble.jsx
function ChatBubble({ message, isLastAssistant, isStreaming }) {
const isUser = message.role === 'user';
return (
<div className={`chat-bubble ${isUser ? 'user-bubble' : 'assistant-bubble'}`}>
{/* Avatar */}
<div className="avatar">
{isUser ? 'You' : 'AI'}
</div>
{/* Message content */}
<div className="bubble-content">
{message.content || (
// Empty content during the moment between request and first token
isLastAssistant && isStreaming && (
<span className="thinking-dots">
<span>.</span><span>.</span><span>.</span>
</span>
)
)}
{/* Blinking cursor while streaming */}
{isLastAssistant && isStreaming && message.content && (
<span className="streaming-cursor" aria-hidden="true">|</span>
)}
</div>
{/* Timestamp */}
<div className="timestamp">
{new Date(message.timestamp || Date.now()).toLocaleTimeString()}
</div>
</div>
);
}
// components/ChatWindow.jsx
function ChatWindow() {
const { messages, isStreaming, sendMessage, cancelStream } = useStreamingChat();
return (
<div className="chat-window">
<div className="messages">
{messages.map((msg, i) => (
<ChatBubble
key={i}
message={msg}
isLastAssistant={
msg.role === 'assistant' && i === messages.length - 1
}
isStreaming={isStreaming}
/>
))}
</div>
{/* Streaming status bar */}
{isStreaming && (
<div className="status-bar">
<span className="pulse-dot" />
<span>AI is responding...</span>
<button onClick={cancelStream} className="stop-btn">
Stop generating
</button>
</div>
)}
</div>
);
}
CSS for the chat UI streaming effects
/* Blinking cursor effect */
.streaming-cursor {
display: inline;
animation: blink 0.8s step-end infinite;
font-weight: bold;
color: #3b82f6;
}
@keyframes blink {
0%, 100% { opacity: 1; }
50% { opacity: 0; }
}
/* Thinking dots animation */
.thinking-dots span {
animation: dot-pulse 1.4s infinite;
opacity: 0;
}
.thinking-dots span:nth-child(1) { animation-delay: 0s; }
.thinking-dots span:nth-child(2) { animation-delay: 0.2s; }
.thinking-dots span:nth-child(3) { animation-delay: 0.4s; }
@keyframes dot-pulse {
0%, 80%, 100% { opacity: 0; }
40% { opacity: 1; }
}
/* Pulse dot for status bar */
.pulse-dot {
width: 8px;
height: 8px;
background: #22c55e;
border-radius: 50%;
display: inline-block;
animation: pulse 1.5s ease-in-out infinite;
}
@keyframes pulse {
0%, 100% { transform: scale(1); opacity: 1; }
50% { transform: scale(1.5); opacity: 0.5; }
}
/* Chat bubbles */
.chat-bubble {
display: flex;
gap: 12px;
padding: 12px 16px;
max-width: 80%;
}
.user-bubble {
align-self: flex-end;
background: #3b82f6;
color: white;
border-radius: 16px 16px 4px 16px;
}
.assistant-bubble {
align-self: flex-start;
background: #f3f4f6;
color: #1f2937;
border-radius: 16px 16px 16px 4px;
}
/* Stop button */
.stop-btn {
background: #ef4444;
color: white;
border: none;
border-radius: 6px;
padding: 4px 12px;
cursor: pointer;
font-size: 13px;
}
.stop-btn:hover {
background: #dc2626;
}
6. Typewriter Effect
A typewriter effect adds characters one at a time with a slight delay, creating a more deliberate "typing" feel than raw streaming (which can be bursty):
import { useState, useEffect, useRef } from 'react';
// Hook that converts bursty streaming into smooth character-by-character display
function useTypewriter(streamedText, charsPerTick = 3, tickMs = 30) {
const [displayedText, setDisplayedText] = useState('');
const indexRef = useRef(0);
useEffect(() => {
// If streamed text is ahead of displayed text, start the typewriter
if (indexRef.current >= streamedText.length) return;
const interval = setInterval(() => {
const nextIndex = Math.min(
indexRef.current + charsPerTick,
streamedText.length
);
setDisplayedText(streamedText.slice(0, nextIndex));
indexRef.current = nextIndex;
// Catch up: if streamed text is very far ahead, jump ahead
if (streamedText.length - nextIndex > 200) {
indexRef.current = streamedText.length - 100;
}
// Stop interval when caught up
if (nextIndex >= streamedText.length) {
clearInterval(interval);
}
}, tickMs);
return () => clearInterval(interval);
}, [streamedText, charsPerTick, tickMs]);
// Reset when text is cleared
useEffect(() => {
if (streamedText === '') {
setDisplayedText('');
indexRef.current = 0;
}
}, [streamedText]);
const isTyping = indexRef.current < streamedText.length;
return { displayedText, isTyping };
}
// Usage in a component
function TypewriterMessage({ streamedContent, isStreaming }) {
const { displayedText, isTyping } = useTypewriter(streamedContent);
return (
<div className="typewriter-message">
{displayedText}
{(isTyping || isStreaming) && (
<span className="streaming-cursor">|</span>
)}
</div>
);
}
When to use typewriter vs raw streaming
| Approach | Best For | Trade-off |
|---|---|---|
| Raw streaming (display every token immediately) | Technical users, code output, fast connections | Can feel "bursty" — tokens arrive in uneven chunks |
| Typewriter (smooth character reveal) | Consumer-facing chat, marketing sites | Adds artificial delay, can fall behind on long responses |
| Hybrid (typewriter with catch-up) | Most production apps | Best balance — smooth when slow, catches up when fast |
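The hybrid catch-up behaviour from the table reduces to one index calculation per tick, which can be expressed as a pure function. This is a sketch; `nextTypewriterIndex` is an invented name, and the thresholds mirror the constants used in the `useTypewriter` hook above.

```javascript
// Hypothetical helper: compute the next reveal index for a hybrid typewriter.
// Advances by charsPerTick normally, but jumps forward when the stream is
// more than catchUpThreshold characters ahead of the display.
function nextTypewriterIndex(currentIndex, targetLength, charsPerTick = 3, catchUpThreshold = 200) {
  const behind = targetLength - currentIndex;
  if (behind > catchUpThreshold) {
    // Far behind: jump so only the last 100 characters still animate
    return targetLength - 100;
  }
  return Math.min(currentIndex + charsPerTick, targetLength);
}

console.log(nextTypewriterIndex(10, 15)); // → 13 (normal advance)
console.log(nextTypewriterIndex(14, 15)); // → 15 (clamped at the end)
console.log(nextTypewriterIndex(0, 500)); // → 400 (catch-up jump)
```

Keeping this logic out of the `setInterval` callback makes the catch-up rule easy to tune and test without rendering anything.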
7. Handling Markdown in Streaming Text
LLMs often respond with markdown (headers, bold text, code blocks, lists). Rendering markdown during streaming is tricky because incomplete markdown is syntactically invalid.
The problem
Token 1: "Here are **three"
Token 2: " benefits"
Token 3: "**:\n\n1."
Token 4: " First benefit"
After token 1: "Here are **three" → Unclosed bold tag
After token 2: "Here are **three benefits" → Still unclosed
After token 3: "Here are **three benefits**:\n\n1." → Now valid!
If you render markdown after every token, the display will flash and reformat as incomplete syntax becomes complete.
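One lightweight way to reason about the problem is to test whether the accumulated text currently contains unbalanced delimiters. A sketch covering only bold markers and code fences; the helper name `hasUnclosedMarkdown` is invented here:

```javascript
// Hypothetical helper: detect the two most disruptive kinds of incomplete
// markdown — an unclosed bold marker (**) and an unclosed code fence (```).
function hasUnclosedMarkdown(text) {
  const fenceCount = (text.match(/`{3}/g) || []).length;
  if (fenceCount % 2 !== 0) return true; // Inside an open code block
  const boldCount = (text.match(/\*\*/g) || []).length;
  return boldCount % 2 !== 0; // Odd number of ** markers → unclosed bold
}

console.log(hasUnclosedMarkdown('Here are **three'));            // → true
console.log(hasUnclosedMarkdown('Here are **three benefits**')); // → false
console.log(hasUnclosedMarkdown('`'.repeat(3) + 'js\nconst x = 1;')); // → true
```

A UI could use this check to delay re-rendering markdown while delimiters are open, which is the idea behind the solutions that follow.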
Solution 1: Use a markdown library that handles incomplete input
import ReactMarkdown from 'react-markdown';
import remarkGfm from 'remark-gfm';
function StreamingMarkdown({ content, isStreaming }) {
return (
<div className="markdown-content">
<ReactMarkdown
remarkPlugins={[remarkGfm]}
// react-markdown handles incomplete markdown gracefully —
// unclosed bold/italic just renders as text until closed
>
{content}
</ReactMarkdown>
{isStreaming && <span className="streaming-cursor">|</span>}
</div>
);
}
Solution 2: Debounced markdown rendering
Only re-render the markdown at intervals, not on every token:
import { useState, useEffect, useRef } from 'react';
import ReactMarkdown from 'react-markdown';
function DebouncedMarkdown({ streamedText, isStreaming, debounceMs = 100 }) {
const [renderedText, setRenderedText] = useState('');
const timeoutRef = useRef(null);
useEffect(() => {
// While streaming, debounce markdown rendering
if (isStreaming) {
clearTimeout(timeoutRef.current);
timeoutRef.current = setTimeout(() => {
setRenderedText(streamedText);
}, debounceMs);
} else {
// When streaming stops, render immediately
clearTimeout(timeoutRef.current);
setRenderedText(streamedText);
}
return () => clearTimeout(timeoutRef.current);
}, [streamedText, isStreaming, debounceMs]);
return (
<div className="markdown-content">
<ReactMarkdown>{renderedText}</ReactMarkdown>
</div>
);
}
Solution 3: Split rendering strategy — raw text while streaming, markdown after
function SmartMarkdown({ content, isStreaming }) {
if (isStreaming) {
// While streaming: render as plain text with basic formatting
return (
<div className="streaming-text">
<pre style={{ whiteSpace: 'pre-wrap', fontFamily: 'inherit' }}>
{content}
<span className="streaming-cursor">|</span>
</pre>
</div>
);
}
// After streaming: render full markdown
return (
<div className="markdown-content">
<ReactMarkdown remarkPlugins={[remarkGfm]}>
{content}
</ReactMarkdown>
</div>
);
}
Solution 4: Code block detection during streaming
Code blocks are especially problematic because partial code blocks look broken. Detect and handle them:
function StreamingWithCodeBlocks({ content, isStreaming }) {
// Check if we're inside an unclosed code block
const codeBlockCount = (content.match(/```/g) || []).length;
const isInsideCodeBlock = codeBlockCount % 2 !== 0;
// If inside an unclosed code block, close it temporarily for rendering
const renderContent = isInsideCodeBlock
? content + '\n```' // Temporarily close for valid markdown
: content;
return (
<div className="markdown-content">
<ReactMarkdown remarkPlugins={[remarkGfm]}>
{renderContent}
</ReactMarkdown>
{isStreaming && <span className="streaming-cursor">|</span>}
</div>
);
}
8. Next.js Streaming Patterns
Next.js has built-in support for streaming responses, making it ideal for AI applications.
Route Handler with streaming (App Router)
// app/api/chat/route.js
import OpenAI from 'openai';
const openai = new OpenAI();
export async function POST(request) {
const { messages } = await request.json();
const stream = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
...messages,
],
stream: true,
});
// Create a ReadableStream that forwards the OpenAI stream
const encoder = new TextEncoder();
const readableStream = new ReadableStream({
async start(controller) {
try {
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content;
if (content) {
// Send as SSE format
const data = `data: ${JSON.stringify({ content })}\n\n`;
controller.enqueue(encoder.encode(data));
}
}
controller.enqueue(encoder.encode('data: [DONE]\n\n'));
controller.close();
} catch (error) {
controller.error(error);
}
},
});
return new Response(readableStream, {
headers: {
'Content-Type': 'text/event-stream',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
},
});
}
Client component consuming the stream
// app/components/Chat.jsx
'use client';
import { useState, useCallback } from 'react';
export default function Chat() {
const [messages, setMessages] = useState([]);
const [input, setInput] = useState('');
const [isStreaming, setIsStreaming] = useState(false);
const handleSubmit = useCallback(async (e) => {
e.preventDefault();
if (!input.trim() || isStreaming) return;
const userMessage = { role: 'user', content: input.trim() };
const updatedMessages = [...messages, userMessage];
setMessages([...updatedMessages, { role: 'assistant', content: '' }]);
setInput('');
setIsStreaming(true);
try {
const response = await fetch('/api/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ messages: updatedMessages }),
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split('\n\n');
buffer = lines.pop();
for (const line of lines) {
if (!line.startsWith('data: ')) continue;
const raw = line.slice(6);
if (raw === '[DONE]') break;
const { content } = JSON.parse(raw);
if (content) {
setMessages(prev => {
const updated = [...prev];
const last = updated[updated.length - 1];
updated[updated.length - 1] = {
...last,
content: last.content + content,
};
return updated;
});
}
}
}
} catch (error) {
console.error('Stream error:', error);
} finally {
setIsStreaming(false);
}
}, [input, messages, isStreaming]);
return (
<div className="chat-container">
<div className="messages">
{messages.map((msg, i) => (
<div key={i} className={`message ${msg.role}`}>
<strong>{msg.role}:</strong> {msg.content}
{isStreaming && msg.role === 'assistant' && i === messages.length - 1 && (
<span className="streaming-cursor">|</span>
)}
</div>
))}
</div>
<form onSubmit={handleSubmit}>
<input
value={input}
onChange={(e) => setInput(e.target.value)}
placeholder="Ask something..."
disabled={isStreaming}
/>
<button type="submit" disabled={isStreaming}>
{isStreaming ? 'Streaming...' : 'Send'}
</button>
</form>
</div>
);
}
Using the Vercel AI SDK (streamlined approach)
The Vercel AI SDK abstracts away the streaming boilerplate:
// app/api/chat/route.js — using Vercel AI SDK
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
export async function POST(request) {
const { messages } = await request.json();
const result = streamText({
model: openai('gpt-4o'),
system: 'You are a helpful assistant.',
messages,
});
return result.toDataStreamResponse();
}
// app/components/Chat.jsx — using useChat hook
'use client';
import { useChat } from 'ai/react';
export default function Chat() {
const { messages, input, handleInputChange, handleSubmit, isLoading, stop } =
useChat();
return (
<div>
<div className="messages">
{messages.map((msg) => (
<div key={msg.id} className={msg.role}>
<strong>{msg.role}:</strong> {msg.content}
</div>
))}
</div>
<form onSubmit={handleSubmit}>
<input value={input} onChange={handleInputChange} />
{isLoading ? (
<button type="button" onClick={stop}>Stop</button>
) : (
<button type="submit">Send</button>
)}
</form>
</div>
);
}
This reduces a full streaming chat implementation to ~30 lines of code.
9. Performance Optimization for Progressive Rendering
Rendering every single token triggers a React re-render. For fast streams (50+ tokens/second), this can cause performance issues.
Batching state updates
import { useRef, useState, useCallback } from 'react';
function useBufferedStream() {
const [displayText, setDisplayText] = useState('');
const bufferRef = useRef('');
const rafRef = useRef(null);
const appendToken = useCallback((token) => {
bufferRef.current += token;
// Use requestAnimationFrame to batch updates to 60fps
if (!rafRef.current) {
rafRef.current = requestAnimationFrame(() => {
setDisplayText(prev => prev + bufferRef.current);
bufferRef.current = '';
rafRef.current = null;
});
}
}, []);
const reset = useCallback(() => {
setDisplayText('');
bufferRef.current = '';
if (rafRef.current) {
cancelAnimationFrame(rafRef.current);
rafRef.current = null;
}
}, []);
return { displayText, appendToken, reset };
}
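Stripped of React and `requestAnimationFrame`, the batching idea is just a buffer with a deferred flush: many appends, one update. A minimal framework-free sketch (the class name `TokenBatcher` is invented here):

```javascript
// Hypothetical sketch: accumulate tokens and emit them in one batch,
// mirroring what the requestAnimationFrame hook above does once per frame.
class TokenBatcher {
  constructor(onFlush) {
    this.buffer = '';
    this.onFlush = onFlush;
  }
  append(token) {
    this.buffer += token; // Cheap: no render, just string concat
  }
  flush() {
    if (this.buffer === '') return; // Nothing pending
    this.onFlush(this.buffer); // One update covering many tokens
    this.buffer = '';
  }
}

let display = '';
const batcher = new TokenBatcher((batch) => { display += batch; });
['Hel', 'lo', ', ', 'world'].forEach((t) => batcher.append(t));
batcher.flush(); // A single state update instead of four
// display → 'Hello, world'
```

In the hook above, `requestAnimationFrame` plays the role of the `flush()` call, capping updates at the display refresh rate regardless of how fast tokens arrive.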
Virtualized message list for long conversations
import { useRef, useState } from 'react';
// For conversations with 100+ messages, only render visible ones
function VirtualizedMessages({ messages, isStreaming }) {
const containerRef = useRef(null);
const [visibleRange, setVisibleRange] = useState({ start: 0, end: 20 });
// Always show the last few messages (where streaming happens)
const minVisible = Math.max(0, messages.length - 20);
const start = Math.min(visibleRange.start, minVisible);
const end = messages.length;
const visibleMessages = messages.slice(start, end);
return (
<div ref={containerRef} className="message-container" style={{ overflowY: 'auto' }}>
{/* Spacer for messages above the visible range */}
{start > 0 && (
<div style={{ height: start * 80 }} /> // Estimate 80px per message
)}
{visibleMessages.map((msg, i) => (
<div key={start + i} className={`message ${msg.role}`}>
{msg.content}
</div>
))}
</div>
);
}
10. Accessibility in Streaming UIs
Streaming text updates create accessibility challenges. Screen readers need to be informed about dynamic content:
function AccessibleStreamingMessage({ content, isStreaming }) {
return (
<div
role="log"
aria-live="polite" // Announce new content without interrupting
aria-atomic="false" // Only announce NEW additions, not the full text
aria-busy={isStreaming} // Indicates content is still loading
aria-label="AI response"
>
{content}
{isStreaming && (
<span className="sr-only">
AI is still generating a response...
</span>
)}
</div>
);
}
// CSS for screen-reader only text
// .sr-only {
// position: absolute;
// width: 1px;
// height: 1px;
// overflow: hidden;
// clip: rect(0, 0, 0, 0);
// white-space: nowrap;
// }
11. Key Takeaways
- Progressive rendering shows tokens as they arrive — users see content in ~200ms instead of waiting 5-15 seconds.
- Use `setResponse(prev => prev + token)` with React's functional state updater to avoid stale closure bugs.
- Extract streaming logic into a custom hook (`useStreamingChat`) for reuse across components.
- Handle markdown carefully — incomplete markdown during streaming can cause flickering; use debounced rendering or temporary tag closing.
- The Vercel AI SDK reduces streaming chat to ~30 lines with `useChat` and `streamText`.
- Batch React state updates with `requestAnimationFrame` to maintain 60fps during fast token streams.
- Don't forget accessibility — use `aria-live`, `aria-busy`, and screen-reader hints for dynamic streaming content.
Explain-It Challenge
- A designer asks "why does the text flicker when the AI writes bold text?" Explain the incomplete markdown problem and how you would fix it.
- Your streaming chat works but becomes sluggish after 50+ messages in the conversation. What causes this and what would you change?
- A QA engineer reports that the streaming cursor stays visible after the response finishes. Walk through the state management issue and the fix.
Navigation: <- 4.8.a — Streaming Tokens | 4.8.c — Improving UX in AI Applications ->