Episode 3 — NodeJS MongoDB Backend Architecture / 3.15 — Realtime Communication WebSockets
3.15.b — HTTP Polling and Alternatives
Before WebSocket became the standard for real-time communication, developers relied on creative HTTP-based workarounds like short polling, long polling, and Server-Sent Events. Understanding these alternatives clarifies why WebSocket exists and when simpler approaches are sufficient.
<< Previous: 3.15.a — Understanding WebSockets | Next: 3.15.c — Socket.io Setup & Basics >>
1. Short Polling: Repeated HTTP Requests on an Interval
Short polling is the simplest approach to "real-time" updates. The client sends HTTP requests at regular intervals (e.g., every 3 seconds) to check for new data.
Client: GET /api/messages → Response: [] (nothing new)
... 3 seconds later ...
Client: GET /api/messages → Response: [] (nothing new)
... 3 seconds later ...
Client: GET /api/messages → Response: [msg1] (got something!)
... 3 seconds later ...
Client: GET /api/messages → Response: [] (nothing new again)
Implementation
// ========== SERVER (Express) ==========
const express = require('express');
const app = express();

// In-memory store for the demo; a real app would use a database
let messages = [];

// Return only the messages created after the client's last check
app.get('/api/messages', (req, res) => {
  const since = parseInt(req.query.since, 10) || 0;
  const newMessages = messages.filter(m => m.timestamp > since);
  res.json({ messages: newMessages, serverTime: Date.now() });
});

// Store a new message
app.post('/api/messages', express.json(), (req, res) => {
  const message = {
    id: messages.length + 1,
    text: req.body.text,
    user: req.body.user,
    timestamp: Date.now()
  };
  messages.push(message);
  res.status(201).json(message);
});

app.listen(3000);
// ========== CLIENT ==========
// Assumes displayMessage(msg) is defined elsewhere on the page
let lastCheckTime = 0;

function pollForMessages() {
  setInterval(async () => {
    try {
      const res = await fetch(`/api/messages?since=${lastCheckTime}`);
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      const data = await res.json();
      if (data.messages.length > 0) {
        data.messages.forEach(msg => displayMessage(msg));
      }
      lastCheckTime = data.serverTime;
    } catch (error) {
      console.error('Polling failed:', error);
    }
  }, 3000); // Poll every 3 seconds
}

pollForMessages();
Pros and Cons
| Pros | Cons |
|---|---|
| Dead simple to implement | Wastes bandwidth (empty responses) |
| Works everywhere (just HTTP) | Latency = polling interval (up to 3s delay) |
| Easy to debug | High server load (many requests) |
| No special server setup | Not truly real-time |
| Stateless server | Polling interval tradeoff: fast = expensive, slow = laggy |
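The interval tradeoff in the last row can be made concrete with a little arithmetic. The sketch below is a rough estimate, assuming every client polls on a fixed interval and new data arrives at a uniformly random moment within it. The function `estimatePollingCost` and its parameter names are hypothetical and only serve the illustration.

```javascript
// Rough cost model for short polling (illustrative assumptions only).
function estimatePollingCost(users, intervalMs) {
  const requestsPerUserPerMin = 60000 / intervalMs;
  return {
    requestsPerMin: users * requestsPerUserPerMin,
    avgLatencyMs: intervalMs / 2, // on average, data arrives mid-interval
    worstLatencyMs: intervalMs    // data arrives just after a poll
  };
}

// 100 users polling every 3 seconds
console.log(estimatePollingCost(100, 3000));
// 100 users polling every 500 ms: lower latency, six times the requests
console.log(estimatePollingCost(100, 500));
```

With 100 users and a 3-second interval this gives 2000 requests per minute with an average delay of 1.5 seconds. Dropping the interval to 500 ms cuts the average delay to 250 ms but raises the load to 12000 requests per minute.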
2. Long Polling: Server Holds Request Until Data Available
Long polling improves on short polling by having the server hold the request open until new data is available or a timeout occurs. This reduces wasted requests while providing near-instant updates.
Client: GET /api/messages/poll → (server holds connection...)
... 15 seconds of waiting ...
Server: Response: [msg1] (new data arrived, respond immediately!)
Client: GET /api/messages/poll → (immediately reconnects, server holds...)
... 3 seconds later ...
Server: Response: [msg2] (new data, respond!)
Client: GET /api/messages/poll → (reconnects again...)
... 30 seconds, no data ...
Server: Response: [] (timeout, respond empty)
Client: GET /api/messages/poll → (reconnects...)
Implementation
// ========== SERVER (Express) ==========
const express = require('express');
const app = express();

let messages = [];
let waitingClients = []; // Clients waiting for new data

app.get('/api/messages/poll', (req, res) => {
  const since = parseInt(req.query.since, 10) || 0;

  // Check if there's already new data
  const newMessages = messages.filter(m => m.timestamp > since);
  if (newMessages.length > 0) {
    return res.json({ messages: newMessages, serverTime: Date.now() });
  }

  // No new data: hold the connection
  const client = { res, since, timeout: null };
  waitingClients.push(client);

  // Timeout after 30 seconds to prevent the connection hanging forever
  client.timeout = setTimeout(() => {
    waitingClients = waitingClients.filter(c => c !== client);
    res.json({ messages: [], serverTime: Date.now() });
  }, 30000);

  // Clean up if the client disconnects before a response is sent
  req.on('close', () => {
    clearTimeout(client.timeout);
    waitingClients = waitingClients.filter(c => c !== client);
  });
});

app.post('/api/messages', express.json(), (req, res) => {
  const message = {
    id: messages.length + 1,
    text: req.body.text,
    user: req.body.user,
    timestamp: Date.now()
  };
  messages.push(message);

  // Notify ALL waiting clients immediately. Cancel each pending timeout first,
  // otherwise it fires later and tries to answer an already-finished response.
  waitingClients.forEach(client => {
    clearTimeout(client.timeout);
    const newMsgs = messages.filter(m => m.timestamp > client.since);
    client.res.json({ messages: newMsgs, serverTime: Date.now() });
  });
  waitingClients = []; // Clear the waiting list

  res.status(201).json(message);
});

app.listen(3000);
// ========== CLIENT ==========
// Assumes displayMessage(msg) is defined elsewhere on the page
let lastCheckTime = 0;

async function longPoll() {
  while (true) {
    try {
      const res = await fetch(`/api/messages/poll?since=${lastCheckTime}`);
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      const data = await res.json();
      if (data.messages.length > 0) {
        data.messages.forEach(msg => displayMessage(msg));
      }
      lastCheckTime = data.serverTime;
      // Immediately reconnect (no delay needed)
    } catch (error) {
      console.error('Long poll error:', error);
      // Wait before retrying on error so a failing server is not hammered
      await new Promise(resolve => setTimeout(resolve, 3000));
    }
  }
}

longPoll();
Pros and Cons
| Pros | Cons |
|---|---|
| Near-instant delivery | More complex server logic |
| Fewer wasted requests than short polling | Server holds many open connections |
| Works everywhere (just HTTP) | Connection timeout management needed |
| No special protocol needed | Still one-directional (client initiates) |
| Better than short polling for latency | Memory overhead per waiting client |
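The "more complex server logic" and "connection timeout management" entries mostly come down to bookkeeping for waiting clients. One way to isolate that bookkeeping is sketched below, assuming messages carry increasing numeric ids instead of timestamps. The names `createLongPollHub`, `wait`, `publish`, and `waitingCount` are hypothetical helpers for this illustration and are not part of Express or any other library.

```javascript
// Minimal in-memory hub for long polling (hypothetical helper).
// Increasing ids mean "newer than X" never depends on clock ties.
function createLongPollHub(timeoutMs = 30000) {
  const messages = [];
  const waiters = new Set();

  function newerThan(sinceId) {
    return messages.filter(m => m.id > sinceId);
  }

  // Calls onResult(messages) as soon as something newer than sinceId exists,
  // or with [] after timeoutMs. Returns a cancel function for disconnects.
  function wait(sinceId, onResult) {
    const pending = newerThan(sinceId);
    if (pending.length > 0) {
      onResult(pending);
      return () => {};
    }
    const waiter = { sinceId, onResult, timer: null };
    waiter.timer = setTimeout(() => {
      waiters.delete(waiter);
      onResult([]);
    }, timeoutMs);
    waiters.add(waiter);
    return () => {
      clearTimeout(waiter.timer);
      waiters.delete(waiter);
    };
  }

  function publish(text) {
    const message = { id: messages.length + 1, text };
    messages.push(message);
    // Snapshot and clear first so each waiter is answered exactly once
    const toNotify = [...waiters];
    waiters.clear();
    for (const waiter of toNotify) {
      clearTimeout(waiter.timer); // the timeout must not fire after we answer
      waiter.onResult(newerThan(waiter.sinceId));
    }
    return message;
  }

  return { wait, publish, waitingCount: () => waiters.size };
}

const hub = createLongPollHub();
hub.wait(0, msgs => console.log('delivered:', msgs)); // held until publish
hub.publish('hello'); // the waiter's callback fires with the new message
```

In an Express route, the callback passed to `wait` would send the JSON response, and the returned cancel function would be registered as the handler for the request's `close` event.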
3. Server-Sent Events (SSE): One-Way Server to Client
SSE provides a persistent, one-way connection where the server can push updates to the client. The client uses the EventSource API, which handles reconnection automatically.
Client: GET /api/events (Accept: text/event-stream)
Server: (connection stays open)
Server: data: {"type": "message", "text": "Hello"}\n\n
Server: data: {"type": "notification", "count": 5}\n\n
... connection stays open, server pushes whenever ...
Implementation
// ========== SERVER (Express) ==========
const express = require('express');
const app = express();

let clients = []; // Open SSE connections

app.get('/api/events', (req, res) => {
  // Set SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.setHeader('Access-Control-Allow-Origin', '*');

  // Send initial connection confirmation
  res.write('data: {"type": "connected"}\n\n');

  // Add this client to the list
  const client = { id: Date.now(), res };
  clients.push(client);
  console.log(`Client connected. Total: ${clients.length}`);

  // Remove client on disconnect. Compare by object identity, because two
  // clients connecting in the same millisecond would share a Date.now() id.
  req.on('close', () => {
    clients = clients.filter(c => c !== client);
    console.log(`Client disconnected. Total: ${clients.length}`);
  });
});

// Helper to broadcast to all SSE clients
function broadcast(eventData) {
  clients.forEach(client => {
    client.res.write(`data: ${JSON.stringify(eventData)}\n\n`);
  });
}

// When a new message is posted, broadcast via SSE
app.post('/api/messages', express.json(), (req, res) => {
  const message = {
    type: 'new-message',
    text: req.body.text,
    user: req.body.user,
    timestamp: Date.now()
  };
  broadcast(message);
  res.status(201).json({ success: true });
});

app.listen(3000);
// ========== CLIENT ==========
// EventSource is a built-in browser API.
// Assumes displayMessage() and showNotification() are defined elsewhere on the page.
const eventSource = new EventSource('/api/events');

// onmessage receives events that have no "event:" field
eventSource.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log('Received:', data);
  if (data.type === 'new-message') {
    displayMessage(data);
  }
};

eventSource.onerror = (error) => {
  console.error('SSE error:', error);
  // EventSource automatically reconnects!
};

// Named events (optional). This listener fires only when the server sends
// "event: notification"; the broadcast() helper above sends unnamed events.
eventSource.addEventListener('notification', (event) => {
  const data = JSON.parse(event.data);
  showNotification(data);
});
SSE Data Format
data: Simple text message\n\n
data: {"json": "works too"}\n\n
event: notification\n
data: {"title": "New follower"}\n\n
id: 42\n
data: Message with ID for reconnection\n\n
retry: 5000\n
data: Set reconnection interval to 5 seconds\n\n
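The fields above can be produced by one small serializer instead of hand-built strings. The sketch below is a minimal illustration, assuming each event is passed as a plain object. The function `formatSseEvent` is a hypothetical helper, not part of Express or the EventSource API.

```javascript
// Serialize one event into the SSE wire format (hypothetical helper).
function formatSseEvent({ event, id, retry, data = '' }) {
  const lines = [];
  if (event !== undefined) lines.push(`event: ${event}`);
  if (id !== undefined) lines.push(`id: ${id}`);
  if (retry !== undefined) lines.push(`retry: ${retry}`);
  const text = typeof data === 'string' ? data : JSON.stringify(data);
  // Each line of a multi-line payload needs its own "data:" prefix
  for (const line of text.split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  // A blank line terminates the event
  return lines.join('\n') + '\n\n';
}

console.log(JSON.stringify(
  formatSseEvent({ event: 'notification', data: { title: 'New follower' } })
));
// "event: notification\ndata: {\"title\":\"New follower\"}\n\n"
```

On the server, the result would be passed straight to `res.write(...)` for each connected client.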
Pros and Cons
| Pros | Cons |
|---|---|
| Simple API (EventSource) | One-way only (server to client) |
| Automatic reconnection built-in | Text-only (no binary data) |
| Event ID tracking for missed messages | Limited to ~6 connections per browser per domain over HTTP/1.1 |
| Works over standard HTTP | No IE support (polyfill available) |
| Lightweight, no extra libraries | Client cannot send data back over SSE |
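Event ID tracking works because a reconnecting EventSource sends the last `id:` it received in the `Last-Event-ID` request header, so the server can replay what the client missed. The sketch below shows one way to keep that history, assuming a bounded in-memory buffer with increasing numeric ids. The names `createEventHistory`, `record`, and `missedSince` are hypothetical.

```javascript
// Bounded history of sent events so reconnecting clients can catch up.
function createEventHistory(maxSize = 100) {
  const events = [];
  let nextId = 1;
  return {
    // Store an event and return it with its assigned id
    record(data) {
      const entry = { id: nextId++, data };
      events.push(entry);
      if (events.length > maxSize) events.shift(); // drop the oldest
      return entry;
    },
    // lastEventId is the Last-Event-ID header value (a string or undefined)
    missedSince(lastEventId) {
      const last = Number.parseInt(lastEventId, 10);
      if (Number.isNaN(last)) return []; // first connection: nothing to replay
      return events.filter(e => e.id > last);
    }
  };
}

const history = createEventHistory();
history.record({ text: 'first' });
history.record({ text: 'second' });
console.log(history.missedSince('1')); // only the event with id 2
```

In the SSE route, the header can be read with `req.headers['last-event-id']`, and each replayed entry would be written with its `id:` field before the connection joins the live broadcast list. Events older than the buffer are lost, so `maxSize` is a tradeoff between memory and how long a client may stay disconnected.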
4. Performance and Resource Comparison
| Metric | Short Polling | Long Polling | SSE | WebSocket |
|---|---|---|---|---|
| Avg latency | ~half the interval | ~50-200ms | ~10-50ms | ~5-20ms |
| Bandwidth waste | Very High | Medium | Low | Very Low |
| Server connections | Low (brief) | High (held open) | High (persistent) | High (persistent) |
| Requests/min (100 users) | ~2000 (3s interval) | ~100-200 | 0 (persistent) | 0 (persistent) |
| Bidirectional? | No | No | No | Yes |
| Binary data? | Via encoding | Via encoding | No | Yes |
| Max concurrent (typical) | Unlimited | ~10K per server | ~10K per server | ~50K+ per server |
| Complexity | Very Low | Medium | Low | Medium |
5. Why Socket.io Exists
Raw WebSocket works, but real-world applications need more:
| Problem with Raw WebSocket | Socket.io Solution |
|---|---|
| No automatic reconnection | Built-in reconnection with exponential backoff |
| No fallback if WS blocked | Starts with HTTP long polling and upgrades to WebSocket when possible, so it keeps working if WS is blocked |
| No rooms/groups concept | Built-in rooms and namespaces |
| No broadcasting helpers | io.emit(), socket.broadcast.emit() |
| No acknowledgements | Callback-based acks: emit('event', data, callback) |
| No middleware system | io.use() for auth, logging, etc. |
| Binary requires manual handling | Automatic binary detection and handling |
| No multiplexing | Namespaces for separate channels on same connection |
// Raw WebSocket: manual everything
const ws = new WebSocket('ws://localhost:3000');

ws.onclose = () => {
  // Must manually implement reconnection
  setTimeout(() => {
    // reconnect logic...
  }, 1000);
};

// Socket.io: batteries included
// (io() is provided by the socket.io-client library)
const socket = io('http://localhost:3000', {
  reconnection: true,         // automatic!
  reconnectionAttempts: 5,    // give up after 5 failed attempts
  reconnectionDelay: 1000,    // initial delay before retrying (ms)
  reconnectionDelayMax: 5000  // upper bound for the delay (ms)
});
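To give a sense of what the manual reconnect logic involves, here is a minimal sketch of exponential backoff for a raw WebSocket. It assumes a doubling delay capped at a maximum, loosely mirroring the `reconnectionDelay` and `reconnectionDelayMax` options above. It is not Socket.io's actual implementation, which also randomizes the delay. The names `backoffDelay` and `connectWithRetry` are hypothetical.

```javascript
// Delay before reconnect attempt N (0-based): doubles each time, up to a cap.
function backoffDelay(attempt, { baseMs = 1000, maxMs = 5000, factor = 2 } = {}) {
  return Math.min(maxMs, baseMs * Math.pow(factor, attempt));
}

// Reconnect loop for a raw WebSocket. WebSocketImpl is injected so the sketch
// does not assume a browser; in a browser, pass the global WebSocket.
function connectWithRetry(url, WebSocketImpl, { maxAttempts = 5, onOpen } = {}) {
  let attempt = 0;

  function open() {
    const ws = new WebSocketImpl(url);
    ws.onopen = () => {
      attempt = 0; // a successful connection resets the backoff
      if (onOpen) onOpen(ws);
    };
    ws.onclose = () => {
      if (attempt >= maxAttempts) return; // give up
      setTimeout(open, backoffDelay(attempt));
      attempt += 1;
    };
  }

  open();
}

console.log([0, 1, 2, 3, 4].map(n => backoffDelay(n)));
// [ 1000, 2000, 4000, 5000, 5000 ]
```

Even this sketch leaves out message buffering while disconnected and re-joining rooms after a reconnect, which is the kind of work Socket.io takes over.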
Important: Socket.io is NOT a WebSocket implementation. It is a library that uses WebSocket as transport when available, but adds its own protocol layer on top. A Socket.io client cannot connect to a plain WebSocket server and vice versa.
6. Decision Flowchart: Which Approach to Use?
Do you need real-time updates?
├── No → Use standard HTTP REST API
└── Yes
    └── Is data flow one-way (server → client only)?
        ├── Yes → Use SSE (Server-Sent Events)
        └── No (bidirectional needed)
            └── Are updates very frequent (multiple per second)?
                ├── Yes → Use WebSocket / Socket.io
                └── No (every few seconds)
                    └── Is simplicity the top priority?
                        ├── Yes → Long Polling
                        └── No → WebSocket / Socket.io
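The same decision path can be written as a small function, which makes the order of the questions explicit. This is only a restatement of the flowchart. The function `chooseTransport` and its option names are hypothetical.

```javascript
// Encodes the decision flowchart as nested checks, asked in the same order.
function chooseTransport({ realtime, oneWay, frequentUpdates, simplicityFirst }) {
  if (!realtime) return 'HTTP REST';
  if (oneWay) return 'SSE';
  if (frequentUpdates) return 'WebSocket / Socket.io';
  return simplicityFirst ? 'Long Polling' : 'WebSocket / Socket.io';
}

// A live dashboard that only receives updates
console.log(chooseTransport({ realtime: true, oneWay: true })); // SSE

// A chat with many messages per second in both directions
console.log(chooseTransport({ realtime: true, oneWay: false, frequentUpdates: true }));
// WebSocket / Socket.io
```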
Key Takeaways
- Short polling is simplest but wasteful -- constantly sends requests even when nothing has changed
- Long polling reduces wasted requests by holding connections open until data arrives
- SSE is ideal for one-way server-to-client updates with automatic reconnection
- Of the approaches covered here, WebSocket is the only one that provides true bidirectional real-time communication over a single connection
- Socket.io wraps WebSocket with reconnection, fallbacks, rooms, and middleware
- Socket.io is NOT a WebSocket library -- it has its own protocol and they are not interchangeable
- Choose the simplest approach that meets your requirements
Explain-It Challenge
Scenario: You are building a live auction platform. Users can view current bids (read-heavy), place new bids (write occasionally), and see a countdown timer for each auction. The platform also needs to show "X users are watching this auction" in real-time.
For each feature listed, recommend the best communication approach (short polling, long polling, SSE, or WebSocket) and explain your reasoning. Consider the tradeoffs in bandwidth, latency, and implementation complexity for each.