Episode 3 — NodeJS MongoDB Backend Architecture / 3.15 — Realtime Communication WebSockets

3.15.b — HTTP Polling and Alternatives

Before WebSocket became the standard for real-time communication, developers relied on creative HTTP-based workarounds like short polling, long polling, and Server-Sent Events. Understanding these alternatives clarifies why WebSocket exists and when simpler approaches are sufficient.


<< Previous: 3.15.a — Understanding WebSockets | Next: 3.15.c — Socket.io Setup & Basics >>


1. Short Polling: Repeated HTTP Requests on an Interval

Short polling is the simplest approach to "real-time" updates. The client sends HTTP requests at regular intervals (e.g., every 3 seconds) to check for new data.

Client: GET /api/messages → Response: []         (nothing new)
... 3 seconds later ...
Client: GET /api/messages → Response: []         (nothing new)
... 3 seconds later ...
Client: GET /api/messages → Response: [msg1]     (got something!)
... 3 seconds later ...
Client: GET /api/messages → Response: []         (nothing new again)

Implementation

// ========== SERVER (Express) ==========
const express = require('express');
const app = express();

let messages = [];

app.get('/api/messages', (req, res) => {
  const since = parseInt(req.query.since) || 0;
  const newMessages = messages.filter(m => m.timestamp > since);
  res.json({ messages: newMessages, serverTime: Date.now() });
});

app.post('/api/messages', express.json(), (req, res) => {
  const message = {
    id: messages.length + 1,
    text: req.body.text,
    user: req.body.user,
    timestamp: Date.now()
  };
  messages.push(message);
  res.status(201).json(message);
});

app.listen(3000);
// ========== CLIENT ==========
let lastCheckTime = 0;

function pollForMessages() {
  setInterval(async () => {
    try {
      const res = await fetch(`/api/messages?since=${lastCheckTime}`);
      const data = await res.json();

      if (data.messages.length > 0) {
        data.messages.forEach(msg => displayMessage(msg));
      }

      lastCheckTime = data.serverTime;
    } catch (error) {
      console.error('Polling failed:', error);
    }
  }, 3000); // Poll every 3 seconds
}

pollForMessages();

Pros and Cons

Pros | Cons
--- | ---
Dead simple to implement | Wastes bandwidth (empty responses)
Works everywhere (just HTTP) | Latency = polling interval (up to 3 s delay)
Easy to debug | High server load (many requests)
No special server setup | Not truly real-time
Stateless server | Polling interval tradeoff: fast = expensive, slow = laggy
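One common middle ground for that last tradeoff is adaptive polling: poll quickly while data is flowing, and back off toward a ceiling while the endpoint is quiet. A sketch of just the interval logic (`MIN_MS` and `MAX_MS` are illustrative values, not part of the example above):

```javascript
// Adaptive polling interval: reset to the minimum on activity,
// double (up to a ceiling) while the endpoint stays quiet.
const MIN_MS = 1000;   // fastest poll rate (assumed value)
const MAX_MS = 30000;  // slowest poll rate (assumed value)

function nextInterval(current, gotNewData) {
  if (gotNewData) return MIN_MS;        // activity: poll fast again
  return Math.min(current * 2, MAX_MS); // idle: exponential backoff
}

// Idle progression starting from 1 second:
let interval = MIN_MS;
const progression = [];
for (let i = 0; i < 6; i++) {
  progression.push(interval);
  interval = nextInterval(interval, false);
}
console.log(progression); // [1000, 2000, 4000, 8000, 16000, 30000]
```

Plugging `nextInterval` into the client above would mean replacing `setInterval` with a chained `setTimeout`, since the delay now changes between polls.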

2. Long Polling: Server Holds Request Until Data Available

Long polling improves on short polling by having the server hold the request open until new data is available or a timeout occurs. This reduces wasted requests while providing near-instant updates.

Client: GET /api/messages/poll → (server holds connection...)
... 15 seconds of waiting ...
Server: Response: [msg1]        (new data arrived, respond immediately!)
Client: GET /api/messages/poll → (immediately reconnects, server holds...)
... 3 seconds later ...
Server: Response: [msg2]        (new data, respond!)
Client: GET /api/messages/poll → (reconnects again...)
... 30 seconds, no data ...
Server: Response: []             (timeout, respond empty)
Client: GET /api/messages/poll → (reconnects...)

Implementation

// ========== SERVER (Express) ==========
const express = require('express');
const app = express();

let messages = [];
let waitingClients = []; // Clients waiting for new data

app.get('/api/messages/poll', (req, res) => {
  const since = parseInt(req.query.since) || 0;

  // Check if there's already new data
  const newMessages = messages.filter(m => m.timestamp > since);
  if (newMessages.length > 0) {
    return res.json({ messages: newMessages, serverTime: Date.now() });
  }

  // No new data -- hold the connection open
  const client = { res, since };
  waitingClients.push(client);

  // Time out after 30 seconds so the connection can't hang forever.
  // Store the timer on the client so the POST handler can cancel it.
  client.timeout = setTimeout(() => {
    waitingClients = waitingClients.filter(c => c !== client);
    res.json({ messages: [], serverTime: Date.now() });
  }, 30000);

  // Clean up if the client disconnects
  req.on('close', () => {
    clearTimeout(client.timeout);
    waitingClients = waitingClients.filter(c => c !== client);
  });
});

app.post('/api/messages', express.json(), (req, res) => {
  const message = {
    id: messages.length + 1,
    text: req.body.text,
    user: req.body.user,
    timestamp: Date.now()
  };
  messages.push(message);

  // Notify ALL waiting clients immediately
  waitingClients.forEach(client => {
    clearTimeout(client.timeout); // cancel the 30s timer so it can't respond a second time
    const newMsgs = messages.filter(m => m.timestamp > client.since);
    client.res.json({ messages: newMsgs, serverTime: Date.now() });
  });
  waitingClients = []; // Clear the waiting list

  res.status(201).json(message);
});

app.listen(3000);
// ========== CLIENT ==========
let lastCheckTime = 0;

async function longPoll() {
  while (true) {
    try {
      const res = await fetch(`/api/messages/poll?since=${lastCheckTime}`);
      const data = await res.json();

      if (data.messages.length > 0) {
        data.messages.forEach(msg => displayMessage(msg));
      }

      lastCheckTime = data.serverTime;
      // Immediately reconnect (no delay needed)
    } catch (error) {
      console.error('Long poll error:', error);
      // Wait before retrying on error
      await new Promise(resolve => setTimeout(resolve, 3000));
    }
  }
}

longPoll();

Pros and Cons

Pros | Cons
--- | ---
Near-instant delivery | More complex server logic
Fewer wasted requests than short polling | Server holds many open connections
Works everywhere (just HTTP) | Connection timeout management needed
No special protocol needed | Still one-directional (client initiates)
Better latency than short polling | Memory overhead per waiting client

3. Server-Sent Events (SSE): One-Way Server to Client

SSE provides a persistent, one-way connection where the server can push updates to the client. The client uses the EventSource API, which handles reconnection automatically.

Client: GET /api/events (Accept: text/event-stream)
Server: (connection stays open)
Server: data: {"type": "message", "text": "Hello"}\n\n
Server: data: {"type": "notification", "count": 5}\n\n
... connection stays open, server pushes whenever ...

Implementation

// ========== SERVER (Express) ==========
const express = require('express');
const app = express();

let clients = [];

app.get('/api/events', (req, res) => {
  // Set SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.setHeader('Access-Control-Allow-Origin', '*');

  // Send initial connection confirmation
  res.write('data: {"type": "connected"}\n\n');

  // Add this client to the list
  const client = { id: Date.now(), res };
  clients.push(client);
  console.log(`Client connected. Total: ${clients.length}`);

  // Remove client on disconnect
  req.on('close', () => {
    clients = clients.filter(c => c.id !== client.id);
    console.log(`Client disconnected. Total: ${clients.length}`);
  });
});

// Helper to broadcast to all SSE clients
function broadcast(eventData) {
  clients.forEach(client => {
    client.res.write(`data: ${JSON.stringify(eventData)}\n\n`);
  });
}

// When a new message is posted, broadcast via SSE
app.post('/api/messages', express.json(), (req, res) => {
  const message = {
    type: 'new-message',
    text: req.body.text,
    user: req.body.user,
    timestamp: Date.now()
  };

  broadcast(message);
  res.status(201).json({ success: true });
});

app.listen(3000);
// ========== CLIENT ==========
// EventSource is a built-in browser API
const eventSource = new EventSource('/api/events');

eventSource.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log('Received:', data);

  if (data.type === 'new-message') {
    displayMessage(data);
  }
};

eventSource.onerror = (error) => {
  console.error('SSE error:', error);
  // EventSource automatically reconnects!
};

// Named events (optional)
eventSource.addEventListener('notification', (event) => {
  const data = JSON.parse(event.data);
  showNotification(data);
});

SSE Data Format

data: Simple text message\n\n

data: {"json": "works too"}\n\n

event: notification\n
data: {"title": "New follower"}\n\n

id: 42\n
data: Message with ID for reconnection\n\n

retry: 5000\n
data: Set reconnection interval to 5 seconds\n\n

Pros and Cons

Pros | Cons
--- | ---
Simple API (EventSource) | One-way only (server to client)
Automatic reconnection built in | Text-only (no binary data)
Event ID tracking for missed messages | Limited to ~6 connections per domain over HTTP/1.1
Works over standard HTTP | No IE support (polyfill available)
Lightweight, no extra libraries | Client cannot send data back over SSE

4. Performance and Resource Comparison

Metric | Short Polling | Long Polling | SSE | WebSocket
--- | --- | --- | --- | ---
Avg latency | ~half the interval | ~50-200 ms | ~10-50 ms | ~5-20 ms
Bandwidth waste | Very high | Medium | Low | Very low
Server connections | Low (brief) | High (held open) | High (persistent) | High (persistent)
Requests/min (100 users) | ~2000 (3 s interval) | ~100-2000 (depends on message rate) | ~0 (persistent) | ~0 (persistent)
Bidirectional? | No | No | No | Yes
Binary data? | Via encoding | Via encoding | No | Yes
Max concurrent (typical) | Unlimited | ~10K per server | ~10K per server | ~50K+ per server
Complexity | Very low | Medium | Low | Medium
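The bandwidth-waste row is easy to sanity-check with back-of-envelope arithmetic (the per-request header overhead is an assumed round number, not a measurement):

```javascript
// Rough cost of short polling at the table's scale: 100 users, 3-second interval.
const users = 100;
const intervalSec = 3;
const requestsPerMin = users * (60 / intervalSec);         // 2000 requests/min
const requestsPerDay = requestsPerMin * 60 * 24;           // 2,880,000 requests/day
const overheadBytes = 800;                                 // assumed HTTP header overhead per round trip
const wastedMBPerDay = (requestsPerDay * overheadBytes) / 1e6;
console.log(requestsPerMin, requestsPerDay, wastedMBPerDay); // 2000 2880000 2304
```

Over two gigabytes a day of mostly empty responses for 100 users is why the persistent approaches win as soon as updates are infrequent relative to the polling interval.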

5. Why Socket.io Exists

Raw WebSocket works, but real-world applications need more:

Problem with Raw WebSocket | Socket.io Solution
--- | ---
No automatic reconnection | Built-in reconnection with exponential backoff
No fallback if WS blocked | Falls back to long polling automatically
No rooms/groups concept | Built-in rooms and namespaces
No broadcasting helpers | io.emit(), socket.broadcast.emit()
No acknowledgements | Callback-based acks: emit('event', data, callback)
No middleware system | io.use() for auth, logging, etc.
Binary requires manual handling | Automatic binary detection and handling
No multiplexing | Namespaces for separate channels on one connection

// Raw WebSocket — manual everything
const ws = new WebSocket('ws://localhost:3000');
ws.onclose = () => {
  // Must manually implement reconnection
  setTimeout(() => {
    // reconnect logic...
  }, 1000);
};

// Socket.io — batteries included
const socket = io('http://localhost:3000', {
  reconnection: true,          // automatic!
  reconnectionAttempts: 5,
  reconnectionDelay: 1000,
  reconnectionDelayMax: 5000
});

Important: Socket.io is NOT a WebSocket implementation. It is a library that uses WebSocket as transport when available, but adds its own protocol layer on top. A Socket.io client cannot connect to a plain WebSocket server and vice versa.


6. Decision Flowchart: Which Approach to Use?

Do you need real-time updates?
├── No → Use a standard HTTP REST API
└── Yes
    └── Is data flow one-way (server → client only)?
        ├── Yes → Use SSE (Server-Sent Events)
        └── No (bidirectional needed)
            └── Are updates very frequent (multiple per second)?
                ├── Yes → Use WebSocket / Socket.io
                └── No (every few seconds)
                    └── Is simplicity the top priority?
                        ├── Yes → Long Polling
                        └── No → WebSocket / Socket.io
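The same decisions, encoded as a small function using the flowchart's own labels (a sketch; the flag names are our own):

```javascript
// Encode the decision flowchart as a function. All flags are booleans,
// defaulting to false when omitted.
function chooseTransport({ realtime, oneWay, veryFrequent, simplicityFirst } = {}) {
  if (!realtime) return 'HTTP REST API';
  if (oneWay) return 'SSE';
  if (veryFrequent) return 'WebSocket / Socket.io';
  return simplicityFirst ? 'Long Polling' : 'WebSocket / Socket.io';
}

console.log(chooseTransport({ realtime: true, oneWay: true }));       // SSE
console.log(chooseTransport({ realtime: true, veryFrequent: true })); // WebSocket / Socket.io
```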

Key Takeaways

  1. Short polling is simplest but wasteful -- constantly sends requests even when nothing has changed
  2. Long polling reduces wasted requests by holding connections open until data arrives
  3. SSE is ideal for one-way server-to-client updates with automatic reconnection
  4. WebSocket is the only option for true bidirectional real-time communication
  5. Socket.io wraps WebSocket with reconnection, fallbacks, rooms, and middleware
  6. Socket.io is NOT a WebSocket library -- it has its own protocol and they are not interchangeable
  7. Choose the simplest approach that meets your requirements

Explain-It Challenge

Scenario: You are building a live auction platform. Users can view current bids (read-heavy), place new bids (write occasionally), and see a countdown timer for each auction. The platform also needs to show "X users are watching this auction" in real-time.

For each feature listed, recommend the best communication approach (short polling, long polling, SSE, or WebSocket) and explain your reasoning. Consider the tradeoffs in bandwidth, latency, and implementation complexity for each.


<< Previous: 3.15.a — Understanding WebSockets | Next: 3.15.c — Socket.io Setup & Basics >>