Episode 3 — NodeJS MongoDB Backend Architecture / 3.15 — Realtime Communication WebSockets

3.15.b — HTTP Polling and Alternatives

Before WebSocket became the standard for real-time communication, developers relied on creative HTTP-based workarounds like short polling, long polling, and Server-Sent Events. Understanding these alternatives clarifies why WebSocket exists and when simpler approaches are sufficient.


<< Previous: 3.15.a — Understanding WebSockets | Next: 3.15.c — Socket.io Setup & Basics >>


1. Short Polling: Repeated HTTP Requests on an Interval

Short polling is the simplest approach to "real-time" updates. The client sends HTTP requests at regular intervals (e.g., every 3 seconds) to check for new data.

Client: GET /api/messages → Response: []         (nothing new)
... 3 seconds later ...
Client: GET /api/messages → Response: []         (nothing new)
... 3 seconds later ...
Client: GET /api/messages → Response: [msg1]     (got something!)
... 3 seconds later ...
Client: GET /api/messages → Response: []         (nothing new again)

Implementation

// ========== SERVER (Express) ==========
const express = require('express');
const app = express();

let messages = [];

app.get('/api/messages', (req, res) => {
  const since = parseInt(req.query.since) || 0;
  const newMessages = messages.filter(m => m.timestamp > since);
  res.json({ messages: newMessages, serverTime: Date.now() });
});

app.post('/api/messages', express.json(), (req, res) => {
  const message = {
    id: messages.length + 1,
    text: req.body.text,
    user: req.body.user,
    timestamp: Date.now()
  };
  messages.push(message);
  res.status(201).json(message);
});

app.listen(3000);
// ========== CLIENT ==========
let lastCheckTime = 0;

function pollForMessages() {
  setInterval(async () => {
    try {
      const res = await fetch(`/api/messages?since=${lastCheckTime}`);
      const data = await res.json();

      if (data.messages.length > 0) {
        data.messages.forEach(msg => displayMessage(msg));
      }

      lastCheckTime = data.serverTime;
    } catch (error) {
      console.error('Polling failed:', error);
    }
  }, 3000); // Poll every 3 seconds
}

pollForMessages();

Pros and Cons

Pros | Cons
--- | ---
Dead simple to implement | Wastes bandwidth (empty responses)
Works everywhere (just HTTP) | Latency = polling interval (up to 3 s delay)
Easy to debug | High server load (many requests)
No special server setup | Not truly real-time
Stateless server | Polling interval tradeoff: fast = expensive, slow = laggy
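One common middle ground for that last tradeoff is adaptive polling: poll quickly while data is flowing, and back off toward a ceiling while the endpoint is quiet. A sketch of just the interval logic (`MIN_MS` and `MAX_MS` are illustrative values, not part of the example above):

```javascript
// Adaptive polling interval: reset to the minimum on activity,
// double (up to a ceiling) while the endpoint stays quiet.
const MIN_MS = 1000;   // fastest poll rate (assumed value)
const MAX_MS = 30000;  // slowest poll rate (assumed value)

function nextInterval(current, gotNewData) {
  if (gotNewData) return MIN_MS;        // activity: poll fast again
  return Math.min(current * 2, MAX_MS); // idle: exponential backoff
}

// Idle progression starting from 1 second:
let interval = MIN_MS;
const progression = [];
for (let i = 0; i < 6; i++) {
  progression.push(interval);
  interval = nextInterval(interval, false);
}
console.log(progression); // [1000, 2000, 4000, 8000, 16000, 30000]
```

Plugging `nextInterval` into the client above would mean replacing `setInterval` with a chained `setTimeout`, since the delay now changes between polls.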

2. Long Polling: Server Holds Request Until Data Available

Long polling improves on short polling by having the server hold the request open until new data is available or a timeout occurs. This reduces wasted requests while providing near-instant updates.

Client: GET /api/messages/poll → (server holds connection...)
... 15 seconds of waiting ...
Server: Response: [msg1]        (new data arrived, respond immediately!)
Client: GET /api/messages/poll → (immediately reconnects, server holds...)
... 3 seconds later ...
Server: Response: [msg2]        (new data, respond!)
Client: GET /api/messages/poll → (reconnects again...)
... 30 seconds, no data ...
Server: Response: []             (timeout, respond empty)
Client: GET /api/messages/poll → (reconnects...)

Implementation

// ========== SERVER (Express) ==========
const express = require('express');
const app = express();

let messages = [];
let waitingClients = []; // Clients waiting for new data

app.get('/api/messages/poll', (req, res) => {
  const since = parseInt(req.query.since) || 0;

  // Check if there's already new data
  const newMessages = messages.filter(m => m.timestamp > since);
  if (newMessages.length > 0) {
    return res.json({ messages: newMessages, serverTime: Date.now() });
  }

  // No new data -- hold the connection open
  const client = { res, since };
  waitingClients.push(client);

  // Time out after 30 seconds so the connection can't hang forever.
  // Store the timer on the client so the POST handler can cancel it.
  client.timeout = setTimeout(() => {
    waitingClients = waitingClients.filter(c => c !== client);
    res.json({ messages: [], serverTime: Date.now() });
  }, 30000);

  // Clean up if the client disconnects
  req.on('close', () => {
    clearTimeout(client.timeout);
    waitingClients = waitingClients.filter(c => c !== client);
  });
});

app.post('/api/messages', express.json(), (req, res) => {
  const message = {
    id: messages.length + 1,
    text: req.body.text,
    user: req.body.user,
    timestamp: Date.now()
  };
  messages.push(message);

  // Notify ALL waiting clients immediately
  waitingClients.forEach(client => {
    clearTimeout(client.timeout); // cancel the 30s timer so it can't respond a second time
    const newMsgs = messages.filter(m => m.timestamp > client.since);
    client.res.json({ messages: newMsgs, serverTime: Date.now() });
  });
  waitingClients = []; // Clear the waiting list

  res.status(201).json(message);
});

app.listen(3000);
// ========== CLIENT ==========
let lastCheckTime = 0;

async function longPoll() {
  while (true) {
    try {
      const res = await fetch(`/api/messages/poll?since=${lastCheckTime}`);
      const data = await res.json();

      if (data.messages.length > 0) {
        data.messages.forEach(msg => displayMessage(msg));
      }

      lastCheckTime = data.serverTime;
      // Immediately reconnect (no delay needed)
    } catch (error) {
      console.error('Long poll error:', error);
      // Wait before retrying on error
      await new Promise(resolve => setTimeout(resolve, 3000));
    }
  }
}

longPoll();

Pros and Cons

Pros | Cons
--- | ---
Near-instant delivery | More complex server logic
Fewer wasted requests than short polling | Server holds many open connections
Works everywhere (just HTTP) | Connection timeout management needed
No special protocol needed | Still one-directional (client initiates)
Better latency than short polling | Memory overhead per waiting client

3. Server-Sent Events (SSE): One-Way Server to Client

SSE provides a persistent, one-way connection where the server can push updates to the client. The client uses the EventSource API, which handles reconnection automatically.

Client: GET /api/events (Accept: text/event-stream)
Server: (connection stays open)
Server: data: {"type": "message", "text": "Hello"}\n\n
Server: data: {"type": "notification", "count": 5}\n\n
... connection stays open, server pushes whenever ...

Implementation

// ========== SERVER (Express) ==========
const express = require('express');
const app = express();

let clients = [];

app.get('/api/events', (req, res) => {
  // Set SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.setHeader('Access-Control-Allow-Origin', '*');

  // Send initial connection confirmation
  res.write('data: {"type": "connected"}\n\n');

  // Add this client to the list
  const client = { id: Date.now(), res };
  clients.push(client);
  console.log(`Client connected. Total: ${clients.length}`);

  // Remove client on disconnect
  req.on('close', () => {
    clients = clients.filter(c => c.id !== client.id);
    console.log(`Client disconnected. Total: ${clients.length}`);
  });
});

// Helper to broadcast to all SSE clients
function broadcast(eventData) {
  clients.forEach(client => {
    client.res.write(`data: ${JSON.stringify(eventData)}\n\n`);
  });
}

// When a new message is posted, broadcast via SSE
app.post('/api/messages', express.json(), (req, res) => {
  const message = {
    type: 'new-message',
    text: req.body.text,
    user: req.body.user,
    timestamp: Date.now()
  };

  broadcast(message);
  res.status(201).json({ success: true });
});

app.listen(3000);
// ========== CLIENT ==========
// EventSource is a built-in browser API
const eventSource = new EventSource('/api/events');

eventSource.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log('Received:', data);

  if (data.type === 'new-message') {
    displayMessage(data);
  }
};

eventSource.onerror = (error) => {
  console.error('SSE error:', error);
  // EventSource automatically reconnects!
};

// Named events (optional)
eventSource.addEventListener('notification', (event) => {
  const data = JSON.parse(event.data);
  showNotification(data);
});

SSE Data Format

data: Simple text message\n\n

data: {"json": "works too"}\n\n

event: notification\n
data: {"title": "New follower"}\n\n

id: 42\n
data: Message with ID for reconnection\n\n

retry: 5000\n
data: Set reconnection interval to 5 seconds\n\n

Pros and Cons

Pros | Cons
--- | ---
Simple API (EventSource) | One-way only (server to client)
Automatic reconnection built in | Text-only (no binary data)
Event ID tracking for missed messages | Limited to ~6 connections per domain over HTTP/1.1
Works over standard HTTP | No IE support (polyfill available)
Lightweight, no extra libraries | Client cannot send data back over SSE

4. Performance and Resource Comparison

Metric | Short Polling | Long Polling | SSE | WebSocket
--- | --- | --- | --- | ---
Avg latency | ~half the interval | ~50-200 ms | ~10-50 ms | ~5-20 ms
Bandwidth waste | Very high | Medium | Low | Very low
Server connections | Low (brief) | High (held open) | High (persistent) | High (persistent)
Requests/min (100 users) | ~2000 (3 s interval) | ~100-2000 (depends on message rate) | ~0 (persistent) | ~0 (persistent)
Bidirectional? | No | No | No | Yes
Binary data? | Via encoding | Via encoding | No | Yes
Max concurrent (typical) | Unlimited | ~10K per server | ~10K per server | ~50K+ per server
Complexity | Very low | Medium | Low | Medium
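The bandwidth-waste row is easy to sanity-check with back-of-envelope arithmetic (the per-request header overhead is an assumed round number, not a measurement):

```javascript
// Rough cost of short polling at the table's scale: 100 users, 3-second interval.
const users = 100;
const intervalSec = 3;
const requestsPerMin = users * (60 / intervalSec);         // 2000 requests/min
const requestsPerDay = requestsPerMin * 60 * 24;           // 2,880,000 requests/day
const overheadBytes = 800;                                 // assumed HTTP header overhead per round trip
const wastedMBPerDay = (requestsPerDay * overheadBytes) / 1e6;
console.log(requestsPerMin, requestsPerDay, wastedMBPerDay); // 2000 2880000 2304
```

Over two gigabytes a day of mostly empty responses for 100 users is why the persistent approaches win as soon as updates are infrequent relative to the polling interval.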

5. Why Socket.io Exists

Raw WebSocket works, but real-world applications need more:

Problem with Raw WebSocket | Socket.io Solution
--- | ---
No automatic reconnection | Built-in reconnection with exponential backoff
No fallback if WS blocked | Falls back to long polling automatically
No rooms/groups concept | Built-in rooms and namespaces
No broadcasting helpers | io.emit(), socket.broadcast.emit()
No acknowledgements | Callback-based acks: emit('event', data, callback)
No middleware system | io.use() for auth, logging, etc.
Binary requires manual handling | Automatic binary detection and handling
No multiplexing | Namespaces for separate channels on one connection

// Raw WebSocket — manual everything
const ws = new WebSocket('ws://localhost:3000');
ws.onclose = () => {
  // Must manually implement reconnection
  setTimeout(() => {
    // reconnect logic...
  }, 1000);
};

// Socket.io — batteries included
const socket = io('http://localhost:3000', {
  reconnection: true,          // automatic!
  reconnectionAttempts: 5,
  reconnectionDelay: 1000,
  reconnectionDelayMax: 5000
});

Important: Socket.io is NOT a WebSocket implementation. It is a library that uses WebSocket as transport when available, but adds its own protocol layer on top. A Socket.io client cannot connect to a plain WebSocket server and vice versa.


6. Decision Flowchart: Which Approach to Use?

Do you need real-time updates?
├── No → Use a standard HTTP REST API
└── Yes
    └── Is data flow one-way (server → client only)?
        ├── Yes → Use SSE (Server-Sent Events)
        └── No (bidirectional needed)
            └── Are updates very frequent (multiple per second)?
                ├── Yes → Use WebSocket / Socket.io
                └── No (every few seconds)
                    └── Is simplicity the top priority?
                        ├── Yes → Long Polling
                        └── No → WebSocket / Socket.io
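The same decisions, encoded as a small function using the flowchart's own labels (a sketch; the flag names are our own):

```javascript
// Encode the decision flowchart as a function. All flags are booleans,
// defaulting to false when omitted.
function chooseTransport({ realtime, oneWay, veryFrequent, simplicityFirst } = {}) {
  if (!realtime) return 'HTTP REST API';
  if (oneWay) return 'SSE';
  if (veryFrequent) return 'WebSocket / Socket.io';
  return simplicityFirst ? 'Long Polling' : 'WebSocket / Socket.io';
}

console.log(chooseTransport({ realtime: true, oneWay: true }));       // SSE
console.log(chooseTransport({ realtime: true, veryFrequent: true })); // WebSocket / Socket.io
```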

Key Takeaways

  1. Short polling is simplest but wasteful -- constantly sends requests even when nothing has changed
  2. Long polling reduces wasted requests by holding connections open until data arrives
  3. SSE is ideal for one-way server-to-client updates with automatic reconnection
  4. WebSocket is the only option for true bidirectional real-time communication
  5. Socket.io wraps WebSocket with reconnection, fallbacks, rooms, and middleware
  6. Socket.io is NOT a WebSocket library -- it has its own protocol and they are not interchangeable
  7. Choose the simplest approach that meets your requirements

Explain-It Challenge

Scenario: You are building a live auction platform. Users can view current bids (read-heavy), place new bids (write occasionally), and see a countdown timer for each auction. The platform also needs to show "X users are watching this auction" in real-time.

For each feature listed, recommend the best communication approach (short polling, long polling, SSE, or WebSocket) and explain your reasoning. Consider the tradeoffs in bandwidth, latency, and implementation complexity for each.


<< Previous: 3.15.a — Understanding WebSockets | Next: 3.15.c — Socket.io Setup & Basics >>