1.3.a — What is TCP and Why Is It Widely Used?
In one sentence: TCP (Transmission Control Protocol) is the transport-layer protocol that delivers your data complete, in order, and checked for corruption (or reports an error when it cannot), which is why most web pages, email, and file transfers on the internet depend on it.
Navigation: ← 1.3 Overview · 1.3.b — TCP Three-Way Handshake →
Table of Contents
- 1. What is TCP?
- 2. History
- 3. Why TCP Exists — The Problem It Solves
- 4. Key Features of TCP
- 5. How TCP Guarantees Reliability
- 6. TCP Segment Structure
- 7. TCP Ports — Finding the Right Application
- 8. Flow Control and Congestion Control
- 9. TCP Connection Lifecycle
- 10. Why TCP is Widely Used
- 11. Where TCP is Used (Real-World Examples)
- 12. Disadvantages of TCP
- 13. Key Takeaways
- 14. Explain-It Challenge
1. What is TCP?
TCP stands for Transmission Control Protocol. It is one of the core protocols of the Internet Protocol Suite (TCP/IP) and operates at Layer 4 (Transport) of the OSI model.
TCP provides a reliable, ordered, error-checked stream of bytes between two applications running on different machines over a network. When you load a webpage, download a file, or send an email — TCP is the protocol making sure every byte gets there correctly.
| Property | What it means |
|---|---|
| Connection-oriented | Before any data flows, both sides establish a connection (the three-way handshake) |
| Reliable delivery | Lost packets are detected and retransmitted automatically |
| Ordered | Data arrives in the same order it was sent, regardless of how packets traveled |
| Error-checked | Every segment includes a checksum to catch corruption |
| Full-duplex | Both sides can send and receive data simultaneously |
| Stream-oriented | Applications see a continuous byte stream, not individual packets |
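To make the "stream-oriented" row concrete, here is a minimal sketch using Python's standard `socket` module over the loopback interface. The function names `run_echo_once` and `echo_roundtrip` are invented for this illustration. The client writes three separate messages, but the receiver only sees bytes: TCP does not preserve the boundaries between writes.

```python
import socket
import threading


def run_echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and echo bytes back until the peer stops sending."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:              # empty read: peer closed its sending side
                break
            conn.sendall(chunk)


def echo_roundtrip(messages: list[bytes]) -> bytes:
    """Send several messages over one TCP connection; return the echoed stream."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        worker = threading.Thread(target=run_echo_once, args=(server,))
        worker.start()

        received = bytearray()
        with socket.create_connection(("127.0.0.1", port)) as client:
            for message in messages:
                client.sendall(message)          # separate writes...
            client.shutdown(socket.SHUT_WR)      # "I have nothing more to send"
            while True:
                chunk = client.recv(1024)        # ...arrive as one byte stream
                if not chunk:
                    break
                received.extend(chunk)
        worker.join()
    return bytes(received)


if __name__ == "__main__":
    print(echo_roundtrip([b"Hello, ", b"TCP ", b"stream"]))
```

Because the receiver cannot tell where one write ended and the next began, application protocols built on TCP add their own framing (length prefixes, delimiters, and so on).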
2. History
| Year | Event |
|---|---|
| 1974 | Vint Cerf and Bob Kahn publish the original TCP concept in "A Protocol for Packet Network Intercommunication" |
| 1978 | TCP is split into TCP (reliability) and IP (routing) — creating the TCP/IP model we know today |
| 1981 | TCP formally specified in RFC 793 |
| 1983 | ARPANET switches to TCP/IP — the birthday of the modern internet |
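| 2022 | RFC 9293 consolidates and replaces RFC 793 as the TCP specification. TCP still carries most internet traffic, but HTTP/3 (RFC 9114), which runs over QUIC and UDP, becomes the first major version of HTTP that does not use TCP |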
TCP has been refined over decades with extensions for congestion control (Reno, Cubic, BBR), selective acknowledgments (SACK), window scaling, timestamps, and more — but the core design from 1981 remains remarkably intact.
3. Why TCP Exists — The Problem It Solves
The Internet Protocol (IP) delivers packets on a best-effort basis. That means IP:
- Does not guarantee delivery — packets can be dropped
- Does not guarantee order — packets can arrive out of sequence
- Does not guarantee integrity — packets can be corrupted
- Does not track connections — each packet is independent
For a webpage to work, you need every HTML tag, every CSS rule, every image byte to arrive correctly. If even one packet of your JavaScript file is missing, the code breaks.
THE PROBLEM TCP SOLVES
Without TCP (raw IP): With TCP:
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Sender │ │ Receiver │ │ Sender │ │ Receiver │
│ │ │ │ │ │ │ │
│ Pkt 1 ───┼──► │ Pkt 1 ✓ │ │ Pkt 1 ───┼──► │ Pkt 1 ✓ │
│ Pkt 2 ───┼──X │ (lost!) │ │ Pkt 2 ───┼──X │ (lost) │
│ Pkt 3 ───┼──► │ Pkt 3 ✓ │ │ │ │ "Where's │
│ │ │ │ │ │◄───┤ pkt 2?" │
│ Result: │ │ Pkt 1, 3 │ │ Pkt 2 ───┼──► │ Pkt 2 ✓ │
│ data is │ │ (broken!)│ │ │ │ │
│ incomplete│ │ │ │ Result: │ │ 1, 2, 3 │
└──────────┘ └──────────┘ │ complete!│ │ (perfect)│
└──────────┘ └──────────┘
TCP sits on top of IP and adds the reliability, ordering, and error-checking that IP lacks.
4. Key Features of TCP
4.1 Connection-oriented
TCP establishes a dedicated connection between sender and receiver before any data is exchanged. This is the three-way handshake (covered in detail in 1.3.b). Both sides agree: "We are now in a conversation."
4.2 Reliable delivery
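Every byte the sender transmits must be acknowledged by the receiver. Acknowledgments are cumulative, so a single ACK can cover several segments. If the sender doesn't receive an acknowledgment within a timeout, it retransmits the unacknowledged data. No data is silently lost: either it is delivered or the connection reports an error.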
4.3 Ordered delivery
Each byte in a TCP stream is assigned a sequence number. The receiver uses these numbers to reassemble data in the correct order, even if packets arrive out of sequence from the network.
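The reassembly step can be sketched as a small pure function. This is an illustration only, not how any real TCP stack is written: `reassemble` is a made-up name, and the sketch assumes segments never partially overlap. It delivers the longest in-order prefix and holds back anything that sits behind a gap.

```python
def reassemble(segments: list[tuple[int, bytes]], initial_seq: int = 0) -> bytes:
    """Return the bytes that can be handed to the application in order.

    Each segment is (sequence number of its first byte, payload).
    Assumption for this sketch: segments never partially overlap.
    """
    buffered = dict(segments)          # seq -> payload; exact duplicates collapse
    next_seq = initial_seq             # next byte the application is waiting for
    delivered = bytearray()
    while next_seq in buffered:
        payload = buffered.pop(next_seq)
        delivered.extend(payload)
        next_seq += len(payload)       # sequence numbers count bytes, not packets
    return bytes(delivered)


if __name__ == "__main__":
    # Segments arrive out of order but are delivered in order.
    print(reassemble([(6, b"World"), (1, b"Hello")], initial_seq=1))
    # A gap at seq=6 holds back the later segment at seq=11.
    print(reassemble([(1, b"Hello"), (11, b"!")], initial_seq=1))
```

The second call also shows the cost of ordering: data that has already arrived waits in the buffer until the missing bytes show up.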
4.4 Error detection
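Every TCP segment includes a 16-bit checksum computed over the header, the data, and a pseudo-header containing the source and destination IP addresses. The receiver verifies the checksum; if it doesn't match, the segment is silently discarded. Because no ACK is sent for it, the sender eventually retransmits.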
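The checksum arithmetic itself is the 16-bit ones' complement sum described in RFC 1071. The sketch below (function name `internet_checksum` is ours) computes it over an arbitrary byte string. It does not build the pseudo-header or a full segment, so treat it as the core calculation only.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]    # big-endian 16-bit word
    while total >> 16:                           # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    payload = bytes.fromhex("0001f203f4f5f6f7")  # example bytes from RFC 1071
    checksum = internet_checksum(payload)
    print(hex(checksum))
    # Receiver check: summing the data plus its checksum yields zero.
    print(internet_checksum(payload + checksum.to_bytes(2, "big")))
```

A result of zero on the receiver side means the check passed. The checksum is deliberately cheap, so it catches common bit errors but not every possible corruption.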
4.5 Flow control
TCP uses a sliding window mechanism. The receiver tells the sender: "I have this much buffer space available" (the receive window). The sender never overwhelms the receiver by sending more than it can handle.
4.6 Congestion control
TCP monitors the network for signs of congestion (packet loss, increasing RTT) and adjusts its sending rate to avoid making congestion worse. Algorithms like Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery (plus modern algorithms like BBR) manage this.
5. How TCP Guarantees Reliability
The acknowledgment and retransmission cycle
SENDER RECEIVER
│ │
│ Segment 1 (seq=1, data="Hello") ──────►│
│ │ ✓ received
│◄────────── ACK (ack=6, "got up to 6") ──│
│ │
│ Segment 2 (seq=6, data="World") ──────►│
│ ╳ │ lost in network!
│ │
│ ... timeout, no ACK received ... │
│ │
│ Segment 2 (seq=6, RETRANSMIT) ────────►│
│ │ ✓ received
│◄────────── ACK (ack=11) ─────────────────│
│ │
│ Both sides know all data was delivered │
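The exchange above can be replayed with a tiny deterministic simulation. This is a stop-and-wait simplification under stated assumptions: real TCP keeps many segments in flight, and here only data segments (never ACKs) are lost. `send_reliably` and the `drop_attempts` parameter are hypothetical names for this sketch.

```python
def send_reliably(chunks, drop_attempts, first_seq=1, max_tries=5):
    """Stop-and-wait sender over a simulated lossy link.

    drop_attempts: set of (seq, attempt) pairs that the link loses.
    Returns (bytes delivered to the receiver, event log).
    """
    delivered = bytearray()
    log = []
    seq = first_seq
    for chunk in chunks:
        for attempt in range(1, max_tries + 1):
            if (seq, attempt) in drop_attempts:
                # No ACK comes back, so the sender's timer fires and it retries.
                log.append(f"seq={seq} attempt={attempt}: lost, timeout")
                continue
            delivered.extend(chunk)
            ack = seq + len(chunk)               # ACK names the next expected byte
            log.append(f"seq={seq} attempt={attempt}: delivered, ack={ack}")
            seq = ack
            break
        else:
            raise TimeoutError(f"giving up on seq={seq}")
    return bytes(delivered), log


if __name__ == "__main__":
    data, events = send_reliably([b"Hello", b"World"], drop_attempts={(6, 1)})
    print(data)
    for line in events:
        print(line)
```

Dropping the first attempt at `seq=6` reproduces the diagram: one timeout, one retransmission, and a final `ack=11`.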
Selective Acknowledgments (SACK)
Without SACK, if packet 3 out of packets 1–5 is lost, the receiver can only say "I got up to 2." With SACK (a TCP extension), the receiver says: "I got 1, 2, 4, 5 — only retransmit 3." This is much more efficient.
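As a rough sketch of what the receiver reports, the function below takes the set of packet numbers received so far and returns the cumulative ACK plus any SACK blocks. It uses packet numbers to match the example above; real SACK blocks are byte ranges carried in a TCP option. The name `ack_state` is invented for this illustration.

```python
def ack_state(received: set[int]) -> tuple[int, list[tuple[int, int]]]:
    """Return (next expected packet, SACK blocks) for packets numbered from 1."""
    next_expected = 1
    while next_expected in received:             # cumulative ACK stops at the first gap
        next_expected += 1
    blocks: list[tuple[int, int]] = []
    for n in sorted(p for p in received if p > next_expected):
        if blocks and blocks[-1][1] == n - 1:
            blocks[-1] = (blocks[-1][0], n)      # extend the current contiguous block
        else:
            blocks.append((n, n))                # start a new block after a gap
    return next_expected, blocks


if __name__ == "__main__":
    # Packets 1, 2, 4, 5 arrived; 3 is missing.
    print(ack_state({1, 2, 4, 5}))
```

Without SACK the sender only learns the first element (the next expected packet); with SACK it also learns that 4 and 5 are safe and only 3 needs resending.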
6. TCP Segment Structure
Every TCP segment has a header (minimum 20 bytes) followed by the data payload:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
┌─────────────────────────┬─────────────────────────┐
│ Source Port (16) │ Destination Port (16) │
├─────────────────────────┴─────────────────────────┤
│ Sequence Number (32) │
├───────────────────────────────────────────────────┤
│ Acknowledgment Number (32) │
├──────┬──────┬─────────────┬───────────────────────┤
│Offset│Reserv│ Flags (8) │ Window Size (16) │
│ (4) │ (4) │(CWR,ECE,URG,│ │
│ │ │ ACK,PSH,RST,│ │
│ │ │ SYN,FIN) │ │
├──────┴──────┴─────────────┼───────────────────────┤
│ Checksum (16) │ Urgent Pointer (16) │
├───────────────────────────┴───────────────────────┤
│ Options (if any) │
├───────────────────────────────────────────────────┤
│ DATA │
└───────────────────────────────────────────────────┘
| Field | Purpose |
|---|---|
| Source Port | Which application on the sender |
| Destination Port | Which application on the receiver |
| Sequence Number | Position of the first byte of this segment's data in the overall stream |
| Acknowledgment Number | Next byte the receiver expects (confirms receipt of everything before) |
| Flags | Control bits: SYN (start), ACK (acknowledge), FIN (finish), RST (reset), PSH (push), URG (urgent), plus CWR and ECE for explicit congestion notification |
| Window Size | How many bytes the receiver can accept (flow control) |
| Checksum | Error detection for header + data |
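To show how the fixed 20-byte header maps onto these fields, here is a small parser built on Python's `struct` module. It is a sketch that ignores options, and `parse_tcp_header` is a name chosen for this example. The flag bit positions follow the current TCP specification (FIN is the lowest bit, CWR the highest).

```python
import struct

# Ports (2+2), seq (4), ack (4), offset/reserved (1), flags (1),
# window (2), checksum (2), urgent pointer (2) = 20 bytes, network byte order.
TCP_HEADER = struct.Struct("!HHIIBBHHH")
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR"]  # bit 0 upward


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed part of a TCP header (options are not parsed)."""
    (src, dst, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = TCP_HEADER.unpack(raw[:TCP_HEADER.size])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (offset_byte >> 4) * 4,    # data offset counts 32-bit words
        "flags": [name for bit, name in enumerate(FLAG_NAMES)
                  if flag_byte & (1 << bit)],
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


if __name__ == "__main__":
    # A hand-built SYN from an ephemeral port to port 443 (values are made up).
    syn = TCP_HEADER.pack(52347, 443, 1000, 0, 5 << 4, 0x02, 65535, 0, 0)
    print(parse_tcp_header(syn))
```

A data offset of 5 means five 32-bit words, which is the 20-byte minimum; larger values indicate that options follow.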
7. TCP Ports — Finding the Right Application
A single server can run many services (web, email, SSH, database). Ports distinguish them. An IP address finds the machine; a port finds the application on that machine.
Server IP: 93.184.216.34
┌─────────────────────────────────────────────────┐
│ Port 80 → HTTP (web server) │
│ Port 443 → HTTPS (web server + TLS) │
│ Port 22 → SSH (remote access) │
│ Port 25 → SMTP (email sending) │
│ Port 5432 → PostgreSQL (database) │
│ Port 3000 → Your dev server (custom) │
└─────────────────────────────────────────────────┘
| Port range | Category |
|---|---|
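| 0–1023 | Well-known ports (HTTP, HTTPS, SSH, DNS, etc.); binding to them usually requires elevated privileges on Unix-like systems |
| 1024–49151 | Registered ports (databases, app servers) |
| 49152–65535 | Dynamic / ephemeral: assigned temporarily for client-side connections (this is the IANA range; some operating systems use a different default, e.g. Linux typically starts at 32768) |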
When your browser connects to a server on port 443, the browser itself uses a random ephemeral port (e.g. 52347) on your machine. The connection is identified by the 4-tuple: (source IP, source port, destination IP, destination port).
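The 4-tuple is easy to observe with the standard `socket` module. The sketch below opens a throwaway listener on the loopback interface, connects to it, and reads both endpoints of the connection. `connection_four_tuple` is a name made up for this example, and the exact ephemeral port you see depends on your operating system.

```python
import socket


def connection_four_tuple() -> tuple[str, int, str, int]:
    """Open a loopback TCP connection and return (src IP, src port, dst IP, dst port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))            # port 0: OS picks a free port
        server.listen(1)
        server_port = server.getsockname()[1]
        with socket.create_connection(("127.0.0.1", server_port)) as client:
            src_ip, src_port = client.getsockname()   # our side: ephemeral port
            dst_ip, dst_port = client.getpeername()   # their side: listening port
            return src_ip, src_port, dst_ip, dst_port


if __name__ == "__main__":
    print(connection_four_tuple())
```

Running it twice usually shows a different source port each time, which is how one machine can hold many simultaneous connections to the same server port.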
8. Flow Control and Congestion Control
Flow control (receiver-side)
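Prevents the sender from overwhelming the receiver's buffer. The receiver advertises its free buffer space in the Window Size field of every segment it sends, so the sender never has to ask; the question-and-answer exchange below is a simplification.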
SENDER RECEIVER
│ │
│ "How much can you handle?" │
│ ─────────────────────────────────► │
│ │
│ "My window is 64KB" │
│ ◄───────────────────────────────── │
│ │
│ Sends up to 64KB of data │
│ ─────────────────────────────────► │
│ │
│ "Processed some. Window now 32KB" │
│ ◄───────────────────────────────── │
│ │
│ Sends up to 32KB more │
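The sender-side rule behind this exchange fits in one function. This sketch considers the receive window only; a real sender also caps itself by the congestion window. `sendable_bytes` and its parameter names are illustrative.

```python
KB = 1024


def sendable_bytes(advertised_window: int, bytes_in_flight: int, bytes_queued: int) -> int:
    """How many more bytes the sender may transmit right now.

    advertised_window: latest receive window announced by the receiver
    bytes_in_flight:   bytes sent but not yet acknowledged
    bytes_queued:      bytes the application has handed over but TCP has not sent
    """
    free_window = max(advertised_window - bytes_in_flight, 0)
    return min(free_window, bytes_queued)


if __name__ == "__main__":
    print(sendable_bytes(64 * KB, 0, 100 * KB))   # first burst fills the 64KB window
    print(sendable_bytes(32 * KB, 0, 36 * KB))    # window shrank to 32KB
    print(sendable_bytes(0, 0, 4 * KB))           # zero window: sender must wait
```

When the advertised window reaches zero the sender stops entirely and waits for the receiver to announce free space again.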
Congestion control (network-side)
Prevents the sender from overwhelming the network itself.
| Phase | What happens |
|---|---|
| Slow Start | Start small (typically 1–10 segments); double the congestion window each RTT until it reaches the slow-start threshold |
| Congestion Avoidance | After the threshold, grow the window linearly (additive increase) |
| Loss detected | Packet loss signals congestion; cut the window (multiplicative decrease) |
| Fast Retransmit | If 3 duplicate ACKs arrive, retransmit the missing segment immediately (don't wait for timeout) |
Modern algorithms like BBR (developed by Google) use bandwidth estimation and RTT measurement rather than relying solely on packet loss as a congestion signal.
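The phases in the table can be traced with a simplified Reno-style model. This is a teaching sketch, not a faithful implementation: it works in whole segments per RTT, treats every loss as detected by fast retransmit, and the initial window of 10 and threshold of 64 are assumed values. `cwnd_trace` is a hypothetical name.

```python
def cwnd_trace(rtts: int, loss_at: set[int],
               initial_cwnd: int = 10, ssthresh: int = 64) -> list[int]:
    """Congestion window (in segments) at the start of each RTT."""
    cwnd = initial_cwnd
    trace = []
    for rtt in range(rtts):
        trace.append(cwnd)
        if rtt in loss_at:
            ssthresh = max(cwnd // 2, 2)         # multiplicative decrease
            cwnd = ssthresh                      # resume from half, skip slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)       # slow start: double per RTT
        else:
            cwnd += 1                            # congestion avoidance: +1 per RTT
    return trace


if __name__ == "__main__":
    # Loss is detected during RTT 5.
    print(cwnd_trace(8, loss_at={5}))
```

The trace shows the characteristic sawtooth: fast exponential growth, slow linear probing, then a sharp cut when loss appears.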
9. TCP Connection Lifecycle
Every TCP connection goes through three phases:
┌────────────────────┐
│ 1. ESTABLISHMENT │ Three-way handshake (SYN → SYN-ACK → ACK)
│ (open) │ See 1.3.b for details
└────────┬───────────┘
│
▼
┌────────────────────┐
│ 2. DATA TRANSFER │ Send/receive data with sequence numbers,
│ (open) │ ACKs, flow control, congestion control
└────────┬───────────┘
│
▼
┌────────────────────┐
│ 3. TERMINATION │ Four-way handshake (FIN → ACK → FIN → ACK)
│ (close) │ or RST for abrupt close
└────────────────────┘
Connection termination (four-way close)
CLIENT SERVER
│ │
│ FIN ──────────────────────────────► │ "I'm done sending"
│ │
│ ◄────────────────────────────── ACK │ "Got it"
│ │
│ ◄────────────────────────────── FIN │ "I'm done too"
│ │
│ ACK ──────────────────────────────► │ "Confirmed"
│ │
│ Connection fully closed │
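Each side closes its own sending direction independently: a FIN means "I have nothing more to send," not "stop talking to me." The connection is fully closed only after both FINs have been sent and acknowledged, which ensures all remaining data in transit is delivered before the connection shuts down.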
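The four-way close can also be read as a pair of small state machines, one per side. The state names below come from the TCP specification; the table is a partial sketch that leaves out simultaneous close and RST handling, and `walk` is a helper invented for this example.

```python
# (current state, event) -> next state, for a normal close only.
CLOSE_TRANSITIONS = {
    # Side that closes first (the client in the diagram above)
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",      # replies with the final ACK
    ("TIME_WAIT", "timer expires"): "CLOSED",
    # Side that closes second (the server in the diagram above)
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",    # replies with an ACK
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def walk(state: str, events: list[str]) -> list[str]:
    """Return every state visited, starting from `state`."""
    visited = [state]
    for event in events:
        state = CLOSE_TRANSITIONS[(state, event)]  # KeyError on an invalid event
        visited.append(state)
    return visited


if __name__ == "__main__":
    print(walk("ESTABLISHED", ["send FIN", "recv ACK", "recv FIN", "timer expires"]))
    print(walk("ESTABLISHED", ["recv FIN", "send FIN", "recv ACK"]))
```

The side that closes first lingers in `TIME_WAIT` for a while so that a delayed or retransmitted segment from the old connection cannot be mistaken for part of a new one.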
10. Why TCP is Widely Used
| Reason | Explanation |
|---|---|
| Guaranteed delivery | No silent data loss — critical for web pages, APIs, files, email |
| Correct ordering | HTML must be parsed in order; a file downloaded out of order is corrupt |
| Battle-tested | 40+ years of deployment; every mainstream operating system and programming language supports it |
| Application simplicity | Developers write to a stream (like writing to a file); TCP handles all the messy packet stuff |
| Works everywhere | Firewalls, NATs, and networks are all designed to handle TCP traffic |
| Built-in congestion control | TCP doesn't destroy the network under load; it backs off intelligently |
11. Where TCP is Used (Real-World Examples)
| Application | Why TCP |
|---|---|
| Web browsing (HTTP/1.1, HTTP/2) | Every HTML tag, CSS rule, and script byte must arrive correctly |
| Email (SMTP, IMAP, POP3) | Messages must be complete and unaltered |
| File transfer (FTP, SFTP, SCP) | A corrupt file is useless |
| SSH (remote terminal) | Every keystroke and output must arrive in order |
| Database connections (PostgreSQL, MySQL) | Queries and results must be complete and accurate |
| APIs (REST, GraphQL, gRPC) | Request/response integrity is non-negotiable |
| Git operations (push, pull, clone) | Source code must transfer perfectly |
12. Disadvantages of TCP
| Disadvantage | Explanation |
|---|---|
| Higher latency | Connection setup (handshake) and acknowledgments add round trips |
| Head-of-line blocking | If one packet in a stream is lost, all subsequent data waits until it's retransmitted — even if later packets already arrived |
| Overhead | 20+ byte header per segment; ACKs consume bandwidth; state tracking uses memory |
| Not ideal for real-time | Retransmissions can cause jitter (irregular timing) — bad for live audio/video |
| Slow start penalty | New connections start with a small window; takes several RTTs to reach full speed |
These limitations are why UDP is the better fit for latency-sensitive traffic, and why QUIC (the transport underneath HTTP/3) was built on UDP to avoid TCP's head-of-line blocking at the transport layer.
13. Key Takeaways
- TCP is a connection-oriented, reliable, ordered transport protocol that sits on top of IP.
- It guarantees delivery through sequence numbers, acknowledgments, and retransmissions.
- Flow control (window size) prevents overwhelming the receiver; congestion control prevents overwhelming the network.
- TCP is used by HTTP, email, file transfer, SSH, databases — anything where missing or corrupted data is unacceptable.
- The trade-off: TCP adds latency (handshakes, retransmissions, head-of-line blocking) compared to UDP.
- Every TCP connection has three phases: establish (handshake), transfer (data), terminate (close).
14. Explain-It Challenge
Without looking back, explain in your own words:
- What does "connection-oriented" mean, and why does TCP need it?
- How does TCP know a packet was lost, and what does it do about it?
- What is flow control and how is it different from congestion control?
- Why is TCP a bad choice for live video streaming?
- Name four real-world applications that rely on TCP and explain why each one needs reliability.