Episode 1 — Fundamentals / 1.3 — Internet Protocols

1.3.a — What is TCP and Why Is It Widely Used?

In one sentence: TCP (Transmission Control Protocol) is the transport-layer protocol that guarantees your data arrives complete, in order, and without corruption — which is why virtually every webpage, email, and file transfer on the internet depends on it.

Navigation: ← 1.3 Overview · 1.3.b — TCP Three-Way Handshake →


1. What is TCP?

TCP stands for Transmission Control Protocol. It is one of the core protocols of the Internet Protocol Suite (TCP/IP) and operates at Layer 4 (Transport) of the OSI model.

TCP provides a reliable, ordered, error-checked stream of bytes between two applications running on different machines over a network. When you load a webpage, download a file, or send an email — TCP is the protocol making sure every byte gets there correctly.

| Property | What it means |
| --- | --- |
| Connection-oriented | Before any data flows, both sides establish a connection (the three-way handshake) |
| Reliable delivery | Lost packets are detected and retransmitted automatically |
| Ordered | Data arrives in the same order it was sent, regardless of how packets traveled |
| Error-checked | Every segment includes a checksum to catch corruption |
| Full-duplex | Both sides can send and receive data simultaneously |
| Stream-oriented | Applications see a continuous byte stream, not individual packets |
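
Because TCP hands the application a byte stream rather than discrete messages, one "message" may arrive split across several reads, or glued to the next one. The sketch below shows one common way applications cope: length-prefix framing. This is an illustrative assumption, not something TCP itself provides, and `frame` and `unframe` are hypothetical helper names.

```python
import struct

def frame(message: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(message)) + message

def unframe(stream: bytes):
    """Split a received byte stream into complete messages plus leftover bytes."""
    messages, pos = [], 0
    while pos + 4 <= len(stream):
        (length,) = struct.unpack_from("!I", stream, pos)
        if pos + 4 + length > len(stream):
            break                              # incomplete message: wait for more bytes
        messages.append(stream[pos + 4 : pos + 4 + length])
        pos += 4 + length
    return messages, stream[pos:]              # leftover bytes stay buffered

stream = frame(b"GET /") + frame(b"GET /about")
print(unframe(stream[:7]))   # ([], first 7 bytes): nothing complete yet
print(unframe(stream))       # ([b'GET /', b'GET /about'], b'')
```

The first call models a read that returned only part of the stream; nothing is lost, the bytes simply wait in the buffer until the rest arrives.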

2. History

| Year | Event |
| --- | --- |
| 1974 | Vint Cerf and Bob Kahn publish the original TCP concept in "A Protocol for Packet Network Intercommunication" |
| 1978 | TCP is split into TCP (reliability) and IP (routing), creating the TCP/IP model we know today |
| 1981 | TCP formally specified in RFC 793 |
| 1983 | ARPANET switches to TCP/IP, often called the birthday of the modern internet |
| 2022 | TCP remains the foundation of most internet traffic; HTTP/3, which runs over QUIC on top of UDP, is the first major web protocol to move away from TCP |

TCP has been refined over decades with extensions for congestion control (Reno, Cubic, BBR), selective acknowledgments (SACK), window scaling, timestamps, and more — but the core design from 1981 remains remarkably intact.


3. Why TCP Exists — The Problem It Solves

The Internet Protocol (IP) delivers packets on a best-effort basis. That means IP:

  • Does not guarantee delivery — packets can be dropped
  • Does not guarantee order — packets can arrive out of sequence
  • Does not guarantee integrity — packets can be corrupted
  • Does not track connections — each packet is independent

For a webpage to work, you need every HTML tag, every CSS rule, every image byte to arrive correctly. If even one packet of your JavaScript file is missing, the code breaks.

                    THE PROBLEM TCP SOLVES

   Without TCP (raw IP):                 With TCP:
   ┌──────────┐    ┌──────────┐         ┌──────────┐    ┌──────────┐
   │ Sender   │    │ Receiver │         │ Sender   │    │ Receiver │
   │          │    │          │         │          │    │          │
   │ Pkt 1 ───┼──► │ Pkt 1 ✓  │         │ Pkt 1 ───┼──► │ Pkt 1 ✓  │
   │ Pkt 2 ───┼──X │ (lost!)  │         │ Pkt 2 ───┼──X │ (lost)   │
   │ Pkt 3 ───┼──► │ Pkt 3 ✓  │         │          │    │ "Where's │
   │          │    │          │         │          │◄───┤  pkt 2?" │
   │ Result:  │    │ Pkt 1, 3 │         │ Pkt 2 ───┼──► │ Pkt 2 ✓  │
   │ data is  │    │ (broken!)│         │          │    │          │
   │incomplete│    │          │         │ Result:  │    │ 1, 2, 3  │
   └──────────┘    └──────────┘         │ complete!│    │ (perfect)│
                                        └──────────┘    └──────────┘

TCP sits on top of IP and adds the reliability, ordering, and error-checking that IP lacks.


4. Key Features of TCP

4.1 Connection-oriented

TCP establishes a dedicated connection between sender and receiver before any data is exchanged. This is the three-way handshake (covered in detail in 1.3.b). Both sides agree: "We are now in a conversation."

4.2 Reliable delivery

Every segment the sender transmits must be acknowledged by the receiver. If the sender doesn't receive an acknowledgment within a timeout, it retransmits the segment. No data is silently lost.

4.3 Ordered delivery

Each byte in a TCP stream is assigned a sequence number. The receiver uses these numbers to reassemble data in the correct order, even if packets arrive out of sequence from the network.
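
As a rough illustration of how sequence numbers make reordering possible, the sketch below rebuilds a stream from segments that arrive out of order. It is a simplification: `reassemble` is a hypothetical helper, segments are assumed not to overlap, and the lowest sequence number is assumed to mark the start of the stream.

```python
def reassemble(segments):
    """Rebuild a byte stream from (seq, payload) pairs that may arrive in any order."""
    stream = bytearray()
    pending = dict(segments)                    # seq -> payload, buffered until needed
    next_seq = min(pending)                     # assumption: lowest seq starts the stream
    while next_seq in pending:
        payload = pending.pop(next_seq)
        stream += payload
        next_seq += len(payload)                # sequence numbers count bytes, not packets
    return bytes(stream), next_seq

arrived = [(11, b"!"), (1, b"Hello"), (6, b"World")]
print(reassemble(arrived))   # (b'HelloWorld!', 12)
```

If a segment is missing, the loop stops at the gap, and everything after it stays buffered until the missing bytes show up.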

4.4 Error detection

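Every TCP segment includes a 16-bit checksum computed over the TCP header, the data, and a pseudo-header containing the source and destination IP addresses. The receiver verifies the checksum; if it doesn't match, the segment is silently discarded, no ACK is sent for it, and the sender eventually retransmits it.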
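
The arithmetic behind the checksum is the 16-bit ones' complement sum described in RFC 1071. The sketch below applies it to a plain byte string only; a real TCP stack also feeds in the header fields and IP addresses, which are left out here for brevity. `internet_checksum` is a hypothetical helper name.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

segment = b"Hello, TCP"                      # even length keeps the appended checksum word-aligned
checksum = internet_checksum(segment)

# The receiver sums the data together with the transmitted checksum.
# An intact segment yields 0; a corrupted one does not.
print(internet_checksum(segment + checksum.to_bytes(2, "big")))         # 0
print(internet_checksum(b"Hellp, TCP" + checksum.to_bytes(2, "big")))   # non-zero
```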

4.5 Flow control

TCP uses a sliding window mechanism. The receiver tells the sender: "I have this much buffer space available" (the receive window). The sender never overwhelms the receiver by sending more than it can handle.

4.6 Congestion control

TCP monitors the network for signs of congestion (packet loss, increasing RTT) and adjusts its sending rate to avoid making congestion worse. Algorithms like Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery (plus modern algorithms like BBR) manage this.


5. How TCP Guarantees Reliability

The acknowledgment and retransmission cycle

  SENDER                                    RECEIVER
    │                                          │
    │  Segment 1 (seq=1, data="Hello") ──────►│
    │                                          │  ✓ received
    │◄────────── ACK (ack=6, "got up to 6") ──│
    │                                          │
    │  Segment 2 (seq=6, data="World") ──────►│
    │                                   ╳      │  lost in network!
    │                                          │
    │  ... timeout, no ACK received ...        │
    │                                          │
    │  Segment 2 (seq=6, RETRANSMIT) ────────►│
    │                                          │  ✓ received
    │◄────────── ACK (ack=11) ─────────────────│
    │                                          │
    │  Both sides know all data was delivered   │
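
The exchange in the diagram can be replayed with a small, deterministic simulation. This is a stop-and-wait sketch (one segment in flight at a time), which is simpler than real TCP, where many segments are in flight at once. `send_reliably` and its `lose_first_attempt_of` parameter are hypothetical names used only for illustration.

```python
def send_reliably(chunks, lose_first_attempt_of=frozenset(), start_seq=1):
    """Return a log of events for sending each chunk and waiting for its ACK."""
    log, seq = [], start_seq
    for index, chunk in enumerate(chunks):
        if index in lose_first_attempt_of:
            # No ACK arrives before the retransmission timer fires.
            log.append(f"seq={seq} sent, lost, timeout")
            log.append(f"seq={seq} retransmitted")
        else:
            log.append(f"seq={seq} sent")
        ack = seq + len(chunk)               # the ACK names the next byte expected
        log.append(f"ack={ack} received")
        seq = ack
    return log

for event in send_reliably([b"Hello", b"World"], lose_first_attempt_of={1}):
    print(event)
# seq=1 sent
# ack=6 received
# seq=6 sent, lost, timeout
# seq=6 retransmitted
# ack=11 received
```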

Selective Acknowledgments (SACK)

Without SACK, if packet 3 out of packets 1–5 is lost, the receiver can only say "I got up to 2." With SACK (a TCP extension), the receiver says: "I got 1, 2, 4, 5 — only retransmit 3." This is much more efficient.
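
To make the difference concrete, the sketch below computes what a receiver could report for a given set of arrived packets: the cumulative ACK (the next packet it still needs) and the extra ranges a SACK-capable receiver could add. It uses whole packet numbers to match the example above, whereas real TCP reports byte ranges. `ack_info` is a hypothetical helper.

```python
def ack_info(received, first_seq=1):
    """Return (cumulative_ack, sack_blocks) for a set of received packet numbers."""
    cumulative = first_seq
    while cumulative in received:
        cumulative += 1                       # next packet the receiver still needs
    blocks, start, prev = [], None, None
    for n in sorted(p for p in received if p > cumulative):
        if start is None:
            start = prev = n
        elif n == prev + 1:
            prev = n                          # extend the current contiguous block
        else:
            blocks.append((start, prev))
            start = prev = n
    if start is not None:
        blocks.append((start, prev))
    return cumulative, blocks

print(ack_info({1, 2, 4, 5}))   # (3, [(4, 5)])
```

Without SACK only the first value is available to the sender, so it cannot tell that packets 4 and 5 already arrived.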


6. TCP Segment Structure

Every TCP segment has a header (minimum 20 bytes) followed by the data payload:

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 ┌───────────────────────────────┬───────────────────────────────┐
 │       Source Port (16)        │     Destination Port (16)     │
 ├───────────────────────────────┴───────────────────────────────┤
 │                     Sequence Number (32)                      │
 ├───────────────────────────────────────────────────────────────┤
 │                  Acknowledgment Number (32)                   │
 ├───────┬───────┬───────────────┬───────────────────────────────┤
 │Offset │ Rsrvd │   Flags (8)   │       Window Size (16)        │
 │  (4)  │  (4)  │               │                               │
 ├───────┴───────┴───────────────┼───────────────────────────────┤
 │         Checksum (16)         │      Urgent Pointer (16)      │
 ├───────────────────────────────┴───────────────────────────────┤
 │                       Options (if any)                        │
 ├───────────────────────────────────────────────────────────────┤
 │                             DATA                              │
 └───────────────────────────────────────────────────────────────┘

| Field | Purpose |
| --- | --- |
| Source Port | Which application on the sender |
| Destination Port | Which application on the receiver |
| Sequence Number | Position of the first byte of this segment's data in the overall stream |
| Acknowledgment Number | Next byte the receiver expects (confirms receipt of everything before) |
| Offset | Length of the header in 32-bit words (5 for a 20-byte header without options) |
| Flags | Control bits: SYN (start), ACK (acknowledge), FIN (finish), RST (reset), PSH (push), URG (urgent), plus CWR and ECE for explicit congestion notification |
| Window Size | How many bytes the receiver can accept (flow control) |
| Checksum | Error detection for header + data |
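
For a feel of how these fields sit in the 20-byte fixed header, here is a small sketch that packs and unpacks one with Python's `struct` module. It is for illustration only: the checksum and urgent pointer are left at zero, options are not handled, and `build_header` and `parse_header` are hypothetical helpers rather than part of any library.

```python
import struct

TCP_HEADER = struct.Struct("!HHIIHHHH")   # network byte order, 20 bytes without options
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def build_header(src_port, dst_port, seq, ack, flags, window):
    offset_words = 5                                   # 5 x 32-bit words = 20 bytes
    flag_value = sum(FLAG_BITS[name] for name in flags)
    offset_and_flags = (offset_words << 12) | flag_value
    # Checksum and urgent pointer are left as 0 in this sketch.
    return TCP_HEADER.pack(src_port, dst_port, seq, ack, offset_and_flags, window, 0, 0)

def parse_header(raw):
    src, dst, seq, ack, offset_and_flags, window, _checksum, _urgent = TCP_HEADER.unpack(raw[:20])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_bytes": (offset_and_flags >> 12) * 4,
        "flags": sorted(n for n, bit in FLAG_BITS.items() if offset_and_flags & bit),
        "window": window,
    }

header = build_header(52347, 443, seq=1000, ack=0, flags={"SYN"}, window=65535)
print(len(header))            # 20
print(parse_header(header))
```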

7. TCP Ports — Finding the Right Application

A single server can run many services (web, email, SSH, database). Ports distinguish them. An IP address finds the machine; a port finds the application on that machine.

  Server IP: 93.184.216.34

  ┌─────────────────────────────────────────────────┐
  │  Port 80   → HTTP (web server)                   │
  │  Port 443  → HTTPS (web server + TLS)            │
  │  Port 22   → SSH (remote access)                 │
  │  Port 25   → SMTP (email sending)                │
  │  Port 5432 → PostgreSQL (database)               │
  │  Port 3000 → Your dev server (custom)            │
  └─────────────────────────────────────────────────┘

| Port range | Category |
| --- | --- |
| 0–1023 | Well-known ports (HTTP, HTTPS, SSH, DNS, etc.); binding to them typically requires administrator privileges on Unix-like systems |
| 1024–49151 | Registered ports (databases, app servers) |
| 49152–65535 | Dynamic / ephemeral ports, assigned temporarily for client-side connections |

When your browser connects to a server on port 443, the browser itself uses a random ephemeral port (e.g. 52347) on your machine. The connection is identified by the 4-tuple: (source IP, source port, destination IP, destination port).
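
The 4-tuple can be observed directly with Python's standard `socket` module. The sketch below opens a connection over the loopback interface and prints it from both ends. It assumes loopback networking is available; the actual port numbers are chosen by the operating system and will differ on every run.

```python
import socket

# Listen on the loopback interface; port 0 asks the OS to pick any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_ip, server_port = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((server_ip, server_port))     # the OS assigns the client an ephemeral port
conn, _ = server.accept()

# (source IP, source port, destination IP, destination port), from the client's viewpoint
client_side = client.getsockname() + client.getpeername()
# The same connection as the server sees it, written in the same order
server_side = conn.getpeername() + conn.getsockname()

print("client view:", client_side)
print("server view:", server_side)

for s in (client, conn, server):
    s.close()
```

Both lines show the same four values, which is what lets one server port handle many simultaneous clients: each connection differs in at least the client's address or port.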


8. Flow Control and Congestion Control

Flow control (receiver-side)

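Prevents the sender from overwhelming the receiver's buffer. In practice the sender never has to ask: the receiver advertises its current window in the header of every segment it sends, so the exchange below is a conceptual illustration.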

  SENDER                                RECEIVER
    │                                      │
    │  "How much can you handle?"          │
    │  ─────────────────────────────────►  │
    │                                      │
    │  "My window is 64KB"                │
    │  ◄─────────────────────────────────  │
    │                                      │
    │  Sends up to 64KB of data            │
    │  ─────────────────────────────────►  │
    │                                      │
    │  "Processed some. Window now 32KB"  │
    │  ◄─────────────────────────────────  │
    │                                      │
    │  Sends up to 32KB more              │
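
The same idea as a toy simulation: in each round the sender transmits at most the advertised window, and the receiving application drains part of the buffer. The numbers are in KB to mirror the diagram, and `flow_controlled_send` and its parameters are hypothetical names chosen for this sketch.

```python
def flow_controlled_send(total, buffer_size, drain_per_round):
    """Return a list of (advertised_window, bytes_sent) per round."""
    if buffer_size <= 0 or drain_per_round <= 0:
        raise ValueError("buffer_size and drain_per_round must be positive")
    buffered = 0          # data sitting in the receiver's buffer
    sent = 0
    rounds = []
    while sent < total:
        window = buffer_size - buffered               # free space the receiver advertises
        burst = min(window, total - sent)             # sender never exceeds the window
        sent += burst
        buffered += burst
        buffered -= min(buffered, drain_per_round)    # application reads from the buffer
        rounds.append((window, burst))
    return rounds

for window, burst in flow_controlled_send(total=160, buffer_size=64, drain_per_round=32):
    print(f"advertised window={window:3d} KB, sent={burst:3d} KB")
# advertised window= 64 KB, sent= 64 KB
# advertised window= 32 KB, sent= 32 KB
# advertised window= 32 KB, sent= 32 KB
# advertised window= 32 KB, sent= 32 KB
```

A slow application (small `drain_per_round`) shrinks the window and throttles the sender, which is the whole point of flow control.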

Congestion control (network-side)

Prevents the sender from overwhelming the network itself.

| Phase | What happens |
| --- | --- |
| Slow Start | Start small (1–10 segments); double the sending rate each RTT until a threshold |
| Congestion Avoidance | After the threshold, grow the window linearly (additive increase) |
| Loss detected | Packet loss signals congestion; cut the window (multiplicative decrease) |
| Fast Retransmit | If 3 duplicate ACKs arrive, retransmit the missing segment immediately (don't wait for timeout) |

Modern algorithms like BBR (developed by Google) use bandwidth estimation and RTT measurement rather than relying solely on packet loss as a congestion signal.
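
The loss-based phases can be sketched as a per-RTT simulation of the congestion window. This is a heavily simplified, loosely Reno-style model, not BBR or CUBIC, and the window is counted in whole segments. `simulate_cwnd`, `ssthresh`, and `loss_at` are names picked for this example.

```python
def simulate_cwnd(rtts, ssthresh=16, initial_cwnd=1, loss_at=frozenset()):
    """Return the congestion window (in segments) at the start of each RTT."""
    cwnd = initial_cwnd
    history = []
    for rtt in range(rtts):
        history.append(cwnd)
        if rtt in loss_at:
            ssthresh = max(cwnd // 2, 2)     # multiplicative decrease on loss
            cwnd = ssthresh                  # continue in congestion avoidance
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double each RTT
        else:
            cwnd += 1                        # congestion avoidance: +1 segment per RTT
    return history

print(simulate_cwnd(12, ssthresh=16, loss_at={7}))
# [1, 2, 4, 8, 16, 17, 18, 19, 9, 10, 11, 12]
```

The output shows the characteristic sawtooth: fast exponential growth, slow linear probing, then a sharp cut when a loss is detected.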


9. TCP Connection Lifecycle

Every TCP connection goes through three phases:

  ┌────────────────────┐
  │  1. ESTABLISHMENT  │   Three-way handshake (SYN → SYN-ACK → ACK)
  │     (open)         │   See 1.3.b for details
  └────────┬───────────┘
           │
           ▼
  ┌────────────────────┐
  │  2. DATA TRANSFER  │   Send/receive data with sequence numbers,
  │     (open)         │   ACKs, flow control, congestion control
  └────────┬───────────┘
           │
           ▼
  ┌────────────────────┐
  │  3. TERMINATION    │   Four-way handshake (FIN → ACK → FIN → ACK)
  │     (close)        │   or RST for abrupt close
  └────────────────────┘

Connection termination (four-way close)

  CLIENT                                SERVER
    │                                      │
    │  FIN ──────────────────────────────► │  "I'm done sending"
    │                                      │
    │  ◄────────────────────────────── ACK │  "Got it"
    │                                      │
    │  ◄────────────────────────────── FIN │  "I'm done too"
    │                                      │
    │  ACK ──────────────────────────────► │  "Confirmed"
    │                                      │
    │       Connection fully closed         │

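Each direction of the connection is closed separately. A FIN means "I have no more data to send," but that side keeps receiving until the peer sends its own FIN, so data still in transit is delivered before the connection shuts down.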
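
The four-segment close can also be read as two small state machines, one per side. The state names below (FIN_WAIT_1, CLOSE_WAIT, and so on) come from the TCP specification rather than from this section, and only the simple case where the client closes first is modeled. `walk` is a hypothetical helper.

```python
# (current state, event) -> next state
CLIENT_CLOSE = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "timer expires"): "CLOSED",
}
SERVER_CLOSE = {
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}

def walk(transitions, state="ESTABLISHED"):
    """Follow the single outgoing transition from each state until CLOSED."""
    path = [state]
    while state != "CLOSED":
        event, state = next((ev, nxt) for (cur, ev), nxt in transitions.items() if cur == state)
        path.append(f"--{event}--> {state}")
    return path

print("client:", " ".join(walk(CLIENT_CLOSE)))
print("server:", " ".join(walk(SERVER_CLOSE)))
```

The side that closes first passes through TIME_WAIT before reaching CLOSED, which is why a recently closed client port may stay unavailable for a short while.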


10. Why TCP is Widely Used

| Reason | Explanation |
| --- | --- |
| Guaranteed delivery | No silent data loss; critical for web pages, APIs, files, email |
| Correct ordering | HTML must be parsed in order; a file downloaded out of order is corrupt |
| Battle-tested | 40+ years of deployment; every OS, every device, every language supports it |
| Application simplicity | Developers write to a stream (like writing to a file); TCP handles all the messy packet stuff |
| Works everywhere | Firewalls, NATs, and networks are all designed to handle TCP traffic |
| Built-in congestion control | TCP doesn't destroy the network under load; it backs off intelligently |

11. Where TCP is Used (Real-World Examples)

| Application | Why TCP |
| --- | --- |
| Web browsing (HTTP/1.1, HTTP/2) | Every HTML tag, CSS rule, and script byte must arrive correctly |
| Email (SMTP, IMAP, POP3) | Messages must be complete and unaltered |
| File transfer (FTP, SFTP, SCP) | A corrupt file is useless |
| SSH (remote terminal) | Every keystroke and output must arrive in order |
| Database connections (PostgreSQL, MySQL) | Queries and results must be complete and accurate |
| APIs (REST, GraphQL, gRPC) | Request/response integrity is non-negotiable |
| Git operations (push, pull, clone) | Source code must transfer perfectly |

12. Disadvantages of TCP

| Disadvantage | Explanation |
| --- | --- |
| Higher latency | Connection setup (handshake) and acknowledgments add round trips |
| Head-of-line blocking | If one packet in a stream is lost, all subsequent data waits until it's retransmitted, even if later packets already arrived |
| Overhead | 20+ byte header per segment; ACKs consume bandwidth; state tracking uses memory |
| Not ideal for real-time | Retransmissions can cause jitter (irregular timing), which is bad for live audio/video |
| Slow start penalty | New connections start with a small window; takes several RTTs to reach full speed |

These limitations are exactly why UDP exists, and why QUIC (the transport underneath HTTP/3) was built on UDP to avoid TCP's head-of-line blocking at the transport layer.


13. Key Takeaways

  1. TCP is a connection-oriented, reliable, ordered transport protocol that sits on top of IP.
  2. It guarantees delivery through sequence numbers, acknowledgments, and retransmissions.
  3. Flow control (window size) prevents overwhelming the receiver; congestion control prevents overwhelming the network.
  4. TCP is used by HTTP, email, file transfer, SSH, databases — anything where missing or corrupted data is unacceptable.
  5. The trade-off: TCP adds latency (handshakes, retransmissions, head-of-line blocking) compared to UDP.
  6. Every TCP connection has three phases: establish (handshake), transfer (data), terminate (close).

14. Explain-It Challenge

Without looking back, explain in your own words:

  1. What does "connection-oriented" mean, and why does TCP need it?
  2. How does TCP know a packet was lost, and what does it do about it?
  3. What is flow control and how is it different from congestion control?
  4. Why is TCP a bad choice for live video streaming?
  5. Name four real-world applications that rely on TCP and explain why each one needs reliability.
