Episode 3 — NodeJS MongoDB Backend Architecture / 3.8 — Database Basics MongoDB

3.8.a — Relational vs Non-Relational Databases

Databases are the backbone of every application. Understanding the difference between relational (SQL) and non-relational (NoSQL) databases is the first step toward choosing the right storage layer for your project.

< README | 3.8.b -- Introduction to MongoDB >

What Is a Database?
Relational Databases (SQL)
Non-Relational Databases (NoSQL)
Types of NoSQL Databases
SQL vs NoSQL Comparison
When to Choose SQL vs NoSQL
CAP Theorem Simplified
ACID vs BASE Models
Real-World Usage Examples
Key Takeaways
Explain-It Challenge

1. What Is a Database?

A database is an organized collection of structured data stored electronically. It provides mechanisms to create, read, update, and delete data efficiently.

Without a database, your data lives in memory and vanishes when the server restarts. Databases provide:

Persistence -- data survives restarts and crashes
Organization -- data is structured for efficient retrieval
Concurrency -- multiple users can access data simultaneously
Security -- access control and encryption
Integrity -- rules ensure data stays valid and consistent

Application (Node.js / Express)
        |
        |  <-- queries / commands -->
        |
    Database (MongoDB, PostgreSQL, MySQL, etc.)
        |
    Disk / Cloud Storage

Every application -- from a simple to-do list to a social media platform -- relies on one or more databases.

2. Relational Databases (SQL)

Relational databases store data in tables (rows and columns) with predefined schemas. They use SQL (Structured Query Language) to manage data.

Core Concepts

Concept	Description
Table	A named collection of rows sharing the same columns (like a spreadsheet)
Row (Record)	A single entry in a table
Column (Field)	A specific attribute with a defined data type
Schema	The blueprint defining tables, columns, types, and constraints
Primary Key	A unique identifier for each row (e.g., `id`)
Foreign Key	A reference from one table to another table's primary key
JOIN	Combining data from multiple tables in a single query
Index	Data structure that speeds up queries on specific columns

Example: Users Table

+----+----------+---------------------+-----+
| id | name     | email               | age |
+----+----------+---------------------+-----+
|  1 | Alice    | alice@example.com   |  25 |
|  2 | Bob      | bob@example.com     |  30 |
|  3 | Charlie  | charlie@example.com |  28 |
+----+----------+---------------------+-----+

SQL Query Examples

-- Create a table
CREATE TABLE users (
  id INT PRIMARY KEY AUTO_INCREMENT,
  name VARCHAR(100) NOT NULL,
  email VARCHAR(255) UNIQUE NOT NULL,
  age INT CHECK (age >= 0)
);

-- Insert a row
INSERT INTO users (name, email, age) VALUES ('Alice', 'alice@example.com', 25);

-- Query with filter
SELECT name, email FROM users WHERE age > 20 ORDER BY name;

-- JOIN two tables
SELECT users.name, orders.total
FROM users
JOIN orders ON users.id = orders.user_id
WHERE orders.total > 100;

Popular Relational Databases

Database	Use Case	License
PostgreSQL	Complex queries, ACID compliance, extensions	Open source
MySQL	Web applications, WordPress, read-heavy workloads	Open source
SQLite	Embedded databases, mobile apps, prototyping	Public domain
Microsoft SQL Server	Enterprise Windows environments	Commercial
Oracle DB	Large enterprise systems, banking	Commercial

3. Non-Relational Databases (NoSQL)

Non-relational databases (NoSQL -- "Not Only SQL") store data in formats other than tables. They offer flexible schemas and are designed for horizontal scaling.

Core Concepts

Concept	Description
Document	A self-contained data unit, typically JSON/BSON (MongoDB)
Collection	A group of documents (like a table but schema-free)
Key-Value Pair	Simplest model: a key maps to a value (Redis)
Node / Edge	Graph databases store entities (nodes) and relationships (edges)
Column Family	Groups of columns stored together (Cassandra)

Example: User Document (MongoDB)

{
  "_id": "ObjectId('64a1b2c3d4e5f6a7b8c9d0e1')",
  "name": "Alice",
  "email": "alice@example.com",
  "age": 25,
  "hobbies": ["reading", "hiking", "coding"],
  "address": {
    "city": "San Francisco",
    "state": "CA",
    "zip": "94102"
  }
}

Notice how a single document can hold nested objects and arrays -- something that would require multiple tables and JOINs in SQL.

Popular NoSQL Databases

Database	Type	Use Case
MongoDB	Document	General-purpose, flexible schemas
Redis	Key-Value	Caching, sessions, real-time data
DynamoDB	Key-Value / Document	AWS serverless, auto-scaling
Cassandra	Column-Family	Time-series, massive write throughput
Neo4j	Graph	Social networks, recommendation engines
CouchDB	Document	Offline-first, sync-friendly apps

4. Types of NoSQL Databases

4.1 Document Databases

Store data as JSON-like documents. Each document can have a different structure.

// Document 1
{ "name": "Alice", "age": 25, "hobbies": ["reading"] }

// Document 2 (different fields -- totally valid)
{ "name": "Bob", "role": "admin", "department": "Engineering" }

Examples: MongoDB, CouchDB, Firestore

Best for: Content management, user profiles, catalogs, real-time apps

4.2 Key-Value Databases

The simplest model -- every piece of data is stored as a key mapped to a value.

KEY                VALUE
------------------------------------------
"user:1001"        '{"name":"Alice","age":25}'
"session:abc123"   '{"userId":"1001","expires":"2025-01-01"}'
"cache:homepage"   '<html>...</html>'

Examples: Redis, Memcached, DynamoDB

Best for: Caching, session management, leaderboards, real-time counters

4.3 Column-Family Databases

Data is stored in columns rather than rows. Optimized for reading and writing large amounts of data across many servers.

Row Key: "user:1001"
  Column Family "profile":   { name: "Alice", age: 25 }
  Column Family "activity":  { lastLogin: "2025-03-01", pageViews: 150 }

Examples: Apache Cassandra, HBase, ScyllaDB

Best for: Time-series data, IoT sensor data, analytics at scale

4.4 Graph Databases

Store data as nodes (entities) and edges (relationships). Optimized for traversing relationships.

(Alice) --[FRIENDS_WITH]--> (Bob)
(Alice) --[WORKS_AT]------> (Google)
(Bob)   --[FRIENDS_WITH]--> (Charlie)

Examples: Neo4j, Amazon Neptune, ArangoDB

Best for: Social networks, fraud detection, recommendation engines, knowledge graphs

5. SQL vs NoSQL Comparison

Criteria	SQL (Relational)	NoSQL (Non-Relational)
Data Model	Tables with rows and columns	Documents, key-value, graph, column-family
Schema	Fixed, predefined (rigid)	Flexible, dynamic (schema-less)
Query Language	SQL (standardized)	Database-specific APIs
Scaling	Vertical (bigger server)	Horizontal (more servers)
JOINs	Native, powerful	Limited or manual
ACID Compliance	Strong by default	Varies (eventual consistency common)
Data Integrity	Strict (constraints, foreign keys)	Application-level enforcement
Best For	Complex queries, transactions	Flexible data, high throughput
Schema Changes	Requires migration scripts	Add fields anytime
Maturity	Decades of tooling and knowledge	Rapidly evolving ecosystem
Examples	PostgreSQL, MySQL, SQLite	MongoDB, Redis, Cassandra

6. When to Choose SQL vs NoSQL

Choose SQL When:

Your data has clear, fixed relationships (e.g., banking, e-commerce orders)
You need complex queries with multiple JOINs
Data integrity is critical (ACID transactions)
Your schema is well-defined and unlikely to change drastically
You need strong consistency guarantees

Choose NoSQL When:

Your data structure is evolving or unpredictable
You need horizontal scaling across many servers
You are working with large volumes of unstructured or semi-structured data
Read/write speed matters more than complex queries
You are building a prototype and need to iterate fast

Decision Flowchart

Is your data highly relational with many JOINs?
  |
  YES --> Do you need strict ACID transactions?
  |         |
  |         YES --> SQL (PostgreSQL / MySQL)
  |         NO  --> Consider SQL or Document DB
  |
  NO --> Is your data structure flexible or evolving?
          |
          YES --> Do you need fast key lookups?
          |        |
          |        YES --> Key-Value (Redis / DynamoDB)
          |        NO  --> Document DB (MongoDB)
          |
          NO --> Is your data about relationships/connections?
                   |
                   YES --> Graph DB (Neo4j)
                   NO  --> Column-Family (Cassandra) or Document DB

7. CAP Theorem Simplified

The CAP theorem states that a distributed database can provide only two out of three guarantees simultaneously:

Letter	Property	Meaning
C	Consistency	Every read returns the most recent write
A	Availability	Every request receives a response (no timeouts)
P	Partition Tolerance	System works despite network failures between nodes

Since network partitions will happen in distributed systems, you must choose between C and A:

        Consistency (C)
           /\
          /  \
         /    \
        / CP   \  CA
       /  zone  \ zone
      /          \
     /____________\
  Partition (P) --- Availability (A)
                AP zone

Type	Databases	Trade-off
CP	MongoDB, Redis, HBase	Sacrifices availability during partitions
AP	Cassandra, DynamoDB, CouchDB	Sacrifices consistency (eventual consistency)
CA	Traditional RDBMS (PostgreSQL, MySQL)	Works well on a single node; partitions break them

In practice: MongoDB defaults to strong consistency for reads from the primary node, but offers tunable read preferences for availability.

8. ACID vs BASE Models

ACID (SQL Databases)

Property	Meaning	Example
Atomicity	All operations in a transaction succeed or all fail	Transfer $100: debit AND credit both happen or neither
Consistency	Database moves from one valid state to another	Balance cannot go negative if constraint exists
Isolation	Concurrent transactions do not interfere	Two transfers at the same time produce correct results
Durability	Committed data survives crashes	Data written to disk, not just memory

-- ACID Example: Bank Transfer
BEGIN TRANSACTION;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- If either UPDATE fails, both are rolled back

BASE (NoSQL Databases)

Property	Meaning
Basically Available	System guarantees availability (may return stale data)
Soft State	State may change over time even without input (due to replication)
Eventually Consistent	Data will become consistent across nodes -- eventually

User writes to Node A --> Node A acknowledges immediately
                      --> Node A replicates to Node B (async)
                      --> Node B replicates to Node C (async)
                      --> After a short delay, all nodes agree

Comparison

Property	ACID	BASE
Consistency	Immediate	Eventual
Availability	May block during transactions	Always available
Performance	Slower (locking)	Faster (no locking)
Use Case	Banking, inventory	Social feeds, analytics

9. Real-World Usage Examples

Company	SQL	NoSQL	Why
Instagram	PostgreSQL	Cassandra, Redis	PostgreSQL for user data; Cassandra for feeds; Redis for caching
Uber	PostgreSQL	Cassandra, Redis	PostgreSQL for trips; Cassandra for real-time location data
Netflix	--	Cassandra, DynamoDB	Massive scale requires horizontal scaling across regions
Airbnb	MySQL	Redis, Elasticsearch	MySQL for bookings; Redis for caching; Elasticsearch for search
Twitter/X	MySQL	Redis, Manhattan (custom)	MySQL for tweets; Redis for timelines and caching
Shopify	MySQL	Redis, Memcached	MySQL for orders/products; Redis for sessions and caching

Typical Modern Stack

Frontend (React / Next.js)
    |
    v
API Server (Node.js / Express)
    |
    +--> MongoDB       (primary database -- flexible user/product data)
    +--> Redis         (caching layer -- sessions, frequently accessed data)
    +--> Elasticsearch (search engine -- full-text search)
    +--> PostgreSQL    (analytics -- complex reporting queries)

Key insight: Most production applications use multiple databases (polyglot persistence), choosing the best tool for each specific need.

10. Key Takeaways

A database is an organized, persistent collection of data with CRUD capabilities.
Relational (SQL) databases use tables with fixed schemas and SQL for querying.
Non-relational (NoSQL) databases offer flexible schemas with various data models.
Four types of NoSQL: document, key-value, column-family, and graph.
SQL excels at complex queries, JOINs, and strict data integrity.
NoSQL excels at flexible schemas, horizontal scaling, and high throughput.
The CAP theorem means you trade off between consistency and availability in distributed systems.
ACID guarantees strict transactions; BASE trades consistency for availability.
Real-world applications often use multiple database types together.

11. Explain-It Challenge

Imagine you are explaining to a friend who has never coded before: What is the difference between a SQL and a NoSQL database? Use an analogy they would understand (e.g., a filing cabinet vs a storage bin, or a spreadsheet vs a folder of sticky notes). Explain why some companies use both types at the same time.

< README | 3.8.b -- Introduction to MongoDB >

Cohart

3.8.a — Relational vs Non-Relational Databases

Table of Contents

1. What Is a Database?

2. Relational Databases (SQL)

Core Concepts

Example: Users Table

SQL Query Examples

Popular Relational Databases

3. Non-Relational Databases (NoSQL)

Core Concepts

Example: User Document (MongoDB)

Popular NoSQL Databases

4. Types of NoSQL Databases

4.1 Document Databases

4.2 Key-Value Databases

4.3 Column-Family Databases

4.4 Graph Databases

5. SQL vs NoSQL Comparison

6. When to Choose SQL vs NoSQL

Choose SQL When:

Choose NoSQL When:

Decision Flowchart

7. CAP Theorem Simplified

8. ACID vs BASE Models

ACID (SQL Databases)

BASE (NoSQL Databases)

Comparison

9. Real-World Usage Examples

Typical Modern Stack

10. Key Takeaways

11. Explain-It Challenge