Episode 3 — NodeJS MongoDB Backend Architecture / 3.8 — Database Basics MongoDB
3.8.a — Relational vs Non-Relational Databases
Databases are the backbone of every application. Understanding the difference between relational (SQL) and non-relational (NoSQL) databases is the first step toward choosing the right storage layer for your project.
< README | 3.8.b -- Introduction to MongoDB >
Table of Contents
- What Is a Database?
- Relational Databases (SQL)
- Non-Relational Databases (NoSQL)
- Types of NoSQL Databases
- SQL vs NoSQL Comparison
- When to Choose SQL vs NoSQL
- CAP Theorem Simplified
- ACID vs BASE Models
- Real-World Usage Examples
- Key Takeaways
- Explain-It Challenge
1. What Is a Database?
A database is an organized collection of structured data stored electronically. It provides mechanisms to create, read, update, and delete data efficiently.
Without a database, your data lives in memory and vanishes when the server restarts. Databases provide:
- Persistence -- data survives restarts and crashes
- Organization -- data is structured for efficient retrieval
- Concurrency -- multiple users can access data simultaneously
- Security -- access control and encryption
- Integrity -- rules ensure data stays valid and consistent
Application (Node.js / Express)
|
| <-- queries / commands -->
|
Database (MongoDB, PostgreSQL, MySQL, etc.)
|
Disk / Cloud Storage
Every application -- from a simple to-do list to a social media platform -- relies on one or more databases.
2. Relational Databases (SQL)
Relational databases store data in tables (rows and columns) with predefined schemas. They use SQL (Structured Query Language) to manage data.
Core Concepts
| Concept | Description |
|---|---|
| Table | A named collection of rows sharing the same columns (like a spreadsheet) |
| Row (Record) | A single entry in a table |
| Column (Field) | A specific attribute with a defined data type |
| Schema | The blueprint defining tables, columns, types, and constraints |
| Primary Key | A unique identifier for each row (e.g., id) |
| Foreign Key | A reference from one table to another table's primary key |
| JOIN | Combining data from multiple tables in a single query |
| Index | Data structure that speeds up queries on specific columns |
Example: Users Table
+----+----------+---------------------+-----+
| id | name | email | age |
+----+----------+---------------------+-----+
| 1 | Alice | alice@example.com | 25 |
| 2 | Bob | bob@example.com | 30 |
| 3 | Charlie | charlie@example.com | 28 |
+----+----------+---------------------+-----+
SQL Query Examples
-- Create a table
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
age INT CHECK (age >= 0)
);
-- Insert a row
INSERT INTO users (name, email, age) VALUES ('Alice', 'alice@example.com', 25);
-- Query with filter
SELECT name, email FROM users WHERE age > 20 ORDER BY name;
-- JOIN two tables
SELECT users.name, orders.total
FROM users
JOIN orders ON users.id = orders.user_id
WHERE orders.total > 100;
Popular Relational Databases
| Database | Use Case | License |
|---|---|---|
| PostgreSQL | Complex queries, ACID compliance, extensions | Open source |
| MySQL | Web applications, WordPress, read-heavy workloads | Open source |
| SQLite | Embedded databases, mobile apps, prototyping | Public domain |
| Microsoft SQL Server | Enterprise Windows environments | Commercial |
| Oracle DB | Large enterprise systems, banking | Commercial |
3. Non-Relational Databases (NoSQL)
Non-relational databases (NoSQL -- "Not Only SQL") store data in formats other than tables. They offer flexible schemas and are designed for horizontal scaling.
Core Concepts
| Concept | Description |
|---|---|
| Document | A self-contained data unit, typically JSON/BSON (MongoDB) |
| Collection | A group of documents (like a table but schema-free) |
| Key-Value Pair | Simplest model: a key maps to a value (Redis) |
| Node / Edge | Graph databases store entities (nodes) and relationships (edges) |
| Column Family | Groups of columns stored together (Cassandra) |
Example: User Document (MongoDB)
{
"_id": "ObjectId('64a1b2c3d4e5f6a7b8c9d0e1')",
"name": "Alice",
"email": "alice@example.com",
"age": 25,
"hobbies": ["reading", "hiking", "coding"],
"address": {
"city": "San Francisco",
"state": "CA",
"zip": "94102"
}
}
Notice how a single document can hold nested objects and arrays -- something that would require multiple tables and JOINs in SQL.
Popular NoSQL Databases
| Database | Type | Use Case |
|---|---|---|
| MongoDB | Document | General-purpose, flexible schemas |
| Redis | Key-Value | Caching, sessions, real-time data |
| DynamoDB | Key-Value / Document | AWS serverless, auto-scaling |
| Cassandra | Column-Family | Time-series, massive write throughput |
| Neo4j | Graph | Social networks, recommendation engines |
| CouchDB | Document | Offline-first, sync-friendly apps |
4. Types of NoSQL Databases
4.1 Document Databases
Store data as JSON-like documents. Each document can have a different structure.
// Document 1
{ "name": "Alice", "age": 25, "hobbies": ["reading"] }
// Document 2 (different fields -- totally valid)
{ "name": "Bob", "role": "admin", "department": "Engineering" }
Examples: MongoDB, CouchDB, Firestore
Best for: Content management, user profiles, catalogs, real-time apps
4.2 Key-Value Databases
The simplest model -- every piece of data is stored as a key mapped to a value.
KEY VALUE
------------------------------------------
"user:1001" '{"name":"Alice","age":25}'
"session:abc123" '{"userId":"1001","expires":"2025-01-01"}'
"cache:homepage" '<html>...</html>'
Examples: Redis, Memcached, DynamoDB
Best for: Caching, session management, leaderboards, real-time counters
4.3 Column-Family Databases
Data is stored in columns rather than rows. Optimized for reading and writing large amounts of data across many servers.
Row Key: "user:1001"
Column Family "profile": { name: "Alice", age: 25 }
Column Family "activity": { lastLogin: "2025-03-01", pageViews: 150 }
Examples: Apache Cassandra, HBase, ScyllaDB
Best for: Time-series data, IoT sensor data, analytics at scale
4.4 Graph Databases
Store data as nodes (entities) and edges (relationships). Optimized for traversing relationships.
(Alice) --[FRIENDS_WITH]--> (Bob)
(Alice) --[WORKS_AT]------> (Google)
(Bob) --[FRIENDS_WITH]--> (Charlie)
Examples: Neo4j, Amazon Neptune, ArangoDB
Best for: Social networks, fraud detection, recommendation engines, knowledge graphs
5. SQL vs NoSQL Comparison
| Criteria | SQL (Relational) | NoSQL (Non-Relational) |
|---|---|---|
| Data Model | Tables with rows and columns | Documents, key-value, graph, column-family |
| Schema | Fixed, predefined (rigid) | Flexible, dynamic (schema-less) |
| Query Language | SQL (standardized) | Database-specific APIs |
| Scaling | Vertical (bigger server) | Horizontal (more servers) |
| JOINs | Native, powerful | Limited or manual |
| ACID Compliance | Strong by default | Varies (eventual consistency common) |
| Data Integrity | Strict (constraints, foreign keys) | Application-level enforcement |
| Best For | Complex queries, transactions | Flexible data, high throughput |
| Schema Changes | Requires migration scripts | Add fields anytime |
| Maturity | Decades of tooling and knowledge | Rapidly evolving ecosystem |
| Examples | PostgreSQL, MySQL, SQLite | MongoDB, Redis, Cassandra |
6. When to Choose SQL vs NoSQL
Choose SQL When:
- Your data has clear, fixed relationships (e.g., banking, e-commerce orders)
- You need complex queries with multiple JOINs
- Data integrity is critical (ACID transactions)
- Your schema is well-defined and unlikely to change drastically
- You need strong consistency guarantees
Choose NoSQL When:
- Your data structure is evolving or unpredictable
- You need horizontal scaling across many servers
- You are working with large volumes of unstructured or semi-structured data
- Read/write speed matters more than complex queries
- You are building a prototype and need to iterate fast
Decision Flowchart
Is your data highly relational with many JOINs?
|
YES --> Do you need strict ACID transactions?
| |
| YES --> SQL (PostgreSQL / MySQL)
| NO --> Consider SQL or Document DB
|
NO --> Is your data structure flexible or evolving?
|
YES --> Do you need fast key lookups?
| |
| YES --> Key-Value (Redis / DynamoDB)
| NO --> Document DB (MongoDB)
|
NO --> Is your data about relationships/connections?
|
YES --> Graph DB (Neo4j)
NO --> Column-Family (Cassandra) or Document DB
7. CAP Theorem Simplified
The CAP theorem states that a distributed database can provide only two out of three guarantees simultaneously:
| Letter | Property | Meaning |
|---|---|---|
| C | Consistency | Every read returns the most recent write |
| A | Availability | Every request receives a response (no timeouts) |
| P | Partition Tolerance | System works despite network failures between nodes |
Since network partitions will happen in distributed systems, you must choose between C and A:
Consistency (C)
/\
/ \
/ \
/ CP \ CA
/ zone \ zone
/ \
/____________\
Partition (P) --- Availability (A)
AP zone
| Type | Databases | Trade-off |
|---|---|---|
| CP | MongoDB, Redis, HBase | Sacrifices availability during partitions |
| AP | Cassandra, DynamoDB, CouchDB | Sacrifices consistency (eventual consistency) |
| CA | Traditional RDBMS (PostgreSQL, MySQL) | Works well on a single node; partitions break them |
In practice: MongoDB defaults to strong consistency for reads from the primary node, but offers tunable read preferences for availability.
8. ACID vs BASE Models
ACID (SQL Databases)
| Property | Meaning | Example |
|---|---|---|
| Atomicity | All operations in a transaction succeed or all fail | Transfer $100: debit AND credit both happen or neither |
| Consistency | Database moves from one valid state to another | Balance cannot go negative if constraint exists |
| Isolation | Concurrent transactions do not interfere | Two transfers at the same time produce correct results |
| Durability | Committed data survives crashes | Data written to disk, not just memory |
-- ACID Example: Bank Transfer
BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- If either UPDATE fails, both are rolled back
BASE (NoSQL Databases)
| Property | Meaning |
|---|---|
| Basically Available | System guarantees availability (may return stale data) |
| Soft State | State may change over time even without input (due to replication) |
| Eventually Consistent | Data will become consistent across nodes -- eventually |
User writes to Node A --> Node A acknowledges immediately
--> Node A replicates to Node B (async)
--> Node B replicates to Node C (async)
--> After a short delay, all nodes agree
Comparison
| Property | ACID | BASE |
|---|---|---|
| Consistency | Immediate | Eventual |
| Availability | May block during transactions | Always available |
| Performance | Slower (locking) | Faster (no locking) |
| Use Case | Banking, inventory | Social feeds, analytics |
9. Real-World Usage Examples
| Company | SQL | NoSQL | Why |
|---|---|---|---|
| PostgreSQL | Cassandra, Redis | PostgreSQL for user data; Cassandra for feeds; Redis for caching | |
| Uber | PostgreSQL | Cassandra, Redis | PostgreSQL for trips; Cassandra for real-time location data |
| Netflix | -- | Cassandra, DynamoDB | Massive scale requires horizontal scaling across regions |
| Airbnb | MySQL | Redis, Elasticsearch | MySQL for bookings; Redis for caching; Elasticsearch for search |
| Twitter/X | MySQL | Redis, Manhattan (custom) | MySQL for tweets; Redis for timelines and caching |
| Shopify | MySQL | Redis, Memcached | MySQL for orders/products; Redis for sessions and caching |
Typical Modern Stack
Frontend (React / Next.js)
|
v
API Server (Node.js / Express)
|
+--> MongoDB (primary database -- flexible user/product data)
+--> Redis (caching layer -- sessions, frequently accessed data)
+--> Elasticsearch (search engine -- full-text search)
+--> PostgreSQL (analytics -- complex reporting queries)
Key insight: Most production applications use multiple databases (polyglot persistence), choosing the best tool for each specific need.
10. Key Takeaways
- A database is an organized, persistent collection of data with CRUD capabilities.
- Relational (SQL) databases use tables with fixed schemas and SQL for querying.
- Non-relational (NoSQL) databases offer flexible schemas with various data models.
- Four types of NoSQL: document, key-value, column-family, and graph.
- SQL excels at complex queries, JOINs, and strict data integrity.
- NoSQL excels at flexible schemas, horizontal scaling, and high throughput.
- The CAP theorem means you trade off between consistency and availability in distributed systems.
- ACID guarantees strict transactions; BASE trades consistency for availability.
- Real-world applications often use multiple database types together.
11. Explain-It Challenge
Imagine you are explaining to a friend who has never coded before: What is the difference between a SQL and a NoSQL database? Use an analogy they would understand (e.g., a filing cabinet vs a storage bin, or a spreadsheet vs a folder of sticky notes). Explain why some companies use both types at the same time.