Episode 3 — NodeJS MongoDB Backend Architecture / 3.8 — Database Basics MongoDB
3.8.b — Introduction to MongoDB
MongoDB is a document-oriented NoSQL database that stores data in flexible, JSON-like documents. It is the "M" in the MERN stack and one of the most popular databases for JavaScript developers.
< 3.8.a -- Relational vs Non-Relational | 3.8.c -- Setting Up MongoDB >
Table of Contents
- What Is MongoDB?
- JSON-Like Documents (BSON)
- Database Hierarchy
- Schema-less / Flexible Schema
- MongoDB vs SQL Terminology
- Key MongoDB Features
- When MongoDB Shines
- When NOT to Use MongoDB
- MongoDB Editions
- Brief History and Ecosystem
- Key Takeaways
- Explain-It Challenge
1. What Is MongoDB?
MongoDB is an open-source, cross-platform, document-oriented NoSQL database. Instead of storing data in rows and columns like a traditional relational database, MongoDB stores data as documents -- flexible, JSON-like data structures.
Traditional SQL:
+----+----------+---------------------+-----+
| id | name | email | age |
+----+----------+---------------------+-----+
| 1 | Alice | alice@example.com | 25 |
+----+----------+---------------------+-----+
MongoDB Document:
{
"_id": ObjectId("64a1b2c3d4e5f6a7b8c9d0e1"),
"name": "Alice",
"email": "alice@example.com",
"age": 25,
"hobbies": ["reading", "hiking"],
"address": {
"city": "San Francisco",
"state": "CA"
}
}
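The `_id` value above is not arbitrary: the first 4 bytes (8 hex characters) of an ObjectId hold its creation time in seconds since the Unix epoch. The sketch below reads that timestamp in plain JavaScript. The helper name `objectIdTimestamp` is made up for illustration; the official drivers expose the same information through `ObjectId.getTimestamp()`.

```javascript
// Hypothetical helper: read the creation time embedded in an ObjectId hex string.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError("Expected a 24-character hex string");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // first 4 bytes = Unix seconds
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp("64a1b2c3d4e5f6a7b8c9d0e1");
console.log(created.toISOString()); // 2023-07-02T17:24:19.000Z
```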
Key characteristics:
- Document-oriented -- data is stored as documents, not rows
- Schema-flexible -- documents in the same collection can have different fields
- High performance -- reads and writes that touch a single document avoid JOINs, and indexes keep lookups fast
- Horizontally scalable -- add more servers instead of upgrading one
- Developer-friendly -- works naturally with JavaScript and JSON
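Nested fields are usually addressed with dot notation in queries, for example `{ "address.city": "San Francisco" }`. The sketch below mimics that lookup on a plain JavaScript object. `getPath` is a hypothetical helper, not a MongoDB API, and it ignores MongoDB's special handling of arrays of subdocuments.

```javascript
// Hypothetical helper: resolve a dot-notation path such as "address.city".
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

const alice = {
  name: "Alice",
  hobbies: ["reading", "hiking"],
  address: { city: "San Francisco", state: "CA" },
};

console.log(getPath(alice, "address.city")); // San Francisco
console.log(getPath(alice, "hobbies.1"));    // hiking
console.log(getPath(alice, "address.zip"));  // undefined
```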
2. JSON-Like Documents (BSON)
MongoDB documents look like JSON but are stored internally as BSON (Binary JSON).
JSON vs BSON
| Feature | JSON | BSON |
|---|---|---|
| Format | Text-based | Binary-encoded |
| Data Types | String, Number, Boolean, Array, Object, null | All JSON types + Date, ObjectId, Int32, Int64, Decimal128, Binary, Regex |
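| Size | Compact for small values | Comparable, sometimes larger (type bytes and length prefixes; not compressed) |
| Speed | Must be parsed as text | Faster to traverse (length prefixes let a reader skip fields) |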
| Use | Data interchange (APIs) | Internal MongoDB storage |
Why BSON?
What you write (JavaScript):         BSON (how MongoDB stores it):
{                                    \x2E\x00\x00\x00                (document size: 46 bytes)
  "name": "Alice",                   \x02 name\x00 ...               (string type)
  "age": 25,                         \x10 age\x00 \x19\x00\x00\x00   (int32 type)
  "joined": new Date("2025-01-15")   \x09 joined\x00 ...             (date type; a plain string would be stored as a string)
}                                    \x00                            (end marker)
BSON gives MongoDB:
- Richer data types (Date, ObjectId, Binary, Decimal128)
- Faster traversal (binary encoding allows skipping fields)
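- Low overhead (length prefixes and fixed-width numbers avoid text parsing, although BSON is not always smaller than the equivalent JSON)
In practice: you write plain JavaScript objects (or JSON), and the driver converts them to BSON and back automatically. You rarely need to encode or decode BSON by hand.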
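To make the byte layout concrete, here is a minimal encoder sketch that follows the BSON layout: a 4-byte little-endian total size, then one element per field (type byte, null-terminated field name, value), then a closing null byte. It is an illustration only. `encodeBson` is a made-up name, it handles just strings, 32-bit integers, and Dates, and real applications should leave encoding to the driver.

```javascript
// Minimal BSON encoder sketch: supports only strings, 32-bit integers, and Dates.
function encodeBson(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const name = Buffer.from(key + "\0", "utf8"); // field name as a C string
    if (typeof value === "string") {
      const str = Buffer.from(value + "\0", "utf8");
      const len = Buffer.alloc(4);
      len.writeInt32LE(str.length); // length includes the trailing null
      parts.push(Buffer.from([0x02]), name, len, str);
    } else if (Number.isInteger(value) && value >= -(2 ** 31) && value < 2 ** 31) {
      const num = Buffer.alloc(4);
      num.writeInt32LE(value);
      parts.push(Buffer.from([0x10]), name, num);
    } else if (value instanceof Date) {
      const ms = Buffer.alloc(8);
      ms.writeBigInt64LE(BigInt(value.getTime())); // milliseconds since epoch
      parts.push(Buffer.from([0x09]), name, ms);
    } else {
      throw new TypeError(`Unsupported type for field "${key}"`);
    }
  }
  const body = Buffer.concat(parts);
  const size = Buffer.alloc(4);
  size.writeInt32LE(4 + body.length + 1); // size prefix + elements + end marker
  return Buffer.concat([size, body, Buffer.from([0x00])]);
}

const bytes = encodeBson({ name: "Alice", age: 25, joined: new Date("2025-01-15") });
console.log(bytes.length);                          // 46
console.log(bytes.subarray(0, 4).toString("hex"));  // 2e000000
```

Because every string and every document carries its length up front, a reader can jump over fields it does not need instead of parsing them character by character.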
3. Database Hierarchy
MongoDB organizes data in a hierarchy: a server holds databases, a database holds collections, and a collection holds documents made up of fields.
MongoDB Server
|
+-- Database: "myapp"
| |
| +-- Collection: "users"
| | +-- Document: { name: "Alice", ... }
| | +-- Document: { name: "Bob", ... }
| | +-- Document: { name: "Charlie", ... }
| |
| +-- Collection: "products"
| | +-- Document: { title: "Widget", ... }
| | +-- Document: { title: "Gadget", ... }
| |
| +-- Collection: "orders"
| +-- Document: { userId: "...", total: 59.99 }
|
+-- Database: "admin" (system)
+-- Database: "local" (system)
| Level | SQL Equivalent | Description |
|---|---|---|
| Database | Database | Top-level container; an app typically has one |
| Collection | Table | A grouping of related documents |
| Document | Row | A single data record (JSON-like object) |
| Field | Column | A key-value pair within a document |
Key Differences from SQL
- A collection does not enforce a fixed structure -- documents can vary
- A document can contain nested objects and arrays, which often removes the need for JOINs
- You do not need to create a database or collection explicitly -- they are created automatically when you first insert data
// In mongosh: switch to "myapp" (nothing is created until data is written)
use myapp
// This insert creates the "myapp" database AND the "users" collection automatically
db.users.insertOne({ name: "Alice", age: 25 });
4. Schema-less / Flexible Schema
One of MongoDB's most powerful features is its flexible schema. Documents in the same collection do not need to share the same structure.
// Document 1: A regular user
{
"_id": ObjectId("..."),
"name": "Alice",
"email": "alice@example.com",
"age": 25
}
// Document 2: An admin user with extra fields
{
"_id": ObjectId("..."),
"name": "Bob",
"email": "bob@example.com",
"role": "admin",
"permissions": ["read", "write", "delete"],
"department": "Engineering"
}
// Document 3: A minimal user
{
"_id": ObjectId("..."),
"name": "Charlie"
}
All three documents live in the same collection -- perfectly valid.
Pros of Flexible Schema
- Fast iteration -- add or remove fields without migration scripts
- Natural modeling -- different document shapes for different use cases
- No downtime -- schema changes do not require altering tables
- Polymorphic data -- store different "types" in the same collection
Cons of Flexible Schema
- No enforcement by default -- unless you add validation rules, the application must ensure data consistency
- Inconsistent data -- without discipline, documents can become messy
- Query complexity -- querying fields that may or may not exist
Best practice: Use Mongoose (covered in 3.8.e) to enforce schemas at the application level while keeping MongoDB's flexibility for evolution.
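Application-level enforcement boils down to checking each document against a set of rules before it is written. The sketch below does this by hand. The `userRules` object and `validate` function are hypothetical and are not Mongoose's API; they only illustrate the kind of check that a schema library automates.

```javascript
// Hypothetical rule set: field name -> { type, required }.
const userRules = {
  name:  { type: "string", required: true },
  email: { type: "string", required: true },
  age:   { type: "number", required: false },
};

// Return a list of problems; an empty list means the document passes.
function validate(doc, rules) {
  const errors = [];
  for (const [field, rule] of Object.entries(rules)) {
    const value = doc[field];
    if (value === undefined) {
      if (rule.required) errors.push(`${field} is required`);
    } else if (typeof value !== rule.type) {
      errors.push(`${field} must be a ${rule.type}`);
    }
  }
  return errors;
}

console.log(validate({ name: "Alice", email: "alice@example.com", age: 25 }, userRules)); // []
console.log(validate({ name: "Charlie" }, userRules)); // [ 'email is required' ]
console.log(validate({ name: "Bob", email: "bob@example.com", age: "30" }, userRules)); // [ 'age must be a number' ]
```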
5. MongoDB vs SQL Terminology
| SQL Concept | MongoDB Equivalent | Notes |
|---|---|---|
| Database | Database | Same concept |
| Table | Collection | No fixed schema |
| Row | Document | JSON-like, can be nested |
| Column | Field | Can hold any type, including arrays/objects |
| Primary Key (id) | _id (ObjectId) | Auto-generated if omitted; unique within a collection |
| Foreign Key | Manual reference (ObjectId) | No built-in enforcement |
| JOIN | $lookup / Mongoose populate() | Used less often, because related data is frequently embedded |
| INDEX | Index | Similar concept, same performance benefits |
| VIEW | View | Aggregation-based read-only collections |
| Transaction | Multi-document Transaction | Supported since MongoDB 4.0 |
| Schema / DDL | Validation rules (optional) | Database-level via $jsonSchema validators, or application-level via Mongoose |
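A "manual reference" simply means that one document stores another document's `_id`, and something else (the application, `$lookup`, or Mongoose `populate()`) resolves it. The sketch below imitates that resolution with in-memory arrays. The sample data and the `resolveUser` helper are hypothetical, and short strings stand in for ObjectIds.

```javascript
const users = [
  { _id: "u1", name: "Alice" },
  { _id: "u2", name: "Bob" },
];
const orders = [
  { _id: "o1", userId: "u1", total: 59.99 },
  { _id: "o2", userId: "u9", total: 12.5 }, // dangling reference: nothing prevents this
];

// Attach the referenced user document to an order (null if no user matches).
function resolveUser(order, userList) {
  const user = userList.find((u) => u._id === order.userId) ?? null;
  return { ...order, user };
}

const resolved = orders.map((o) => resolveUser(o, users));
console.log(resolved[0].user.name); // Alice
console.log(resolved[1].user);      // null
```

The second order shows what "no built-in enforcement" means in practice: a reference can point at a document that does not exist, so the application has to handle that case.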
6. Key MongoDB Features
6.1 Horizontal Scaling (Sharding)
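MongoDB can distribute a collection's data across multiple servers (shards) based on a shard key you choose. The mongos router sends each request to the shard that holds the relevant data.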
Client Request
|
mongos (Router)
|
+---------+---------+---------+
| Shard 1 | Shard 2 | Shard 3 |
|   A-G   |   H-N   |   O-Z   |
+---------+---------+---------+
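The diagram corresponds to range-based sharding. A toy version of the routing decision might look like the sketch below, assuming a hypothetical shard key based on the first letter of a name and the three ranges from the diagram. Real clusters split data into chunks and rebalance them over time, which this sketch does not attempt.

```javascript
// Hypothetical ranges mirroring the diagram: inclusive first-letter bounds.
const shardRanges = [
  { shard: "Shard 1", from: "A", to: "G" },
  { shard: "Shard 2", from: "H", to: "N" },
  { shard: "Shard 3", from: "O", to: "Z" },
];

// Pick the shard whose range contains the first letter of the shard key value.
function routeToShard(name) {
  const letter = name.charAt(0).toUpperCase();
  const match = shardRanges.find((r) => letter >= r.from && letter <= r.to);
  if (!match) throw new RangeError(`No shard covers key "${name}"`);
  return match.shard;
}

console.log(routeToShard("Alice")); // Shard 1
console.log(routeToShard("Henry")); // Shard 2
console.log(routeToShard("zoe"));   // Shard 3
```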
6.2 Replication (High Availability)
MongoDB replicates data across multiple servers in a replica set for fault tolerance.
Primary Node (reads + writes)
|
+--> Secondary Node 1 (read replicas)
+--> Secondary Node 2 (read replicas)
If Primary fails --> automatic failover to a Secondary
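A real replica set chooses a new primary through a voting protocol among its members. The toy sketch below captures only the outcome: when the primary is down, a healthy secondary takes over. The member shape (`healthy`, `lastWrite`) and the rule of preferring the most up-to-date secondary are simplifying assumptions made for illustration.

```javascript
// Toy failover: promote the healthy secondary with the most recent write.
function electPrimary(members) {
  const current = members.find((m) => m.role === "primary" && m.healthy);
  if (current) return current.name; // primary is fine, nothing to do

  const candidates = members.filter((m) => m.role === "secondary" && m.healthy);
  if (candidates.length === 0) return null; // no member can take over
  candidates.sort((a, b) => b.lastWrite - a.lastWrite);
  return candidates[0].name;
}

const replicaSet = [
  { name: "node-a", role: "primary",   healthy: false, lastWrite: 105 },
  { name: "node-b", role: "secondary", healthy: true,  lastWrite: 103 },
  { name: "node-c", role: "secondary", healthy: true,  lastWrite: 105 },
];

console.log(electPrimary(replicaSet)); // node-c
```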
6.3 Aggregation Framework
A pipeline of stages, each transforming the documents passed on by the previous stage, used for data transformation and analytics.
db.orders.aggregate([
{ $match: { status: "completed" } },
{ $group: { _id: "$customerId", totalSpent: { $sum: "$amount" } } },
{ $sort: { totalSpent: -1 } },
{ $limit: 10 }
]);
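Each stage receives the output of the stage before it. To make that concrete, here is the same pipeline written against a plain in-memory array. The sample `orders` data is made up, and the code only imitates what the server does for `$match`, `$group`, `$sort`, and `$limit`.

```javascript
const orders = [
  { customerId: "c1", amount: 40, status: "completed" },
  { customerId: "c2", amount: 75, status: "completed" },
  { customerId: "c1", amount: 60, status: "completed" },
  { customerId: "c3", amount: 99, status: "pending" },
];

// $match: keep completed orders only
const matched = orders.filter((o) => o.status === "completed");

// $group: sum amount per customerId
const totals = new Map();
for (const o of matched) {
  totals.set(o.customerId, (totals.get(o.customerId) ?? 0) + o.amount);
}
const grouped = [...totals].map(([_id, totalSpent]) => ({ _id, totalSpent }));

// $sort: descending by totalSpent, then $limit: first 10 results
const topSpenders = grouped
  .sort((a, b) => b.totalSpent - a.totalSpent)
  .slice(0, 10);

console.log(topSpenders); // c1 with totalSpent 100, then c2 with totalSpent 75
```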
6.4 Indexing
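Indexes let MongoDB find matching documents without scanning the whole collection, at the cost of extra storage and slightly slower writes.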
// Create an index on the email field
db.users.createIndex({ email: 1 });
// Compound index
db.orders.createIndex({ userId: 1, createdAt: -1 });
// Text index for search
db.articles.createIndex({ title: "text", body: "text" });
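Without an index, a query has to examine every document (a collection scan). An index is a separate structure that maps field values to documents. MongoDB's indexes are B-trees; the sketch below substitutes a JavaScript `Map`, which is enough to show the idea for exact-match lookups but says nothing about range queries or sorting.

```javascript
const users = [];
for (let i = 0; i < 100000; i++) {
  users.push({ _id: i, email: `user${i}@example.com` });
}

// Collection scan: check documents one by one until a match is found.
function findByScan(email) {
  let examined = 0;
  for (const user of users) {
    examined++;
    if (user.email === email) return { user, examined };
  }
  return { user: null, examined };
}

// Stand-in for an index: built once, maps each email to its document.
const emailIndex = new Map(users.map((u) => [u.email, u]));
function findByIndex(email) {
  // A single keyed lookup replaces the scan, so one entry is examined.
  return { user: emailIndex.get(email) ?? null, examined: 1 };
}

console.log(findByScan("user99999@example.com").examined);  // 100000
console.log(findByIndex("user99999@example.com").examined); // 1
```

The index has to be kept up to date on every insert, update, and delete, which is why indexing every field is not free.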
6.5 Change Streams
Watch for real-time changes to your data. Change streams require a replica set or a sharded cluster.
const changeStream = db.collection("orders").watch();
changeStream.on("change", (change) => {
console.log("Order changed:", change);
});
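Change streams follow the familiar event-emitter pattern. The sketch below imitates that pattern with Node's built-in `EventEmitter` and a hypothetical `ToyCollection` class, so it runs without a database. Real change events carry more information than the simplified shape used here.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical in-memory collection that emits a "change" event on every insert.
class ToyCollection extends EventEmitter {
  constructor() {
    super();
    this.docs = [];
  }
  insertOne(doc) {
    this.docs.push(doc);
    // Field names mirror real change events, but the shape is simplified.
    this.emit("change", { operationType: "insert", fullDocument: doc });
  }
}

const orders = new ToyCollection();
const seen = [];
orders.on("change", (change) => {
  seen.push(change.operationType);
  console.log("Order changed:", change.fullDocument);
});

orders.insertOne({ item: "Widget", total: 59.99 });
```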
7. When MongoDB Shines
| Use Case | Why MongoDB Excels |
|---|---|
| Rapid prototyping | No schema setup needed; start inserting data immediately |
| Content management | Articles, blog posts, media with varying metadata |
| User profiles | Different users may have different fields |
| Real-time analytics | Aggregation framework + horizontal scaling |
| IoT / Sensor data | High write throughput, flexible document structure |
| Mobile backends | MongoDB Realm (later Atlas Device Sync) offered offline-first sync; MongoDB has announced its deprecation, so check its current status |
| E-commerce catalogs | Products with vastly different attributes |
| Logging and events | Append-heavy workloads with capped collections |
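Capped collections, mentioned in the last row, are fixed-size collections that preserve insertion order and overwrite the oldest documents once they are full. The sketch below imitates that behavior with a hypothetical `CappedLog` class limited by document count; real capped collections are limited primarily by size in bytes.

```javascript
// Hypothetical capped log: keeps at most `max` entries, dropping the oldest first.
class CappedLog {
  constructor(max) {
    this.max = max;
    this.entries = [];
  }
  insert(entry) {
    this.entries.push(entry);
    if (this.entries.length > this.max) this.entries.shift(); // evict oldest
  }
}

const log = new CappedLog(3);
["boot", "login", "click", "logout"].forEach((event) => log.insert({ event }));

console.log(log.entries.map((e) => e.event)); // [ 'login', 'click', 'logout' ]
```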
8. When NOT to Use MongoDB
| Scenario | Why MongoDB May Struggle | Better Alternative |
|---|---|---|
| Complex transactions | Multi-document transactions are supported but add overhead | PostgreSQL, MySQL |
| Heavy JOIN operations | Joins go through $lookup, which is generally less efficient than SQL JOINs | PostgreSQL |
| Strict data integrity | No foreign key enforcement at the database level | PostgreSQL, MySQL |
| Financial ledgers | Ledgers lean heavily on multi-record ACID transactions and constraints, where relational databases are more mature | PostgreSQL |
| Deeply relational data | Multiple levels of relationships are awkward | PostgreSQL, Neo4j |
| Small, fixed-schema data | Overhead of BSON is unnecessary | SQLite, PostgreSQL |
Rule of thumb: If your data naturally fits into a spreadsheet with rigid columns and many cross-references, SQL may be better. If your data is naturally nested, varied, or document-like, MongoDB is likely a great fit.
9. MongoDB Editions
| Edition | Cost | Best For | Key Features |
|---|---|---|---|
| Community Server | Free | Learning, development, small projects | Full database engine, mongosh, local hosting |
| MongoDB Atlas | Free tier + paid tiers | Production cloud deployments | Managed service, auto-scaling, backups, monitoring |
| Enterprise Server | Paid license | Large organizations | LDAP auth, auditing, encryption at rest, support |
MongoDB Atlas (Recommended for Learning)
Atlas provides a free tier (M0) with:
- 512 MB storage
- Shared cluster (shared RAM and vCPU)
- A 3-node replica set
- Connection from anywhere
- Built-in monitoring
- No credit card required
Atlas Free Tier Limits:
- Storage capped at 512 MB
- Limited concurrent connections (500 per the Atlas documentation at the time of writing; verify the current figure)
- No backups
10. Brief History and Ecosystem
Timeline
| Year | Milestone |
|---|---|
| 2007 | Development begins at 10gen (now MongoDB, Inc.) |
| 2009 | First public release (v1.0) |
| 2013 | MongoDB 2.4 -- text search, hashed indexes |
| 2015 | MongoDB 3.0 -- WiredTiger storage engine (major performance boost) |
| 2017 | MongoDB IPO on NASDAQ |
| 2018 | MongoDB 4.0 -- multi-document ACID transactions |
| 2020 | MongoDB 4.4 -- refinable shard keys, hedged reads |
| 2022 | MongoDB 6.0 -- queryable encryption, time-series improvements |
| 2023 | MongoDB 7.0 -- metadata encryption, improved sharding |
MongoDB Ecosystem
MongoDB Ecosystem:
|
+-- mongosh (CLI shell)
+-- Compass (GUI client)
+-- Atlas (Cloud platform)
+-- Realm (Mobile database + sync)
+-- Charts (Data visualization)
+-- Connectors (Spark, Kafka, BI tools)
+-- Drivers (Node.js, Python, Java, Go, C#, Ruby, PHP, ...)
+-- Mongoose (Node.js ODM -- covered in 3.8.e)
+-- Atlas Search (Full-text search powered by Lucene)
+-- Atlas Data Lake (Query data in S3 with MQL)
The MongoDB Driver for Node.js
// Native MongoDB driver (low-level)
const { MongoClient } = require('mongodb');

async function main() {
  const client = new MongoClient('mongodb://localhost:27017');
  try {
    await client.connect();
    const db = client.db('myapp');
    const users = db.collection('users');
    await users.insertOne({ name: 'Alice', age: 25 });
    const user = await users.findOne({ name: 'Alice' });
    console.log(user);
  } finally {
    await client.close(); // always release the connection
  }
}

main().catch(console.error);
In this course, we primarily use Mongoose (an abstraction layer on top of the native driver) for its schema validation, middleware, and developer-friendly API.
11. Key Takeaways
- MongoDB is a document-oriented NoSQL database that stores data in flexible, JSON-like (BSON) documents.
- The hierarchy is: Database > Collection > Document > Field.
- MongoDB is schema-flexible -- documents in the same collection can have different fields.
- Key features include horizontal scaling (sharding), replication, aggregation, and indexing.
- MongoDB excels at rapid prototyping, flexible schemas, and high-throughput workloads.
- Consider a relational database instead when you need complex transactions, heavy JOINs, or strictly relational data.
- MongoDB Atlas offers a free cloud tier that is well suited to learning and small projects.
- The Mongoose ODM adds schema enforcement and developer convenience to MongoDB in Node.js.
12. Explain-It Challenge
You are building a new social media app. Your CTO asks: "Should we use MongoDB or PostgreSQL for our main database?" Write a short analysis covering:
- What type of data the app will store (users, posts, comments, likes)
- Which data is relational and which is document-like
- Your recommendation and reasoning
- Whether you might use both (polyglot persistence)