Episode 3 — NodeJS MongoDB Backend Architecture / 3.8 — Database Basics MongoDB

3.8.b — Introduction to MongoDB

MongoDB is a document-oriented NoSQL database that stores data in flexible, JSON-like documents. It is the "M" in the MERN stack and one of the most popular databases for JavaScript developers.




Table of Contents

  1. What Is MongoDB?
  2. JSON-Like Documents (BSON)
  3. Database Hierarchy
  4. Schema-less / Flexible Schema
  5. MongoDB vs SQL Terminology
  6. Key MongoDB Features
  7. When MongoDB Shines
  8. When NOT to Use MongoDB
  9. MongoDB Editions
  10. Brief History and Ecosystem
  11. Key Takeaways
  12. Explain-It Challenge

1. What Is MongoDB?

MongoDB is an open-source, cross-platform, document-oriented NoSQL database. Instead of storing data in rows and columns like a traditional relational database, MongoDB stores data as documents -- flexible, JSON-like data structures.

Traditional SQL:
+----+----------+---------------------+-----+
| id | name     | email               | age |
+----+----------+---------------------+-----+
|  1 | Alice    | alice@example.com   |  25 |
+----+----------+---------------------+-----+

MongoDB Document:
{
  "_id": ObjectId("64a1b2c3d4e5f6a7b8c9d0e1"),
  "name": "Alice",
  "email": "alice@example.com",
  "age": 25,
  "hobbies": ["reading", "hiking"],
  "address": {
    "city": "San Francisco",
    "state": "CA"
  }
}
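
The _id value above is an ObjectId: 12 bytes shown as 24 hex characters, with the creation time (seconds since the Unix epoch) stored in the first 4 bytes. A minimal sketch in plain JavaScript of reading that time back; the helper name objectIdTimestamp is made up for illustration, and the official Node.js driver offers the same information through ObjectId's getTimestamp() method:

```javascript
// Hypothetical helper: decode the creation time embedded in an ObjectId hex string.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new TypeError("Expected a 24-character hex string");
  }
  // The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const createdAt = objectIdTimestamp("64a1b2c3d4e5f6a7b8c9d0e1");
console.log(createdAt.toISOString()); // 2023-07-02T17:24:19.000Z
```

This is why documents can often be sorted by _id to get approximate insertion order without a separate createdAt field.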

Key characteristics:

  • Document-oriented -- data is stored as documents, not rows
  • Schema-flexible -- documents in the same collection can have different fields
  • High performance -- optimized for read/write speed
  • Horizontally scalable -- add more servers instead of upgrading one
  • Developer-friendly -- works naturally with JavaScript and JSON

2. JSON-Like Documents (BSON)

MongoDB documents look like JSON but are stored internally as BSON (Binary JSON).

JSON vs BSON

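| Feature    | JSON | BSON |
|------------|------|------|
| Format     | Text-based | Binary-encoded |
| Data types | String, Number, Boolean, Array, Object, null | All JSON types + Date, ObjectId, Int32, Int64, Decimal128, Binary, Regex |
| Size       | Compact text | Comparable; sometimes larger because of type and length prefixes |
| Speed      | Must be parsed as text | Faster to scan (length prefixes let a reader skip fields) |
| Use        | Data interchange (APIs) | Internal MongoDB storage |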

Why BSON?

JSON (what you write):          BSON (how MongoDB stores it):
{                               \x48\x00\x00\x00          (document size)
  "name": "Alice",              \x02 name\x00 ...          (string type)
  "age": 25,                    \x10 age\x00 \x19\x00...   (int32 type)
  "joined": "2025-01-15"        \x09 joined\x00 ...        (date type, only if sent as a Date object; a plain string stays a string)
}                               \x00                        (end marker)

BSON gives MongoDB:

  • Richer data types (Date, ObjectId, Binary, Decimal128)
  • Faster traversal (binary encoding allows skipping fields)
  • Low overhead (a compact binary layout, though not always smaller than the equivalent JSON text)

In practice: You write JSON-like objects, and the MongoDB driver converts them to BSON automatically. You rarely need to encode or decode BSON by hand.
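
To make the byte layout concrete, here is a minimal hand-rolled encoder. It is only a sketch: the function name encodeBson is invented for this example, it supports just string and int32 fields, and real applications should rely on the driver's own BSON library instead.

```javascript
// Minimal BSON encoder sketch: supports only string and int32 fields.
function encodeBson(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const name = Buffer.from(key + "\0", "utf8"); // field name as a C string
    if (typeof value === "string") {
      const text = Buffer.from(value + "\0", "utf8");
      const len = Buffer.alloc(4);
      len.writeInt32LE(text.length); // length includes the trailing NUL
      parts.push(Buffer.from([0x02]), name, len, text); // 0x02 = string
    } else if (Number.isInteger(value) && value >= -(2 ** 31) && value < 2 ** 31) {
      const num = Buffer.alloc(4);
      num.writeInt32LE(value);
      parts.push(Buffer.from([0x10]), name, num); // 0x10 = int32
    } else {
      throw new TypeError(`Unsupported value for field "${key}"`);
    }
  }
  const body = Buffer.concat(parts);
  const size = Buffer.alloc(4);
  size.writeInt32LE(4 + body.length + 1); // size prefix + elements + terminator
  return Buffer.concat([size, body, Buffer.from([0x00])]);
}

const bytes = encodeBson({ name: "Alice", age: 25 });
console.log(bytes.length, bytes.toString("hex"));
```

For this tiny document the encoding is 30 bytes, while JSON.stringify produces 25 characters. BSON's practical advantages are its richer types and fast scanning rather than guaranteed size savings.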


3. Database Hierarchy

MongoDB organizes data in a three-level hierarchy:

MongoDB Server
  |
  +-- Database: "myapp"
  |     |
  |     +-- Collection: "users"
  |     |     +-- Document: { name: "Alice", ... }
  |     |     +-- Document: { name: "Bob", ... }
  |     |     +-- Document: { name: "Charlie", ... }
  |     |
  |     +-- Collection: "products"
  |     |     +-- Document: { title: "Widget", ... }
  |     |     +-- Document: { title: "Gadget", ... }
  |     |
  |     +-- Collection: "orders"
  |           +-- Document: { userId: "...", total: 59.99 }
  |
  +-- Database: "admin" (system)
  +-- Database: "local" (system)

| Level      | SQL Equivalent | Description |
|------------|----------------|-------------|
| Database   | Database       | Top-level container; an app typically has one |
| Collection | Table          | A grouping of related documents |
| Document   | Row            | A single data record (JSON-like object) |
| Field      | Column         | A key-value pair within a document |

Key Differences from SQL

  • A collection does not enforce a fixed structure -- documents can vary
  • A document can contain nested objects and arrays (often removing the need for JOINs)
  • You do not need to create a database or collection explicitly -- they are created automatically when you first insert data

// In mongosh: switch to "myapp", then insert.
// The first insert creates the "myapp" database AND the "users" collection.
use myapp
db.users.insertOne({ name: "Alice", age: 25 });

4. Schema-less / Flexible Schema

One of MongoDB's most powerful features is its flexible schema. Documents in the same collection do not need to share the same structure.

// Document 1: A regular user
{
  "_id": ObjectId("..."),
  "name": "Alice",
  "email": "alice@example.com",
  "age": 25
}

// Document 2: An admin user with extra fields
{
  "_id": ObjectId("..."),
  "name": "Bob",
  "email": "bob@example.com",
  "role": "admin",
  "permissions": ["read", "write", "delete"],
  "department": "Engineering"
}

// Document 3: A minimal user
{
  "_id": ObjectId("..."),
  "name": "Charlie"
}

All three documents live in the same collection -- perfectly valid.

Pros of Flexible Schema

  • Fast iteration -- add or remove fields without migration scripts
  • Natural modeling -- different document shapes for different use cases
  • No downtime -- schema changes do not require altering tables
  • Polymorphic data -- store different "types" in the same collection

Cons of Flexible Schema

  • No enforcement by default -- unless you add validation rules, the application must ensure data consistency
  • Inconsistent data -- without discipline, documents can become messy
  • Query complexity -- querying fields that may or may not exist

Best practice: Use Mongoose (covered in 3.8.e) to enforce schemas at the application level while keeping MongoDB's flexibility for evolution.
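
As a rough illustration of what "enforcing a schema at the application level" means, the sketch below checks documents against a hand-written rule set before they would be saved. The userSchema object and the validate function are hypothetical stand-ins invented for this example; Mongoose provides a far more complete version of the same idea.

```javascript
// Hypothetical schema description: field name -> { type, required }
const userSchema = {
  name: { type: "string", required: true },
  email: { type: "string", required: false },
  age: { type: "number", required: false },
};

// Returns a list of human-readable problems; an empty list means the document passes.
function validate(doc, schema) {
  const errors = [];
  for (const [field, rule] of Object.entries(schema)) {
    const value = doc[field];
    if (value === undefined) {
      if (rule.required) errors.push(`${field} is required`);
    } else if (typeof value !== rule.type) {
      errors.push(`${field} must be a ${rule.type}`);
    }
  }
  return errors;
}

console.log(validate({ name: "Charlie" }, userSchema));           // []
console.log(validate({ email: "bob@example.com" }, userSchema));  // [ 'name is required' ]
console.log(validate({ name: "Alice", age: "25" }, userSchema));  // [ 'age must be a number' ]
```

Fields not listed in the schema (such as role or permissions) pass through untouched here, which keeps the flexibility described above while still guarding the fields you care about.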


5. MongoDB vs SQL Terminology

| SQL Concept      | MongoDB Equivalent            | Notes |
|------------------|-------------------------------|-------|
| Database         | Database                      | Same concept |
| Table            | Collection                    | No fixed schema |
| Row              | Document                      | JSON-like, can be nested |
| Column           | Field                         | Can hold any type, including arrays/objects |
| Primary Key (id) | _id (ObjectId)                | Auto-generated if omitted; designed to be globally unique |
| Foreign Key      | Manual reference (ObjectId)   | No built-in enforcement |
| JOIN             | $lookup / Mongoose populate() | Less common; related data is often embedded instead |
| INDEX            | Index                         | Similar concept, same performance benefits |
| VIEW             | View                          | Aggregation-based read-only collections |
| Transaction      | Multi-document Transaction    | Supported since MongoDB 4.0 |
| Schema / DDL     | Validation rules (optional)   | $jsonSchema validators in the database, or application-level schemas via Mongoose |
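
The "Foreign Key" and "JOIN" rows are easier to see with data. The sketch below uses plain in-memory arrays, with short string ids standing in for ObjectIds, to contrast an embedded document with a manual reference. The resolveUsers helper is hypothetical and only approximates what $lookup or populate() does for you:

```javascript
// Embedded: the address lives inside the user document, so one read returns everything.
const embeddedUser = { _id: "u1", name: "Alice", address: { city: "San Francisco", state: "CA" } };

// Referenced: each order stores only the user's _id (a manual reference).
const users = [{ _id: "u1", name: "Alice" }, { _id: "u2", name: "Bob" }];
const orders = [
  { _id: "o1", userId: "u1", total: 59.99 },
  { _id: "o2", userId: "u2", total: 12.5 },
];

// Hypothetical helper: swap userId for the matching user document.
// Nothing stops an order from pointing at a missing user, hence the null fallback.
function resolveUsers(orderDocs, userDocs) {
  const byId = new Map(userDocs.map((user) => [user._id, user]));
  return orderDocs.map(({ userId, ...rest }) => ({ ...rest, user: byId.get(userId) ?? null }));
}

const joined = resolveUsers(orders, users);
console.log(embeddedUser.address.city); // San Francisco
console.log(joined[0].user.name);       // Alice
```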

6. Key MongoDB Features

6.1 Horizontal Scaling (Sharding)

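MongoDB can distribute data across multiple servers (shards). Once you enable sharding and choose a shard key, the mongos router sends each request to the shard that holds the matching range of keys.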

Client Request
     |
  mongos (Router)
     |
  +---------+---------+---------+
  | Shard 1 | Shard 2 | Shard 3 |
  | A-G     | H-N     | O-Z     |
  +---------+---------+---------+
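
The routing step in the diagram can be sketched in a few lines. This is a toy model of range-based routing that assumes the shard key is a name and uses the A-G / H-N / O-Z split shown above; the shards array and routeToShard function are invented for illustration and are not how mongos is implemented internally.

```javascript
// Key ranges copied from the diagram; each shard owns an inclusive range of initials.
const shards = [
  { name: "Shard 1", min: "A", max: "G" },
  { name: "Shard 2", min: "H", max: "N" },
  { name: "Shard 3", min: "O", max: "Z" },
];

// Hypothetical router: pick the shard whose range contains the key's first letter.
function routeToShard(key) {
  const initial = key.charAt(0).toUpperCase();
  const shard = shards.find((s) => initial >= s.min && initial <= s.max);
  if (!shard) throw new RangeError(`No shard covers key "${key}"`);
  return shard.name;
}

console.log(routeToShard("Alice"));  // Shard 1
console.log(routeToShard("miller")); // Shard 2
console.log(routeToShard("Zed"));    // Shard 3
```

A query that includes the shard key can be sent to one shard; a query without it has to be broadcast to all of them, which is why shard key choice matters.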

6.2 Replication (High Availability)

MongoDB replicates data across multiple servers in a replica set for fault tolerance.

Primary Node (reads + writes)
     |
     +--> Secondary Node 1 (read replicas)
     +--> Secondary Node 2 (read replicas)

If Primary fails --> automatic failover to a Secondary
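
A toy model of that failover step, under heavy simplifying assumptions: the members array, the lastWrite counter, and the failover function are all hypothetical, and the sketch simply promotes the healthy secondary with the most recent write. Real replica set elections also involve majority voting and member priorities, which this ignores.

```javascript
// Hypothetical member records; lastWrite stands in for how current each copy is.
const members = [
  { host: "node-a", role: "primary", healthy: false, lastWrite: 120 },
  { host: "node-b", role: "secondary", healthy: true, lastWrite: 118 },
  { host: "node-c", role: "secondary", healthy: true, lastWrite: 120 },
];

function failover(current) {
  const primary = current.find((m) => m.role === "primary");
  if (primary && primary.healthy) return current; // nothing to do

  const candidates = current.filter((m) => m.role === "secondary" && m.healthy);
  if (candidates.length === 0) throw new Error("No healthy secondary available");

  // Promote the candidate with the most recent write.
  const next = candidates.reduce((best, m) => (m.lastWrite > best.lastWrite ? m : best));
  return current.map((m) => {
    if (m === next) return { ...m, role: "primary" };
    if (m === primary) return { ...m, role: "secondary" };
    return m;
  });
}

const afterFailover = failover(members);
console.log(afterFailover.find((m) => m.role === "primary").host); // node-c
```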

6.3 Aggregation Framework

A powerful pipeline for data transformations and analytics.

db.orders.aggregate([
  { $match: { status: "completed" } },
  { $group: { _id: "$customerId", totalSpent: { $sum: "$amount" } } },
  { $sort: { totalSpent: -1 } },
  { $limit: 10 }
]);
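
To show what each stage does, here is a rough plain-JavaScript equivalent of the same pipeline running over an in-memory array. The sample orders and the topSpenders name are invented for illustration; the real pipeline runs inside the database server and can use indexes.

```javascript
// Made-up sample data with the fields the pipeline expects.
const orders = [
  { customerId: "c1", status: "completed", amount: 40 },
  { customerId: "c2", status: "completed", amount: 25 },
  { customerId: "c1", status: "completed", amount: 10 },
  { customerId: "c3", status: "pending", amount: 99 },
];

function topSpenders(orderDocs, limit = 10) {
  const totals = new Map();
  for (const order of orderDocs) {
    if (order.status !== "completed") continue; // $match
    const soFar = totals.get(order.customerId) ?? 0;
    totals.set(order.customerId, soFar + order.amount); // $group with $sum
  }
  return [...totals]
    .map(([_id, totalSpent]) => ({ _id, totalSpent }))
    .sort((a, b) => b.totalSpent - a.totalSpent) // $sort: { totalSpent: -1 }
    .slice(0, limit); // $limit
}

console.log(topSpenders(orders));
// [ { _id: 'c1', totalSpent: 50 }, { _id: 'c2', totalSpent: 25 } ]
```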

6.4 Indexing

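Indexes let MongoDB find matching documents without scanning the whole collection, at the cost of extra storage and slightly slower writes.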

// Create an index on the email field
db.users.createIndex({ email: 1 });

// Compound index
db.orders.createIndex({ userId: 1, createdAt: -1 });

// Text index for search
db.articles.createIndex({ title: "text", body: "text" });
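
Why an index helps can be shown with a small in-memory analogy. The sketch below compares checking every document against jumping straight to a match through a prebuilt lookup structure. It uses a JavaScript Map purely for illustration (MongoDB's indexes are B-tree structures, not hash maps), and the function names are made up for this example.

```javascript
const users = [
  { _id: 1, name: "Alice", email: "alice@example.com" },
  { _id: 2, name: "Bob", email: "bob@example.com" },
  { _id: 3, name: "Charlie", email: "charlie@example.com" },
];

// Without an index: check documents one by one until one matches (a collection scan).
function findByEmailScan(docs, email) {
  return docs.find((doc) => doc.email === email) ?? null;
}

// With an index: build a lookup structure once, then go directly to the match.
function buildEmailIndex(docs) {
  return new Map(docs.map((doc) => [doc.email, doc]));
}

const emailIndex = buildEmailIndex(users);
const viaScan = findByEmailScan(users, "charlie@example.com");
const viaIndex = emailIndex.get("charlie@example.com") ?? null;
console.log(viaScan === viaIndex); // true
```

Both lookups return the same document; the difference is how much work it takes to get there as the collection grows, and that the index must be kept up to date on every write.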

6.5 Change Streams

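Watch for real-time changes to your data. Change streams require a replica set or sharded cluster; a standalone server does not support them.

// Node.js driver syntax; `db` is the handle returned by client.db(...)
const changeStream = db.collection("orders").watch();
changeStream.on("change", (change) => {
  console.log("Order changed:", change);
});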

7. When MongoDB Shines

| Use Case            | Why MongoDB Excels |
|---------------------|--------------------|
| Rapid prototyping   | No schema setup needed; start inserting data immediately |
| Content management  | Articles, blog posts, media with varying metadata |
| User profiles       | Different users may have different fields |
| Real-time analytics | Aggregation framework + horizontal scaling |
| IoT / Sensor data   | High write throughput, flexible document structure |
| Mobile backends     | MongoDB Realm provides offline-first sync |
| E-commerce catalogs | Products with vastly different attributes |
| Logging and events  | Append-heavy workloads with capped collections |

8. When NOT to Use MongoDB

| Scenario                 | Why MongoDB May Struggle | Better Alternative |
|--------------------------|--------------------------|--------------------|
| Complex transactions     | Multi-document transactions are supported but add overhead | PostgreSQL, MySQL |
| Heavy JOIN operations    | No native JOINs; $lookup is less efficient | PostgreSQL |
| Strict data integrity    | No foreign key enforcement at the database level | PostgreSQL, MySQL |
| Financial ledgers        | Every change needs strict multi-record ACID guarantees, which in MongoDB means explicit transactions | PostgreSQL |
| Deeply relational data   | Multiple levels of relationships are awkward | PostgreSQL, Neo4j |
| Small, fixed-schema data | Overhead of BSON is unnecessary | SQLite, PostgreSQL |

Rule of thumb: If your data naturally fits into a spreadsheet with rigid columns and many cross-references, SQL may be better. If your data is naturally nested, varied, or document-like, MongoDB is likely a great fit.


9. MongoDB Editions

| Edition           | Cost                   | Best For | Key Features |
|-------------------|------------------------|----------|--------------|
| Community Server  | Free                   | Learning, development, small projects | Full database engine, mongosh, local hosting |
| MongoDB Atlas     | Free tier + paid tiers | Production cloud deployments | Managed service, auto-scaling, backups, monitoring |
| Enterprise Server | Paid license           | Large organizations | LDAP auth, auditing, encryption at rest, support |

MongoDB Atlas (Recommended for Learning)

Atlas provides a free tier (M0) with:

  • 512 MB storage
  • Shared cluster
  • Connection from anywhere
  • Built-in monitoring
  • No credit card required

Atlas Free Tier Limits (these change over time; check the Atlas documentation for current values):
  - 512 MB storage
  - Shared RAM and vCPU
  - 100 max connections
  - No backups (free tier)
  - 3 replica set nodes

10. Brief History and Ecosystem

Timeline

| Year | Milestone |
|------|-----------|
| 2007 | Development begins at 10gen (now MongoDB, Inc.) |
| 2009 | First public release (v1.0) |
| 2013 | MongoDB 2.4 -- text search, hashed indexes |
| 2015 | MongoDB 3.0 -- WiredTiger storage engine (major performance boost) |
| 2017 | MongoDB IPO on NASDAQ |
| 2018 | MongoDB 4.0 -- multi-document ACID transactions |
| 2020 | MongoDB 4.4 -- refinable shard keys, hedged reads |
| 2022 | MongoDB 6.0 -- queryable encryption, time-series improvements |
| 2023 | MongoDB 7.0 -- metadata encryption, improved sharding |

MongoDB Ecosystem

MongoDB Ecosystem:
  |
  +-- mongosh          (CLI shell)
  +-- Compass          (GUI client)
  +-- Atlas            (Cloud platform)
  +-- Realm            (Mobile database + sync)
  +-- Charts           (Data visualization)
  +-- Connectors       (Spark, Kafka, BI tools)
  +-- Drivers          (Node.js, Python, Java, Go, C#, Ruby, PHP, ...)
  +-- Mongoose         (Node.js ODM -- covered in 3.8.e)
  +-- Atlas Search     (Full-text search powered by Lucene)
  +-- Atlas Data Lake  (Query data in S3 with MQL)

The MongoDB Driver for Node.js

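// Native MongoDB driver (low-level)
const { MongoClient } = require('mongodb');

// `await` is not allowed at the top level of a CommonJS file, so wrap the calls
async function main() {
  const client = new MongoClient('mongodb://localhost:27017');
  try {
    await client.connect();

    const db = client.db('myapp');
    const users = db.collection('users');

    await users.insertOne({ name: 'Alice', age: 25 });
    const user = await users.findOne({ name: 'Alice' });
    console.log(user);
  } finally {
    // Always release the connection, even if a query throws
    await client.close();
  }
}

main().catch(console.error);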

In this course, we primarily use Mongoose (an abstraction layer on top of the native driver) for its schema validation, middleware, and developer-friendly API.


11. Key Takeaways

  • MongoDB is a document-oriented NoSQL database that stores data in flexible, JSON-like (BSON) documents.
  • The hierarchy is: Database > Collection > Document > Field.
  • MongoDB is schema-flexible -- documents in the same collection can have different fields.
  • Key features include horizontal scaling (sharding), replication, aggregation, and indexing.
  • MongoDB excels at rapid prototyping, flexible schemas, and high-throughput workloads.
  • Avoid MongoDB for complex transactions, heavy JOINs, or strict relational data.
  • MongoDB Atlas offers a free cloud tier perfect for learning and small projects.
  • The Mongoose ODM adds schema enforcement and developer convenience to MongoDB in Node.js.

12. Explain-It Challenge

You are building a new social media app. Your CTO asks: "Should we use MongoDB or PostgreSQL for our main database?" Write a short analysis covering:

  1. What type of data the app will store (users, posts, comments, likes)
  2. Which data is relational and which is document-like
  3. Your recommendation and reasoning
  4. Whether you might use both (polyglot persistence)
