Episode 3 — NodeJS MongoDB Backend Architecture / 3.11 — Database Optimization
3.11.a — MongoDB Indexing
Indexes are special data structures that store a small portion of the collection's data in an easy-to-traverse form, enabling MongoDB to find documents without scanning every document in a collection.
< README | 3.11.b — Indexing Best Practices >
1. What Are Indexes?
An index in MongoDB is a B-tree data structure that maps field values to the documents containing those values. Think of it like the index at the back of a textbook — instead of reading every page to find a topic, you look up the topic in the index and jump directly to the right page.
WITHOUT INDEX (Collection Scan) WITH INDEX (Index Scan)
┌─────────────────────────────┐ ┌──────────────────┐
│ Scan doc 1... no match │ │ B-Tree Index │
│ Scan doc 2... no match │ │ │
│ Scan doc 3... no match │ │ "alice" │
│ Scan doc 4... MATCH! │ │ / \ │
│ Scan doc 5... no match │ │ "bob" "dave" │
│ ... scan ALL N documents │ │ | | │
│ Time: O(n) │ │ doc4 doc7 │
└─────────────────────────────┘ │ Time: O(log n) │
└──────────────────┘
| Metric | Without Index | With Index |
|---|---|---|
| Scan type | COLLSCAN (collection scan) | IXSCAN (index scan) |
| Documents examined | All N documents | Only matching documents |
| Time complexity | O(n) | O(log n) |
| 1M documents | ~1000ms | ~1ms |
2. The Default _id Index
Every MongoDB collection automatically has an index on the _id field. This cannot be removed.
// This query always uses the _id index
db.users.findOne({ _id: ObjectId("507f1f77bcf86cd799439011") });
// IXSCAN on _id — fast regardless of collection size
3. Single-Field Indexes
The simplest index type — indexes a single field:
// MongoDB Shell
db.users.createIndex({ email: 1 }); // 1 = ascending order
db.users.createIndex({ age: -1 }); // -1 = descending order
// Mongoose schema definition
const userSchema = new mongoose.Schema({
email: { type: String, index: true }, // Simple index
username: { type: String, unique: true }, // Unique index
createdAt: { type: Date, index: true },
});
// Or define indexes separately
userSchema.index({ email: 1 });
// These queries use the { email: 1 } index:
db.users.find({ email: "alice@example.com" }); // Equality
db.users.find({ email: { $regex: /^alice/ } }); // Prefix regex
db.users.find({}).sort({ email: 1 }); // Sort ascending
db.users.find({}).sort({ email: -1 }); // Sort descending (index works both ways for single field)
4. Compound Indexes
Indexes on multiple fields — field order matters significantly:
// Create compound index
db.orders.createIndex({ status: 1, createdAt: -1 });
// This index supports these queries:
db.orders.find({ status: "completed" }); // Uses first field
db.orders.find({ status: "completed" }).sort({ createdAt: -1 }); // Uses both fields
db.orders.find({ status: "completed", createdAt: { $gte: date } }); // Uses both fields
// This query does NOT efficiently use this index:
db.orders.find({ createdAt: { $gte: date } }); // Skipping first field — may COLLSCAN
db.orders.find({}).sort({ createdAt: -1 }); // Cannot use — first field not filtered
Index Prefix Rule
A compound index supports queries on any prefix of the indexed fields:
// Index: { a: 1, b: 1, c: 1 }
// Supports queries on:
// { a } ✅ (prefix: a)
// { a, b } ✅ (prefix: a, b)
// { a, b, c } ✅ (full index)
// { b } ❌ (not a prefix)
// { b, c } ❌ (not a prefix)
// { a, c } ⚠️ (uses prefix {a}, then scans for c)
5. Text Indexes
Enable full-text search on string fields:
// Create text index
db.products.createIndex({ name: "text", description: "text" });
// Search for products
db.products.find({ $text: { $search: "wireless bluetooth headphones" } });
// Search with score — sort by relevance
db.products.find(
{ $text: { $search: "wireless headphones" } },
{ score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } });
// Phrase search — exact phrase match
db.products.find({ $text: { $search: '"noise cancelling"' } });
// Exclude terms
db.products.find({ $text: { $search: "headphones -wired" } });
Limitations:
- Only one text index per collection
- Text indexes can be large (indexes every word)
- For advanced search, consider Atlas Search or Elasticsearch
6. Wildcard Indexes
For flexible or dynamic schemas where field names are not known in advance:
// Index ALL fields in the document
db.collection.createIndex({ "$**": 1 });
// Index all fields under a specific path
db.products.createIndex({ "attributes.$**": 1 });
// Useful when documents have varying structures:
// { attributes: { color: "red", size: "L" } }
// { attributes: { weight: 500, material: "cotton" } }
// Wildcard index covers queries on ANY attribute field
Use sparingly — wildcard indexes can be very large and are not a substitute for targeted indexes.
7. Unique Indexes
Enforce uniqueness on a field:
// No two users can have the same email
db.users.createIndex({ email: 1 }, { unique: true });
// Compound unique — combination must be unique
db.ratings.createIndex({ userId: 1, productId: 1 }, { unique: true });
// Same user cannot rate the same product twice
// Mongoose
const userSchema = new mongoose.Schema({
email: { type: String, unique: true, required: true },
});
Important: Unique indexes allow ONE document with a null/missing field. If multiple documents can have a missing field, combine with sparse or partialFilterExpression.
8. Sparse, Partial, and TTL Indexes
Sparse Indexes
Only index documents that have the indexed field:
// Only index documents where 'phone' exists
db.users.createIndex({ phone: 1 }, { sparse: true });
// Documents without 'phone' field are not in the index
// Saves space when many documents lack the field
Partial Indexes
Index only documents matching a filter — more flexible than sparse:
// Only index active users
db.users.createIndex(
{ email: 1 },
{ partialFilterExpression: { isActive: true } }
);
// Only index orders over $100
db.orders.createIndex(
{ customerId: 1, createdAt: -1 },
{ partialFilterExpression: { total: { $gte: 100 } } }
);
// Smaller index, faster updates, only covers queries that match the filter
TTL (Time-To-Live) Indexes
Automatically delete documents after a specified time:
// Delete sessions after 24 hours
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 86400 });
// Delete at a specific time (set expiresAt in the document)
db.tokens.createIndex({ expiresAt: 1 }, { expireAfterSeconds: 0 });
// Document: { token: "abc", expiresAt: ISODate("2025-01-01T00:00:00Z") }
// Deleted when current time passes expiresAt
9. Index Direction: 1 vs -1
For single-field indexes, direction does not matter — MongoDB can traverse in either direction.
For compound indexes, direction matters for sort operations:
// Index: { price: 1, rating: -1 }
// This means: sort by price ascending, then rating descending
// These sorts are supported:
db.products.find({}).sort({ price: 1, rating: -1 }); // ✅ Matches index
db.products.find({}).sort({ price: -1, rating: 1 }); // ✅ Reverse of index (OK)
// These sorts are NOT supported:
db.products.find({}).sort({ price: 1, rating: 1 }); // ❌ Mismatch
db.products.find({}).sort({ price: -1, rating: -1 }); // ❌ Mismatch
10. Covered Queries
A query is "covered" when the index contains ALL fields the query needs — MongoDB never reads the actual documents:
// Index: { email: 1, name: 1 }
// Covered query — only needs fields in the index
db.users.find(
{ email: "alice@example.com" }, // Filter: in index
{ email: 1, name: 1, _id: 0 } // Projection: in index, _id excluded
);
// totalDocsExamined: 0 (never reads documents — fastest possible)
// NOT covered — needs 'age' which is not in the index
db.users.find(
{ email: "alice@example.com" },
{ email: 1, name: 1, age: 1, _id: 0 } // 'age' requires document read
);
11. Index Intersection
MongoDB can sometimes combine multiple single-field indexes:
// Indexes: { status: 1 } and { createdAt: 1 }
// MongoDB MAY use both for:
db.orders.find({ status: "active", createdAt: { $gte: date } });
// But a compound index { status: 1, createdAt: 1 } is almost always faster
In practice: Do not rely on index intersection. Create compound indexes for common query patterns.
12. When NOT to Index
| Scenario | Reason |
|---|---|
| Small collections (< 1000 docs) | Collection scan is fast enough |
| Write-heavy, read-light collections | Indexes slow down every write |
| Fields with low selectivity | { gender: 1 } — only 2-3 values, index barely helps |
| Fields rarely used in queries | Unused index wastes RAM and slows writes |
| Rapidly changing fields | Index constantly rebuilt |
13. Mongoose Index Definition
const productSchema = new mongoose.Schema({
name: { type: String, required: true },
price: { type: Number, required: true },
category: { type: String, required: true },
tags: [String],
description: String,
seller: { type: mongoose.Schema.Types.ObjectId, ref: 'User' },
createdAt: { type: Date, default: Date.now },
});
// Single-field indexes
productSchema.index({ price: 1 });
productSchema.index({ category: 1 });
// Compound index
productSchema.index({ category: 1, price: -1 });
// Text index
productSchema.index({ name: 'text', description: 'text' });
// Unique compound index
productSchema.index({ seller: 1, name: 1 }, { unique: true });
// Partial index
productSchema.index(
{ price: 1 },
{ partialFilterExpression: { price: { $exists: true } } }
);
// TTL index
productSchema.index({ createdAt: 1 }, { expireAfterSeconds: 2592000 }); // 30 days
const Product = mongoose.model('Product', productSchema);
Auto-Index Behavior
// By default, Mongoose calls createIndex for each index in the schema
// when the model is first used. In production, disable auto-indexing:
mongoose.connect(uri, { autoIndex: false });
// Or per-schema:
const schema = new mongoose.Schema({...}, { autoIndex: false });
// Then create indexes manually during deployment:
// await Product.createIndexes();
Key Takeaways
- Indexes trade write speed for read speed — every insert/update must update all indexes
- Compound index field order matters — queries must use a prefix of the index fields
- Use unique indexes to enforce data constraints at the database level
- TTL indexes auto-delete expired documents (sessions, tokens, logs)
- Covered queries are the fastest — index contains all needed fields
- Do not over-index — unused indexes waste RAM and slow writes
- Disable auto-indexing in production — create indexes during deployment
Explain-It Challenge
You are designing the database for an e-commerce platform with 10 million products. Users can search products by name, filter by category and price range, and sort by price or date. Each product belongs to a seller, and sellers can view their own products. Design a complete indexing strategy: list every index you would create, explain why, and identify which queries each index supports. Also explain which indexes you deliberately chose NOT to create and why.