Episode 3 — NodeJS MongoDB Backend Architecture / 3.11 — Database Optimization
3.11.b — Indexing Best Practices
Creating indexes is only half the battle. Knowing how to analyze query performance with explain(), follow the ESR rule, and monitor index usage separates good database design from great database design.
< 3.11.a — MongoDB Indexing | 3.11.c — Aggregation Pipeline >
1. explain() — Analyzing Query Performance
The explain() method reveals exactly how MongoDB executes a query:
// Basic explain — shows the query plan
db.users.find({ email: "alice@example.com" }).explain();
// Execution stats — shows actual performance numbers
db.users.find({ email: "alice@example.com" }).explain("executionStats");
// All plans — shows all candidate plans MongoDB considered
db.users.find({ email: "alice@example.com" }).explain("allPlansExecution");
Mongoose explain()
// Mongoose — append .explain() to any query
const result = await User.find({ email: "alice@example.com" })
.explain("executionStats");
console.log(result[0].executionStats);
2. Reading explain() Output
const stats = db.orders.find({ status: "completed", total: { $gte: 100 } })
.sort({ createdAt: -1 })
.explain("executionStats")
.executionStats;
Key Fields to Check
| Field | Meaning | Good Value |
|---|---|---|
executionTimeMillis | Total query time in ms | As low as possible |
totalKeysExamined | Index entries scanned | Close to nReturned |
totalDocsExamined | Documents read from disk | Close to nReturned |
nReturned | Documents returned to client | Expected result count |
stage | Execution stage type | IXSCAN not COLLSCAN |
Ideal Ratios
totalKeysExamined / nReturned ≈ 1 (index is selective)
totalDocsExamined / nReturned ≈ 1 (no wasted reads)
totalDocsExamined = 0 (covered query — best case)
Example Analysis
// BAD — collection scan
{
"executionStats": {
"nReturned": 15,
"executionTimeMillis": 850,
"totalKeysExamined": 0, // No index used
"totalDocsExamined": 1000000, // Scanned ALL documents
"executionStages": {
"stage": "COLLSCAN" // Collection scan — worst case
}
}
}
// GOOD — index scan
{
"executionStats": {
"nReturned": 15,
"executionTimeMillis": 2,
"totalKeysExamined": 15, // Examined only 15 index entries
"totalDocsExamined": 15, // Read only 15 documents
"executionStages": {
"stage": "IXSCAN", // Index scan — fast
"indexName": "status_1_createdAt_-1"
}
}
}
3. IXSCAN vs COLLSCAN
COLLSCAN (Collection Scan) IXSCAN (Index Scan)
┌──────────────────────┐ ┌──────────────────────┐
│ Read doc 1 │ │ B-Tree lookup │
│ Read doc 2 │ │ → doc 45 │
│ Read doc 3 │ │ → doc 891 │
│ ... │ │ → doc 2340 │
│ Read doc 1,000,000 │ │ Done (3 docs) │
│ Return 3 matches │ │ │
│ Time: 850ms │ │ Time: 2ms │
└──────────────────────┘ └──────────────────────┘
Common execution stages:
| Stage | Description | Performance |
|---|---|---|
COLLSCAN | Scans every document | Slowest |
IXSCAN | Uses an index | Fast |
FETCH | Retrieves documents after index scan | Normal |
SORT | In-memory sort (no index for sort) | Can be slow |
SORT_KEY_GENERATOR | Generates sort keys | Normal |
PROJECTION_COVERED | Covered query — no FETCH needed | Fastest |
COUNT_SCAN | Counting using index | Fast |
Red flags in explain output:
COLLSCANon large collectionsSORTstage with largememUsage(in-memory sort)totalDocsExamined>>nReturned
4. The ESR Rule: Equality, Sort, Range
The ESR rule dictates the optimal field order for compound indexes:
┌───────────────────────────────────────────────────┐
│ E — Equality fields first (exact match) │
│ S — Sort fields second (ordering) │
│ R — Range fields last ($gte, $lte, $in) │
└───────────────────────────────────────────────────┘
Example
// Query: Find completed orders from the last 30 days, sorted by amount
db.orders.find({
status: "completed", // E — Equality
createdAt: { $gte: thirtyDaysAgo }, // R — Range
}).sort({ amount: -1 }); // S — Sort
// OPTIMAL index following ESR:
db.orders.createIndex({ status: 1, amount: -1, createdAt: -1 });
// E S R
// WHY this order?
// 1. Equality (status) narrows down to exact matches first — most selective
// 2. Sort (amount) — index is already sorted, no in-memory sort needed
// 3. Range (createdAt) — scans a range within the already-narrowed results
ESR Comparison
// Query: status = "active", sort by price, date >= lastWeek
// ❌ BAD: Range before Sort
db.products.createIndex({ status: 1, createdAt: -1, price: 1 });
// MongoDB can't use index for sort after a range scan
// Results in in-memory SORT stage
// ✅ GOOD: ESR order
db.products.createIndex({ status: 1, price: 1, createdAt: -1 });
// E: status (equality) → S: price (sort) → R: createdAt (range)
// No in-memory sort needed
5. Selectivity
Selectivity measures how many documents an index value matches. High selectivity = fewer matches = faster:
HIGH SELECTIVITY (Good for indexing) LOW SELECTIVITY (Bad for indexing)
┌──────────────────────────────┐ ┌──────────────────────────────┐
│ email: unique per user │ │ gender: only 2-3 values │
│ username: unique per user │ │ isActive: true/false │
│ orderId: unique per order │ │ country: ~200 values for │
│ phone: nearly unique │ │ millions of records │
└──────────────────────────────┘ └──────────────────────────────┘
Rule: Put high-selectivity fields first in compound indexes:
// BAD: Low selectivity first
db.users.createIndex({ isActive: 1, email: 1 });
// isActive: true matches 90% of users — barely narrows the search
// GOOD: High selectivity first
db.users.createIndex({ email: 1, isActive: 1 });
// email matches 1 user — immediately narrows to a single document
6. Monitoring Index Usage
$indexStats
// See how often each index is used
db.users.aggregate([{ $indexStats: {} }]);
// Output:
[
{
"name": "email_1",
"accesses": {
"ops": 145230, // Times this index was used
"since": ISODate(...) // Since server start or index creation
}
},
{
"name": "oldField_1",
"accesses": {
"ops": 0, // NEVER used — candidate for removal
"since": ISODate(...)
}
}
]
List All Indexes
// List all indexes on a collection
db.users.getIndexes();
// Mongoose
const indexes = await User.collection.getIndexes();
console.log(indexes);
7. Removing Unused Indexes
// Drop a specific index
db.users.dropIndex("oldField_1");
db.users.dropIndex({ oldField: 1 });
// Drop all indexes (except _id)
db.users.dropIndexes();
// Mongoose
await User.collection.dropIndex("oldField_1");
Before removing: Check $indexStats to confirm the index has zero or very low usage. Also check if the index is used by background jobs or batch processes that run infrequently.
8. Index Size Considerations
Indexes must fit in RAM for optimal performance:
// Check index sizes
db.users.stats().indexSizes;
// { "_id_": 2457600, "email_1": 1843200, "name_text": 5242880 }
// Total index size
db.users.totalIndexSize();
// 9543680 (bytes)
// Check total database stats
db.stats();
Guidelines:
- Total index size should fit in available RAM
- If indexes exceed RAM, MongoDB pages to disk (very slow)
- Text indexes are particularly large
- Monitor index size as data grows
9. hint() — Forcing Index Usage
Sometimes MongoDB's query planner picks the wrong index. Use hint() to force a specific index:
// Force a specific index
db.orders.find({ status: "active", customerId: id })
.hint({ status: 1, createdAt: -1 })
.explain("executionStats");
// Force collection scan (for testing/comparison)
db.orders.find({ status: "active" })
.hint({ $natural: 1 });
// Mongoose
const orders = await Order.find({ status: "active" })
.hint({ status: 1, createdAt: -1 });
Use hint() sparingly — MongoDB's query planner is usually correct. Use it primarily for testing or when you've verified the planner makes a suboptimal choice.
10. Atlas Performance Advisor
If using MongoDB Atlas (cloud):
- Performance Advisor automatically suggests indexes based on slow queries
- Real-Time Performance Panel shows current operations and their performance
- Query Profiler logs slow queries (similar to setting
profilelevel) - Access via Atlas dashboard: Cluster > Performance Advisor
Atlas Performance Advisor Suggestions:
┌─────────────────────────────────────────────────────────┐
│ Suggested Index: { customerId: 1, status: 1 } │
│ Impact: Improves 342 queries/hour │
│ Avg improvement: 450ms → 3ms │
│ [Create Index] [Dismiss] │
└─────────────────────────────────────────────────────────┘
11. Practical Workflow: Optimizing a Slow Query
Step 1: Identify slow queries
→ Enable profiling: db.setProfilingLevel(1, { slowms: 100 })
→ Check: db.system.profile.find().sort({ ts: -1 })
Step 2: Analyze with explain()
→ Run the query with .explain("executionStats")
→ Check for COLLSCAN, high totalDocsExamined
Step 3: Design the index
→ Apply ESR rule: Equality → Sort → Range
→ Put high-selectivity fields first
→ Consider covered queries
Step 4: Create the index
→ db.collection.createIndex({...}, { background: true })
→ In production, create indexes during low-traffic periods
Step 5: Verify improvement
→ Run explain() again — confirm IXSCAN
→ Compare totalDocsExamined and executionTimeMillis
Step 6: Monitor ongoing
→ Check $indexStats periodically
→ Remove unused indexes
Key Takeaways
- Always use
explain("executionStats")to verify indexes are working - COLLSCAN on large collections is always a problem — add appropriate indexes
- ESR rule — Equality, Sort, Range — determines optimal compound index field order
- High selectivity fields first — fields with many unique values narrow results fastest
- Monitor with
$indexStats— remove unused indexes to save RAM and write performance hint()forces a specific index but should be used sparingly- Indexes must fit in RAM — monitor sizes as data grows
Explain-It Challenge
You receive a bug report: "The orders page takes 15 seconds to load." The page shows a user's recent orders filtered by status, sorted by date. You run
explain()and see aCOLLSCANwithtotalDocsExamined: 5,000,000. Walk through your complete debugging and optimization process: what indexes you would create and why, how you would verify the improvement, and what ongoing monitoring you would set up.