Episode 3 — NodeJS MongoDB Backend Architecture / 3.12 — Logging and Monitoring
3.12.a — Why Logging Matters
console.log() is not a logging strategy. Production applications need structured, leveled, persistent logs that help you understand what happened, when it happened, and why it happened — especially when things go wrong at 3 AM.
< README | 3.12.b — Logging Libraries >
1. The Problem with console.log()
Every developer starts with console.log(). It works in development, but it fails in production:
// What most beginners do:
console.log("User logged in");
console.log("Error:", err);
console.log("Order placed:", orderId);
// Problems:
// 1. No log levels — cannot filter errors from info messages
// 2. No timestamps — when did this happen?
// 3. No structure — cannot search or parse programmatically
// 4. No persistence — lost when the process restarts
// 5. No context — which user? which request? which server?
// 6. Cannot be turned off in production without code changes
console.log() in production:
┌───────────────────────────────────────────────────────┐
│ User logged in                                        │
│ Error: TypeError: Cannot read property 'name' of null │
│ Order placed: 12345                                   │
│ User logged in                                        │
│ User logged in                                        │
│ Error: ECONNREFUSED                                   │
│ ... 100,000 more unstructured lines ...               │
│                                                       │
│ ❌ Good luck finding that one critical error          │
└───────────────────────────────────────────────────────┘
2. What Is Proper Logging?
Proper logging means recording application events in a structured, leveled, and persistent way:
{
  "level": "error",
  "message": "Database connection failed",
  "timestamp": "2025-06-15T14:23:45.123Z",
  "service": "order-service",
  "requestId": "req-abc-123",
  "userId": "user-456",
  "error": {
    "name": "MongoNetworkError",
    "message": "connect ECONNREFUSED 127.0.0.1:27017",
    "stack": "MongoNetworkError: connect ECONNREFUSED..."
  }
}
This log entry tells you:
- What happened (database connection failed)
- When it happened (exact timestamp)
- Where it happened (order-service)
- Who was affected (user-456)
- Which request triggered it (req-abc-123)
- Why it happened (connection refused — MongoDB is down)
3. Log Levels
Log levels categorize messages by severity. This allows you to filter noise and focus on what matters:
| Level | Priority | When to Use | Example |
|---|---|---|---|
| error | 0 (highest) | Something broke and needs fixing | DB connection lost, unhandled exception |
| warn | 1 | Potential problem, not critical yet | Deprecated API call, slow query, retry attempt |
| info | 2 | Normal operations worth recording | User login, order placed, server started |
| http | 3 | HTTP request/response details | GET /api/users 200 45ms |
| verbose | 4 | More detail than info | Config loaded, cache refreshed |
| debug | 5 | Developer debugging information | Variable values, function entry/exit |
| silly | 6 (lowest) | Extremely verbose tracing | Every loop iteration, raw data dumps |
How Log Levels Work
Setting a log level means "log this level and everything more severe" (a lower priority number means higher severity):
Level set to "warn":
  error ✅ logged
  warn  ✅ logged
  info  ❌ not logged
  debug ❌ not logged
Level set to "info":
  error ✅ logged
  warn  ✅ logged
  info  ✅ logged
  debug ❌ not logged
Level set to "debug":
  error ✅ logged
  warn  ✅ logged
  info  ✅ logged
  debug ✅ logged
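Internally, this filtering is just a numeric comparison against the priority table above. A minimal sketch of the mechanism (`LEVELS` and `createLogger` are illustrative names, not a real library API):

```javascript
// Priority map: lower number = more severe (matches the table above)
const LEVELS = { error: 0, warn: 1, info: 2, http: 3, verbose: 4, debug: 5, silly: 6 };

function createLogger(level) {
  const threshold = LEVELS[level];
  return {
    log(msgLevel, message) {
      // Emit only if the message is at least as severe as the configured level
      if (LEVELS[msgLevel] <= threshold) {
        return `[${msgLevel.toUpperCase()}] ${message}`;
      }
      return null; // filtered out
    },
  };
}

const logger = createLogger("warn");
console.log(logger.log("error", "DB down"));   // "[ERROR] DB down"
console.log(logger.log("info", "User login")); // null — filtered
```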
Environment-Based Levels
// Set log level based on environment
const LOG_LEVEL = process.env.LOG_LEVEL || (
  process.env.NODE_ENV === "production" ? "warn" :
  process.env.NODE_ENV === "test" ? "error" :
  "debug" // development
);
4. Structured vs Unstructured Logging
Unstructured Logging (bad)
[2025-06-15 14:23:45] ERROR: Database connection failed for user john@example.com on order #12345
- Looks readable to humans
- Impossible to parse programmatically
- Cannot search by specific fields
- Different developers format messages differently
Structured Logging (good)
{
  "level": "error",
  "message": "Database connection failed",
  "timestamp": "2025-06-15T14:23:45.123Z",
  "userId": "john@example.com",
  "orderId": "12345",
  "error": "ECONNREFUSED"
}
- Machine-parseable (JSON)
- Searchable by any field
- Consistent format across the team
- Works with log aggregation tools (ELK, Datadog, CloudWatch)
Why JSON?
Structured JSON logs enable:
┌──────────────────────────────────────────────┐
│ 1. Search: find all errors for user X        │
│ 2. Filter: show only "error" level logs      │
│ 3. Aggregate: count errors per hour          │
│ 4. Alert: trigger alarm when error rate > 5% │
│ 5. Dashboard: visualize request patterns     │
└──────────────────────────────────────────────┘
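As a concrete sketch of points 1 and 3, here is how a few JSON log lines can be searched and aggregated with plain JavaScript — the `lines` array stands in for the contents of a log file, one JSON object per line:

```javascript
// Sample log file contents (one JSON entry per line)
const lines = [
  '{"level":"error","message":"DB connection failed","timestamp":"2025-06-15T14:23:45.123Z","userId":"user-456"}',
  '{"level":"info","message":"User login","timestamp":"2025-06-15T14:24:01.000Z","userId":"user-789"}',
  '{"level":"error","message":"Payment declined","timestamp":"2025-06-15T15:02:10.500Z","userId":"user-456"}',
];

const entries = lines.map((l) => JSON.parse(l));

// Search: all errors affecting a specific user
const userErrors = entries.filter((e) => e.level === "error" && e.userId === "user-456");

// Aggregate: count errors per hour (timestamp truncated to the hour)
const errorsPerHour = {};
for (const e of entries) {
  if (e.level !== "error") continue;
  const hour = e.timestamp.slice(0, 13); // e.g. "2025-06-15T14"
  errorsPerHour[hour] = (errorsPerHour[hour] || 0) + 1;
}

console.log(userErrors.length); // 2
console.log(errorsPerHour);     // { '2025-06-15T14': 1, '2025-06-15T15': 1 }
```

With unstructured text logs, the same questions would require fragile regex parsing; with JSON, they are a `filter` and a loop.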
5. What to Log
Always Log
| Event | Why | Example |
|---|---|---|
| Application startup/shutdown | Know when services restart | "Server started on port 3000" |
| Authentication events | Security audit trail | "User login successful" / "Login failed" |
| Authorization failures | Detect unauthorized access | "Access denied to /admin for user X" |
| API errors (4xx, 5xx) | Debug and monitor errors | "POST /api/orders 500 Internal Server Error" |
| Database errors | Detect connection/query issues | "MongoDB connection lost" |
| External API calls | Track third-party dependencies | "Stripe payment API responded 200 in 340ms" |
| Business events | Audit trail | "Order #123 placed by user X" |
| Performance warnings | Detect degradation | "Query took 5200ms (threshold: 1000ms)" |
Never Log
| Data | Why | Risk |
|---|---|---|
| Passwords | Security | Credential exposure |
| Credit card numbers | PCI compliance | Financial fraud |
| Social security numbers | Privacy laws | Identity theft |
| API keys / tokens | Security | Unauthorized access |
| Full request bodies (with sensitive data) | Privacy | Data leak |
| Personal health information | HIPAA | Legal violation |
// BAD — logging sensitive data
logger.info("User login", { email: user.email, password: req.body.password });
// GOOD — log only what is needed
logger.info("User login successful", { userId: user._id, email: user.email });
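One way to guard against accidental leaks is to redact known-sensitive fields before metadata ever reaches the logger. A minimal sketch, assuming flat metadata objects (`SENSITIVE_KEYS` and `redact` are illustrative names, not a library API):

```javascript
// Field names that must never appear in logs
const SENSITIVE_KEYS = ["password", "token", "apiKey", "creditCard", "ssn"];

function redact(meta) {
  const safe = {};
  for (const [key, value] of Object.entries(meta)) {
    // Replace sensitive values, pass everything else through unchanged
    safe[key] = SENSITIVE_KEYS.includes(key) ? "[REDACTED]" : value;
  }
  return safe;
}

// Even if a sensitive field slips into the metadata, it never reaches the log:
console.log(redact({ userId: "user-456", password: "hunter2" }));
// { userId: 'user-456', password: '[REDACTED]' }
```

Production loggers usually wire a step like this into a format pipeline so every call site is covered automatically, rather than relying on developers to remember it.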
6. What NOT to Log (Performance)
Excessive logging hurts performance:
// BAD — logging inside tight loops
for (const item of items) {
  logger.debug(`Processing item ${item.id}`); // 10,000 log writes per request
  processItem(item);
}
// GOOD — log summary
logger.info(`Processing ${items.length} items`);
items.forEach(processItem);
logger.info(`Finished processing ${items.length} items`);
7. Logging by Environment
| Environment | Log Level | Output | Format |
|---|---|---|---|
| Development | debug | Console (terminal) | Colorized, human-readable |
| Testing | error | Console or silent | Minimal |
| Staging | info | File + console | JSON (structured) |
| Production | warn or info | File + log aggregator | JSON (structured) |
// Development: colorful, readable console output
// ✅ 2025-06-15 14:23:45 [INFO]: Server started on port 3000
// ⚠️ 2025-06-15 14:23:46 [WARN]: Slow query detected (5200ms)
// ❌ 2025-06-15 14:23:47 [ERROR]: Database connection failed
// Production: JSON for log aggregation tools
// {"level":"info","message":"Server started","port":3000,"timestamp":"2025-06-15T14:23:45.123Z"}
// {"level":"error","message":"Database connection failed","error":"ECONNREFUSED","timestamp":"..."}
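The two outputs above can come from the same log entry, rendered differently per environment. A minimal sketch of that idea (`formatEntry` is an illustrative helper, not a library API — real libraries provide configurable formatters for this):

```javascript
// Render one log entry either for humans (dev) or for machines (prod)
function formatEntry(entry, env) {
  if (env === "production") {
    // Machine-parseable JSON for log aggregators
    return JSON.stringify(entry);
  }
  // Human-readable line for the terminal
  return `${entry.timestamp} [${entry.level.toUpperCase()}]: ${entry.message}`;
}

const entry = { level: "info", message: "Server started", timestamp: "2025-06-15T14:23:45.123Z" };

console.log(formatEntry(entry, "development"));
// 2025-06-15T14:23:45.123Z [INFO]: Server started
console.log(formatEntry(entry, "production"));
// {"level":"info","message":"Server started","timestamp":"2025-06-15T14:23:45.123Z"}
```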
8. Log Rotation
Log files grow continuously and can fill up disk space. Log rotation archives old logs and creates new files:
Without rotation:              With rotation:
┌─────────────────────┐        ┌─────────────────────┐
│ app.log (50GB)      │        │ app-2025-06-15.log  │ (today)
│ ... still growing   │        │ app-2025-06-14.log  │ (yesterday)
│                     │        │ app-2025-06-13.log  │ (2 days ago)
│ ❌ Disk full!       │        │ ... auto-deleted    │ (older than 14 days)
└─────────────────────┘        └─────────────────────┘
Rotation strategies:
- By date — new file each day (most common)
- By size — new file when current exceeds a size limit
- By count — keep only the last N log files
- Combined — rotate daily, delete after 14 days, max 100MB per file
9. Logging Context and Metadata
Always include context that helps with debugging:
// BAD — no context
logger.error("Payment failed");
// GOOD — rich context
logger.error("Payment failed", {
  userId: req.user._id,
  orderId: order._id,
  amount: order.total,
  paymentMethod: "stripe",
  stripeError: err.code,
  requestId: req.id,
});
Request-Scoped Context
Attach metadata to every log within a request lifecycle:
// Every log in this request includes requestId and userId
// (Implementation details in 3.12.c)
logger.info("Order validation passed", { requestId, userId, orderId });
logger.info("Payment processed", { requestId, userId, orderId, amount });
logger.info("Order confirmation sent", { requestId, userId, orderId });
// All three logs can be correlated by requestId
10. Logging vs Monitoring
| Aspect | Logging | Monitoring |
|---|---|---|
| Purpose | Record what happened | Watch for problems in real-time |
| When | After the fact (forensic) | Real-time (proactive) |
| Format | Text/JSON log entries | Metrics, dashboards, alerts |
| Tools | Winston, Pino, ELK | Prometheus, Grafana, Datadog |
| Example | "DB query took 5200ms" | Alert: "p99 latency > 2000ms" |
They work together: monitoring detects the problem, logs help you understand why it happened.
Key Takeaways
- console.log() is not a logging strategy — use a proper logging library with levels, structure, and persistence
- Log levels filter noise — set debug in development, warn or info in production
- Structured (JSON) logs are searchable — essential for log aggregation and alerting
- Log what matters, skip what is sensitive — never log passwords, tokens, or PII
- Log rotation prevents disk issues — rotate daily, limit file size, auto-delete old logs
- Include context in every log — requestId, userId, and relevant business data
- Logging and monitoring are complementary — logs record events, monitoring watches metrics
Explain-It Challenge
Your Node.js API serves 10,000 requests per minute. A customer reports that their order failed but received no error message. Using only log data, explain how you would trace the request from entry to failure. What information would you need in the logs? How would structured logging help? Design a logging schema for an order placement flow that includes all the context needed for debugging.