Episode 3 — NodeJS MongoDB Backend Architecture / 3.12 — Logging and Monitoring
3.12 — Interview Questions: Logging and Monitoring
These 10 questions cover the most commonly asked logging and monitoring topics in Node.js technical interviews — from fundamentals to production architecture.
Quick-Fire Table
| # | Question | Level | Key Topic |
|---|---|---|---|
| 1 | Why is console.log() bad for production? | Beginner | Logging fundamentals |
| 2 | What are log levels and how do they work? | Beginner | Log levels |
| 3 | What is structured logging? | Beginner | Log format |
| 4 | Compare Winston, Pino, and Morgan | Intermediate | Libraries |
| 5 | How do you set up log rotation? | Intermediate | Operations |
| 6 | What is a request ID and why is it important? | Intermediate | Correlation |
| 7 | How do you handle unhandled promise rejections? | Intermediate | Error handling |
| 8 | What is the ELK stack? | Intermediate | Architecture |
| 9 | Design a logging strategy for a production API | Advanced | System design |
| 10 | How do you debug a production error using only logs? | Advanced | Debugging |
Beginner Level
Q1. Why is console.log() insufficient for production applications?
Model Answer:
console.log() has several critical limitations that make it unsuitable for production:
- No log levels — cannot distinguish errors from info messages, cannot filter by severity
- No timestamps — impossible to know when events occurred
- No structure — free-form text cannot be parsed or searched programmatically
- No persistence — output goes to stdout/stderr and is lost when the process restarts
- No context — does not include request ID, user ID, or service metadata
- No configurability — cannot change verbosity without modifying code
- No rotation — cannot manage output size or archive old logs
Instead, use a logging library like Winston or Pino that provides leveled, structured, persistent logs with configurable transports.
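By contrast, the core of what such a library provides can be sketched in a few lines of plain Node.js. This is an illustration only, not a real library API (names like createLogger are made up for the example):

```javascript
// Minimal sketch of what a logging library adds over console.log():
// levels, timestamps, and structured JSON output.
const LEVELS = { error: 0, warn: 1, info: 2, debug: 3 };

function createLogger({ level = "info", write = console.log } = {}) {
  const log = (lvl, message, meta = {}) => {
    if (LEVELS[lvl] > LEVELS[level]) return; // filter by severity
    write(JSON.stringify({
      level: lvl,
      message,
      timestamp: new Date().toISOString(),
      ...meta, // request ID, user ID, service metadata, etc.
    }));
  };
  return {
    error: (m, meta) => log("error", m, meta),
    warn: (m, meta) => log("warn", m, meta),
    info: (m, meta) => log("info", m, meta),
    debug: (m, meta) => log("debug", m, meta),
  };
}

const logger = createLogger({ level: "info" });
logger.error("Payment failed", { userId: "user-456" });
logger.debug("Suppressed: below the configured level");
```

Real libraries add transports, serializers, and performance optimizations on top of this core idea.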
Q2. What are log levels and how do they work?
Model Answer:
Log levels categorize messages by severity, from most critical to least:
| Level | Severity | Use |
|---|---|---|
| error | Highest | Something broke and needs immediate attention |
| warn | High | Potential problem, not critical yet |
| info | Medium | Normal operations worth recording |
| http | Lower | HTTP request/response details |
| debug | Low | Detailed debugging information |
| silly | Lowest | Extremely verbose tracing |
Setting a log level means "log this level and everything more severe." For example, setting level: "warn" logs error and warn messages, but suppresses info, debug, and below.
Best practice:
- Production: warn or info — minimal noise, only important events
- Development: debug — full detail for debugging
- Testing: error or silent — reduce output clutter
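The threshold rule above ("log this level and everything more severe") can be expressed directly. The priorities follow the npm/Winston ordering, where a lower number means more severe:

```javascript
// Numeric priorities behind "log this level and everything more severe"
// (npm/Winston ordering: lower number = more severe).
const PRIORITY = { error: 0, warn: 1, info: 2, http: 3, debug: 4, silly: 5 };

// Returns true if a message at `messageLevel` passes a logger
// configured at `configuredLevel`.
function shouldLog(configuredLevel, messageLevel) {
  return PRIORITY[messageLevel] <= PRIORITY[configuredLevel];
}

shouldLog("warn", "error"); // true  (more severe than the threshold)
shouldLog("warn", "info");  // false (suppressed)
```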
Q3. What is structured logging and why is it important?
Model Answer:
Structured logging means writing log entries in a consistent, machine-parseable format — typically JSON:
```json
{
  "level": "error",
  "message": "Payment failed",
  "timestamp": "2025-06-15T14:23:45.123Z",
  "userId": "user-456",
  "orderId": "order-789",
  "error": "Stripe timeout"
}
```
Compared to unstructured logging: "ERROR: Payment failed for user user-456 on order order-789 — Stripe timeout"
Structured logging is important because:
- Searchable — query by any field (find all errors for user-456)
- Parseable — log aggregation tools (ELK, Datadog, CloudWatch) can index and analyze fields
- Consistent — enforces a standard format across the team and services
- Alertable — set up alerts on specific field values (error rate, affected users)
- Analyzable — compute metrics like errors per hour, average response time
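A toy illustration of the "searchable" point: once each entry is JSON, logs become queryable data. This is a tiny in-memory stand-in for what Elasticsearch or Datadog do at scale:

```javascript
// Structured log lines, as they might appear in a log file.
const rawLogs = [
  '{"level":"error","message":"Payment failed","userId":"user-456","orderId":"order-789"}',
  '{"level":"info","message":"Login","userId":"user-123"}',
  '{"level":"error","message":"Timeout","userId":"user-456"}',
];

// Parse each line and filter by arbitrary fields.
function query(lines, predicate) {
  return lines.map((line) => JSON.parse(line)).filter(predicate);
}

// "Find all errors for user-456" — unreliable on free-form text,
// trivial on structured entries.
const hits = query(rawLogs, (e) => e.level === "error" && e.userId === "user-456");
// hits.length === 2
```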
Intermediate Level
Q4. Compare Winston, Pino, and Morgan. When would you use each?
Model Answer:
| Feature | Winston | Pino | Morgan |
|---|---|---|---|
| Type | General-purpose logger | General-purpose logger | HTTP request logger |
| Speed | Moderate | Fastest (~5x Winston) | N/A (middleware) |
| Formats | JSON, simple, custom | JSON (pino-pretty for dev) | Predefined strings |
| Transports | Built-in (file, console, HTTP) | Separate process | Stream |
| Best for | Most applications | High-throughput APIs | HTTP access logs |
When to use each:
- Winston: General-purpose logging for most Node.js applications. Largest ecosystem, flexible formats and transports.
- Pino: When performance matters (APIs handling >10K req/s). Minimal overhead due to asynchronous design.
- Morgan: Specifically for HTTP request logging in Express. Always use alongside (not instead of) Winston or Pino.
Common combination: Winston + Morgan — Morgan for HTTP access logs piped through Winston's transports.
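The glue in that combination is a stream adapter: Morgan accepts any object with a write(message) method via its stream option. A dependency-free sketch of the adapter, where the logger object is a stand-in for a real Winston instance with the default npm levels (which include http):

```javascript
// Morgan's `stream` option accepts any object with a write(message) method.
// This adapter forwards each access-log line to a leveled logger, so HTTP
// access logs flow through the same transports as application logs.
function makeMorganStream(logger) {
  return {
    write: (message) => logger.http(message.trim()), // strip Morgan's trailing newline
  };
}

// Stand-in for a Winston logger (real code: winston.createLogger(...)).
const logger = {
  http: (msg) => console.log(JSON.stringify({ level: "http", message: msg })),
};

// With the real libraries this would be:
//   app.use(morgan("combined", { stream: makeMorganStream(logger) }));
makeMorganStream(logger).write("GET /api/users 200 12ms\n");
```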
Q5. How do you set up log rotation and why is it necessary?
Model Answer:
Log rotation creates new log files periodically and manages old ones to prevent disk space exhaustion. Without rotation, a single log file grows indefinitely and can fill the entire disk.
Using winston-daily-rotate-file:
```javascript
const winston = require("winston");
require("winston-daily-rotate-file");

const transport = new winston.transports.DailyRotateFile({
  filename: "logs/app-%DATE%.log",
  datePattern: "YYYY-MM-DD",
  maxSize: "20m",        // Max 20MB per file
  maxFiles: "14d",       // Keep for 14 days
  zippedArchive: true,   // Compress old files
});
```
This creates files like app-2025-06-15.log, compresses them to .gz after the day ends, and deletes files older than 14 days.
Rotation strategies:
- By date: New file each day (most common)
- By size: New file when current exceeds a limit
- Combined: Daily rotation with per-file size limits and age-based deletion
Q6. What is a request ID and why is it important?
Model Answer:
A request ID is a unique identifier (typically a UUID) assigned to each incoming HTTP request. It is included in every log entry generated during that request's lifecycle.
```javascript
// Middleware to assign request ID
const { v4: uuidv4 } = require("uuid");

app.use((req, res, next) => {
  req.id = req.headers["x-request-id"] || uuidv4();
  res.setHeader("x-request-id", req.id);
  next();
});
```
Why it matters:
- Traceability: In a system handling thousands of concurrent requests, you can find all logs for a single request by searching for its ID
- Debugging: When a user reports an issue, the request ID (from the response header or error page) lets you trace the exact execution path
- Distributed tracing: In microservices, the request ID propagates across services, enabling end-to-end tracing
- Correlation: Ties together HTTP logs, application logs, database queries, and external API calls for one request
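The correlation point is usually implemented with a request-scoped "child" logger, mirroring the child-logger pattern from Winston and Pino. A plain-Node sketch, so every entry for a request carries its ID without each call site remembering to add it:

```javascript
// A request-scoped "child" logger: bindings (like requestId) are merged
// into every entry automatically. Illustrative version of logger.child().
function childLogger(base, bindings) {
  return (level, message, meta = {}) =>
    base({ level, message, ...bindings, ...meta });
}

const entries = [];
const baseLog = (entry) => entries.push(entry);

// In middleware this would be: const reqLog = childLogger(baseLog, { requestId: req.id });
const reqLog = childLogger(baseLog, { requestId: "req-abc-123" });
reqLog("info", "Order lookup started", { orderId: "order-789" });
reqLog("error", "Payment failed");
// Both entries now share requestId "req-abc-123"
```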
Q7. How do you handle unhandled promise rejections in Node.js?
Model Answer:
Unhandled promise rejections occur when a Promise is rejected but no .catch() or try/catch handles the error. Since Node.js 15, an unhandled rejection crashes the process by default.
```javascript
process.on("unhandledRejection", (reason, promise) => {
  logger.error("Unhandled Promise Rejection", {
    reason: reason instanceof Error ? reason.message : String(reason),
    stack: reason instanceof Error ? reason.stack : undefined,
  });
  // Optionally exit (recommended in production)
  // process.exit(1);
});

process.on("uncaughtException", (err) => {
  logger.error("Uncaught Exception", {
    error: err.message,
    stack: err.stack,
  });
  // MUST exit — process state is unreliable
  process.exit(1);
});
```
Key distinction:
- Unhandled rejection: An uncaught rejected Promise. The process may continue, but it is in an unknown state.
- Uncaught exception: A thrown error with no handler. The process MUST exit because its state is unreliable.
Both should be logged with full context and ideally sent to an external monitoring service (Sentry) before exit.
Q8. What is the ELK stack?
Model Answer:
ELK stands for Elasticsearch, Logstash, and Kibana — a suite for centralized log management, commonly extended with Filebeat as a log shipper:
| Component | Role |
|---|---|
| Elasticsearch | Stores and indexes log data for fast full-text search |
| Logstash | Ingests logs, transforms/parses them, and forwards to Elasticsearch |
| Kibana | Web UI for searching, filtering, and visualizing logs |
| Filebeat | Lightweight agent that ships log files to Logstash/Elasticsearch |
Workflow:
- Application writes structured JSON logs to files (via Winston/Pino)
- Filebeat monitors log files and ships entries to Logstash/Elasticsearch
- Elasticsearch indexes the logs for fast searching
- Kibana provides dashboards, search, and alerting
Cloud alternatives: AWS CloudWatch Logs, GCP Cloud Logging, Datadog, Better Stack (Logtail).
Advanced Level
Q9. Design a complete logging strategy for a production API handling 50,000 requests per minute.
Model Answer:
Library choice: Pino (for performance at this scale) + Morgan (HTTP access logs piped through Pino).
Log levels:
- Production: info (captures operations without debug noise)
- Error alerting threshold: immediate notification for error level
Configuration:
- Structured JSON output for all logs
- Daily log rotation with 14-day retention, compressed archives
- Request ID middleware for correlation
- Child loggers per request with automatic context (requestId, userId)
Transports:
- Local: Daily rotating files (combined + error-only)
- External: Ship to centralized logging (ELK/Datadog/CloudWatch) via Filebeat or direct HTTP transport
Error handling:
- Global Express error middleware with full context logging
- unhandledRejection and uncaughtException handlers
- Sentry integration for real-time error alerting and tracking
What to log:
- All HTTP requests (method, URL, status, response time)
- Authentication events (login, logout, failed attempts)
- Business events (orders, payments, critical operations)
- Errors with full stack traces and request context
- External API calls (duration, status, errors)
What NOT to log:
- Passwords, tokens, credit card numbers, PII
- Health check endpoints
- Static asset requests
- Debug-level details (too noisy at this scale)
Alerting:
- Error rate exceeds 5% of requests
- Response time p99 exceeds 2000ms
- Unhandled rejection or uncaught exception occurs
- External service failure rate exceeds threshold
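A hypothetical sketch of evaluating the first two alert rules against a window of request metrics. In practice this evaluation runs in the monitoring platform (Datadog, CloudWatch), not inside the application; the function and field names here are illustrative:

```javascript
// Nearest-rank p99 over a window of request durations (ms).
function p99(durationsMs) {
  const sorted = [...durationsMs].sort((a, b) => a - b);
  return sorted[Math.min(sorted.length - 1, Math.floor(sorted.length * 0.99))];
}

// Evaluate the error-rate and latency alert rules for one window.
function checkAlerts({ total, errors, durationsMs }) {
  const alerts = [];
  if (errors / total > 0.05) alerts.push("error rate > 5%");
  if (p99(durationsMs) > 2000) alerts.push("p99 latency > 2000ms");
  return alerts;
}

checkAlerts({ total: 100, errors: 7, durationsMs: [120, 150, 90, 3000] });
// → ["error rate > 5%", "p99 latency > 2000ms"]
```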
Q10. A user reports that their order failed 3 hours ago. Walk through how you would debug this using only logs.
Model Answer:
Step 1: Get identifiers
- Ask the user for their user ID, email, or order ID
- If available, get the request ID from the error response or browser network tab
Step 2: Search error logs
Search: level="error" AND userId="user-456" AND time > 3h ago
- Look for the specific error message and stack trace
- Identify the exact timestamp and request ID
Step 3: Trace the full request
Search: requestId="req-abc-123"
- Follow the request from entry to failure
- See what steps succeeded before the error
Step 4: Identify root cause
- Read the error message and stack trace
- Check if it is an operational error (expected) or programming error (bug)
- Look for patterns (did this error happen to other users too?)
Step 5: Check context
- Was the database available? (check for connection errors around that time)
- Was an external service down? (check API call logs)
- Was there a deployment around that time? (check deployment logs)
Step 6: Verify and fix
- Reproduce in staging if possible
- Deploy fix with additional logging if root cause is unclear
- Set up an alert to detect this error class in the future
This process is only possible with structured logging, request ID correlation, and proper error context in the logs.