3.10 — Interview Questions: Input Validation

Common interview questions about input validation, security, and error handling in Node.js/Express applications.

Beginner Level

Q1: Why is server-side validation necessary even if the frontend already validates input?

A: Client-side validation can be trivially bypassed using tools like curl, Postman, or browser DevTools. Any attacker can send requests directly to the API. Server-side validation is the only layer that cannot be circumvented by the end user, making it mandatory for security and data integrity.

Q2: What is the difference between validation and sanitization?

A: Validation checks whether data meets certain rules and accepts or rejects it (e.g., "is this a valid email?"). Sanitization transforms data to make it safe or consistent (e.g., trimming whitespace, escaping HTML characters, converting to lowercase). Both should be applied. A common order is to normalize first (trim, lowercase), validate the normalized value, and escape for the output context (such as HTML) only when rendering.
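
A minimal sketch of the distinction in plain JavaScript. The helper names (isValidEmail, normalizeEmail, escapeHtml) and the loose email pattern are assumptions made for illustration, not part of any library.

```javascript
// Validation: answers yes or no and leaves the value untouched.
// The pattern is intentionally loose (an assumption of this sketch), not a full RFC check.
function isValidEmail(value) {
  return typeof value === "string" && /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value);
}

// Sanitization: returns a transformed value and never rejects.
function normalizeEmail(value) {
  return String(value).trim().toLowerCase();
}

// Sanitization for an HTML output context.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(isValidEmail("alice@example.com"));      // true
console.log(isValidEmail("not-an-email"));           // false
console.log(normalizeEmail("  Alice@Example.COM ")); // alice@example.com
console.log(escapeHtml("<b>hi</b>"));                // &lt;b&gt;hi&lt;/b&gt;
```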

Q3: What HTTP status code should you return for validation errors?

A: Either 400 (Bad Request) or 422 (Unprocessable Entity). 400 indicates the server cannot process the request due to a client error. 422 is more specific — the request is syntactically correct (valid JSON) but semantically invalid (data fails business rules). Most APIs use 400 for simplicity and consistency.
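
To make the 400 vs 422 distinction concrete, here is a small sketch. The function statusForSignup and its age rule are hypothetical; an API that standardizes on 400 would simply return 400 in both failure branches.

```javascript
// Hypothetical handler logic: returns the status code a signup route might send.
function statusForSignup(rawBody) {
  let data;
  try {
    data = JSON.parse(rawBody);
  } catch {
    return 400; // malformed JSON: the request cannot even be parsed
  }
  const ageOk =
    data !== null &&
    typeof data === "object" &&
    Number.isInteger(data.age) &&
    data.age >= 18;
  if (!ageOk) {
    return 422; // well-formed JSON that fails a rule
  }
  return 201;
}

console.log(statusForSignup("{ age: "));        // 400
console.log(statusForSignup('{ "age": 12 }'));  // 422
console.log(statusForSignup('{ "age": 30 }'));  // 201
```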

Q4: How does express-validator work under the hood?

A: express-validator uses middleware functions that run in Express's request pipeline. Each validator (like body('email').isEmail()) returns a middleware that reads the specified field from the request, runs validation checks, and attaches the result to the request object. validationResult(req) then collects all results for processing.
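
The following toy model illustrates the idea of "a chain that is itself a middleware". It is a simplified sketch and not express-validator's real implementation; toyBody, toyValidationResult, and the req._validationErrors property are hypothetical names chosen to avoid confusion with the library's API.

```javascript
// Toy model only: each chain is a middleware function with chainable methods.
function toyBody(field) {
  const checks = [];

  const chain = (req, res, next) => {
    const value = req.body ? req.body[field] : undefined;
    req._validationErrors = req._validationErrors || [];
    for (const { test, msg } of checks) {
      if (!test(value)) {
        req._validationErrors.push({ path: field, msg });
        break; // this sketch reports only the first failure per field
      }
    }
    next();
  };

  chain.isEmail = () => {
    checks.push({
      test: (v) => typeof v === "string" && /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(v),
      msg: "Invalid email",
    });
    return chain;
  };

  chain.isLength = ({ min = 0 }) => {
    checks.push({
      test: (v) => typeof v === "string" && v.length >= min,
      msg: `Must be at least ${min} characters`,
    });
    return chain;
  };

  return chain;
}

// Collects whatever the middlewares attached to the request.
function toyValidationResult(req) {
  const errors = req._validationErrors || [];
  return { isEmpty: () => errors.length === 0, array: () => errors };
}

// Simulate Express running two middlewares against one request.
const req = { body: { email: "nope", password: "abc" } };
const middlewares = [
  toyBody("email").isEmail(),
  toyBody("password").isLength({ min: 8 }),
];
for (const mw of middlewares) mw(req, {}, () => {});

console.log(toyValidationResult(req).array());
```

Both fields fail here, so the result contains one error for email and one for password.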

Intermediate Level

Q5: How would you prevent NoSQL injection in a MongoDB/Express application?

Validate that inputs are the expected types (strings, numbers) — not objects or arrays
Use express-validator's .isString() to ensure query parameters are strings
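Rely on Mongoose's schema types to cast or reject unexpected types on writes; for query filters, where operator objects such as { $ne: null } are otherwise accepted, consider enabling the sanitizeFilter option (Mongoose 6+)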
Never pass raw req.body or req.query directly to MongoDB queries
Build query objects field by field from validated values (MongoDB has no SQL-style parameterized queries), and avoid $where with user input
Consider using mongo-sanitize to strip keys starting with $
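
A short sketch of two of the defenses above: type checks and stripping $-prefixed keys. The helpers requireString, stripOperators, and buildUserFilter are hypothetical; stripOperators only imitates the idea behind sanitizer packages and is not their actual code.

```javascript
// Attack payload a client could send as JSON:
//   { "username": { "$ne": null }, "password": { "$ne": null } }
// Passed straight into a findOne() filter, { $ne: null } matches any document.

// Hypothetical helper: insist on a plain string.
function requireString(value, field) {
  if (typeof value !== "string") {
    throw new TypeError(`${field} must be a string`);
  }
  return value;
}

// Hypothetical helper: recursively drop keys that start with "$".
function stripOperators(input) {
  if (Array.isArray(input)) return input.map(stripOperators);
  if (input !== null && typeof input === "object") {
    const out = {};
    for (const [key, val] of Object.entries(input)) {
      if (key.startsWith("$")) continue;
      out[key] = stripOperators(val);
    }
    return out;
  }
  return input;
}

// Build the filter explicitly instead of passing req.body through.
function buildUserFilter(reqBody) {
  return { username: requireString(reqBody.username, "username") };
}

const malicious = { username: { $ne: null }, password: { $ne: null } };

console.log(stripOperators(malicious)); // { username: {}, password: {} }
try {
  buildUserFilter(malicious);
} catch (err) {
  console.log(err.message); // username must be a string
}
console.log(buildUserFilter({ username: "alice" })); // { username: 'alice' }
```

Stripping operators neutralizes the payload but leaves empty objects behind, which is why the type check is still the primary defense.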

Q6: Compare Zod and express-validator. When would you choose one over the other?

express-validator: Best for Express-only projects, especially JavaScript (not TypeScript). Integrates directly as middleware, has built-in sanitizers, and is familiar to Express developers.
Zod: Best for TypeScript projects or when you need framework-agnostic schemas. Provides type inference (z.infer), works in both browser and server, and enables sharing schemas between frontend and backend.
Choose express-validator for simple Express APIs. Choose Zod for TypeScript projects, full-stack shared validation, or when building schemas used outside Express.

Q7: What is the "mass assignment" vulnerability and how do you prevent it?

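A: Mass assignment occurs when an API blindly accepts all fields from req.body and passes them to the database. An attacker can add fields like role: "admin" or verified: true. Prevention: (1) validate and whitelist expected fields only, (2) use pick or destructuring to extract only allowed fields, (3) keep Mongoose's strict mode enabled so fields outside the schema are dropped, and set privileged fields such as role only in server code. Note that Mongoose's select controls which fields are returned by queries, not which fields a client may write.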
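
A minimal whitelist sketch. The pick helper and the ALLOWED_PROFILE_FIELDS list are hypothetical; libraries such as lodash offer an equivalent pick function.

```javascript
// Hypothetical whitelist for a profile-update endpoint.
const ALLOWED_PROFILE_FIELDS = ["name", "email", "bio"];

// Copy only the whitelisted keys that are actually present on the source.
function pick(source, allowedKeys) {
  const result = {};
  for (const key of allowedKeys) {
    if (Object.prototype.hasOwnProperty.call(source, key)) {
      result[key] = source[key];
    }
  }
  return result;
}

// The attacker adds privileged fields to an otherwise normal update.
const reqBody = { name: "Mallory", role: "admin", verified: true };
const update = pick(reqBody, ALLOWED_PROFILE_FIELDS);

console.log(update); // { name: 'Mallory' }
```

Only the update object, never reqBody itself, would then be passed to the database call.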

Q8: How do you handle async validation (like checking email uniqueness) in express-validator and Zod?

express-validator: Use .custom(async (value) => { ... }). express-validator awaits the returned promise and marks the field invalid if it rejects (for example, when the callback throws).
Zod: Use .refine(async (value) => { ... }) — then you must call schema.parseAsync() or schema.safeParseAsync() instead of the sync versions.
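
The two callback contracts differ slightly, which this dependency-free sketch tries to show. The in-memory registeredEmails set stands in for a real database lookup, and all function names are hypothetical. A uniqueness check like this is best-effort only; a unique index in the database is still needed to close the race between check and insert.

```javascript
// Hypothetical stand-in for a database lookup on the users collection.
const registeredEmails = new Set(["taken@example.com"]);
async function emailExists(email) {
  return registeredEmails.has(email);
}

// Style 1: reject on failure (the contract of an express-validator .custom() callback).
async function assertEmailUnique(value) {
  if (await emailExists(value)) {
    throw new Error("Email already in use");
  }
  return true;
}

// Style 2: resolve to a boolean (the contract of a Zod .refine() predicate).
async function isEmailUnique(value) {
  return !(await emailExists(value));
}

async function demo() {
  const results = [];
  try {
    await assertEmailUnique("taken@example.com");
  } catch (err) {
    results.push(err.message);
  }
  results.push(await isEmailUnique("new@example.com"));
  // Pitfall: without await the result is a Promise, which is always truthy.
  results.push(Boolean(isEmailUnique("taken@example.com")));
  return results;
}

demo().then((results) => console.log(results));
// [ 'Email already in use', true, true ]
```

The last entry is the reason Zod requires parseAsync() or safeParseAsync() for async refinements: a synchronous parse cannot wait for the promise to settle.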

Q9: Design a consistent API error response format. What fields should it include?

A: A good format includes:

success: false — quick boolean check for the client
error.code — machine-readable code like VALIDATION_ERROR, NOT_FOUND
error.message — human-readable description
error.details — field-level errors for validation (array or object)
Optionally: error.requestId for debugging, error.params for i18n interpolation
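
One possible shape for such a response, sketched as a small builder. The function name buildErrorResponse and the sample values are assumptions; only the field names come from the list above.

```javascript
// Hypothetical builder for the error envelope described above.
function buildErrorResponse({ code, message, details = [], requestId }) {
  const body = { success: false, error: { code, message } };
  if (details.length > 0) body.error.details = details; // field-level errors
  if (requestId) body.error.requestId = requestId;      // optional, for debugging
  return body;
}

const response = buildErrorResponse({
  code: "VALIDATION_ERROR",
  message: "Request validation failed",
  details: [{ field: "email", message: "Invalid email" }],
  requestId: "req-123", // sample value
});

console.log(JSON.stringify(response, null, 2));
```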

Advanced Level

Q10: How would you implement validation for a multi-tenant SaaS application where different tenants have different validation rules?

A: Store tenant-specific validation rules in the database (e.g., required fields, field length limits, allowed values). Create a middleware factory that loads the tenant's rules and dynamically builds validation schemas. With Zod, you can compose schemas programmatically. With express-validator, you can build checkSchema() objects dynamically. Cache schemas per tenant to avoid rebuilding on every request.
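
A library-free sketch of the middleware factory with a per-tenant cache. Everything here is an assumption for illustration: the rule format, the TENANT_RULES object standing in for the database, and req.tenantId being set by earlier auth middleware. With Zod or checkSchema(), compileRules would build a schema instead of a plain function.

```javascript
// Hypothetical rule store; in practice this would be a database query.
const TENANT_RULES = {
  acme: { name: { required: true, maxLength: 10 }, plan: { allowed: ["free", "pro"] } },
  globex: { name: { required: true, maxLength: 50 } },
};
async function loadTenantRules(tenantId) {
  return TENANT_RULES[tenantId] || {};
}

// Turn stored rules into a validate(data) function that returns a list of errors.
function compileRules(rules) {
  return (data) => {
    const errors = [];
    for (const [field, rule] of Object.entries(rules)) {
      const value = data[field];
      if (value === undefined || value === "") {
        if (rule.required) errors.push({ field, message: "is required" });
        continue;
      }
      if (rule.maxLength !== undefined && String(value).length > rule.maxLength) {
        errors.push({ field, message: `must be at most ${rule.maxLength} characters` });
      }
      if (rule.allowed && !rule.allowed.includes(value)) {
        errors.push({ field, message: `must be one of: ${rule.allowed.join(", ")}` });
      }
    }
    return errors;
  };
}

// Cache compiled validators so rules are not rebuilt on every request.
const validatorCache = new Map();
async function getTenantValidator(tenantId) {
  if (!validatorCache.has(tenantId)) {
    validatorCache.set(tenantId, compileRules(await loadTenantRules(tenantId)));
  }
  return validatorCache.get(tenantId);
}

// Express-style middleware factory; assumes req.tenantId is already set.
function validateForTenant() {
  return async (req, res, next) => {
    try {
      const validate = await getTenantValidator(req.tenantId);
      const errors = validate(req.body || {});
      if (errors.length > 0) {
        return res.status(400).json({
          success: false,
          error: { code: "VALIDATION_ERROR", details: errors },
        });
      }
      next();
    } catch (err) {
      next(err); // forward unexpected failures to the error handler
    }
  };
}

getTenantValidator("acme").then((validate) => {
  console.log(validate({ name: "a very long name", plan: "gold" }));
});
```

The cache in this sketch never expires, so a real system would also need to invalidate an entry when a tenant's rules change.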

Q11: How do you handle validation errors in a microservices architecture?

Each service validates its own inputs independently
Use a shared error format library across all services
API gateways can perform preliminary checks (rate limiting, auth, basic format validation)
When Service A calls Service B, validation errors from B should be propagated back with context
Use correlation IDs to trace validation failures across services
Consider using Protocol Buffers or JSON Schema as shared schema definitions

Q12: Explain the ESR (Equality, Sort, Range) rule and how it relates to validating query parameters for MongoDB.

A: ESR is a rule for ordering fields in compound indexes: Equality fields first, then Sort fields, then Range fields. When validating query parameters, you should ensure that: (1) equality filters are validated as exact types (string, ObjectId), (2) sort fields are from an allowed whitelist, (3) range values (min/max) are validated as the correct numeric types. Validating parameters this way keeps incoming queries within the shapes your compound indexes were designed for. It does not by itself guarantee index use, so confirm with explain().
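
A sketch of the three checks, assuming a hypothetical products collection with the compound index { status: 1, createdAt: -1, price: 1 }. The parameter names (status, sortBy, order, minPrice, maxPrice) and the allowed values are illustrative.

```javascript
// Assumed compound index for this sketch: { status: 1, createdAt: -1, price: 1 }
// i.e. Equality (status), Sort (createdAt), Range (price).
const ALLOWED_STATUSES = ["active", "archived"];
const ALLOWED_SORT_FIELDS = ["createdAt"];

// Query-string values normally arrive as strings, but can be objects or arrays
// when a client sends something like ?minPrice[$gt]=0 and an extended parser is used.
function parseNumberParam(raw) {
  if (typeof raw !== "string" || raw.trim() === "") return NaN;
  return Number(raw);
}

function buildProductQuery(query) {
  const errors = [];
  const filter = {};
  let sort = { createdAt: -1 };

  // 1. Equality: exact type plus allowed values.
  if (typeof query.status !== "string" || !ALLOWED_STATUSES.includes(query.status)) {
    errors.push("status must be one of: " + ALLOWED_STATUSES.join(", "));
  } else {
    filter.status = query.status;
  }

  // 2. Sort: only whitelisted fields.
  if (query.sortBy !== undefined) {
    if (!ALLOWED_SORT_FIELDS.includes(query.sortBy)) {
      errors.push("sortBy must be one of: " + ALLOWED_SORT_FIELDS.join(", "));
    } else {
      sort = { [query.sortBy]: query.order === "asc" ? 1 : -1 };
    }
  }

  // 3. Range: numeric bounds only.
  const range = {};
  for (const [param, operator] of [["minPrice", "$gte"], ["maxPrice", "$lte"]]) {
    if (query[param] === undefined) continue;
    const n = parseNumberParam(query[param]);
    if (Number.isFinite(n)) range[operator] = n;
    else errors.push(param + " must be a number");
  }
  if (Object.keys(range).length > 0) filter.price = range;

  return errors.length > 0 ? { errors } : { filter, sort };
}

console.log(buildProductQuery({ status: "active", order: "asc", sortBy: "createdAt", minPrice: "10" }));
console.log(buildProductQuery({ status: { $ne: null }, minPrice: { $gt: "" } }));
```

The first call yields a filter and sort that match the assumed index; the second returns two errors instead of a query.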

Q13: How would you build a validation system that shares schemas between a React frontend and Express backend?

A: Use Zod schemas in a shared package:

Create a shared @myapp/schemas package with Zod schemas
Use z.infer<typeof Schema> to generate TypeScript types
Frontend imports schemas for form validation (with React Hook Form's zodResolver)
Backend imports the same schemas for API validation middleware
Any schema change is automatically reflected in both layers
Build and publish the shared package, or use a monorepo with workspaces

Q14: What are the security implications of returning validation error details in API responses?

Never include the rejected value for sensitive fields (passwords, tokens)
Detailed field names can reveal your data model to attackers
Error messages like "email not found" vs "wrong password" enable user enumeration
For auth endpoints, use generic messages: "Invalid credentials"
For public registration, "email already registered" is a necessary UX trade-off but is technically information disclosure
In production, never include stack traces or internal error details
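
A sketch of three of these points: omitting rejected values for sensitive fields, returning one generic auth failure, and hiding stack traces in production. The field list and function names are hypothetical.

```javascript
// Hypothetical list of fields whose rejected values must never be echoed back.
const SENSITIVE_FIELDS = new Set(["password", "token", "apiKey"]);

// Drop the rejected value for sensitive fields before sending errors to the client.
function toPublicErrors(validationErrors) {
  return validationErrors.map(({ field, message, value }) =>
    SENSITIVE_FIELDS.has(field) ? { field, message } : { field, message, value }
  );
}

// Same response whether the email is unknown or the password is wrong.
function loginFailure() {
  return {
    status: 401,
    body: {
      success: false,
      error: { code: "INVALID_CREDENTIALS", message: "Invalid credentials" },
    },
  };
}

// Include the stack trace only outside production.
function toErrorBody(err, env) {
  const body = {
    success: false,
    error: { code: "INTERNAL_ERROR", message: "Something went wrong" },
  };
  if (env !== "production") body.error.stack = err.stack;
  return body;
}

console.log(
  toPublicErrors([
    { field: "email", message: "Invalid email", value: "nope" },
    { field: "password", message: "Too short", value: "hunter2" },
  ])
);
console.log("stack" in toErrorBody(new Error("db down"), "production").error); // false
```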

Q15: How do you validate request bodies that can be very large (e.g., bulk import of 10,000 records)?

Set an explicit body size limit, e.g. express.json({ limit: '10mb' }); the default is 100kb, and larger requests are rejected with 413 (Payload Too Large)
Validate the array length first before iterating
Use streaming validation for very large payloads (process records in chunks)
Consider accepting file uploads (CSV) instead of JSON for bulk data
Run async checks concurrently with Promise.all, but in bounded batches, and batch database lookups (e.g., one $in query per chunk instead of one query per record)
Return partial success responses: { success: 8500, failed: 1500, errors: [...] }
For very large imports, use a queue-based approach (accept, validate async, notify)
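
A sketch of the length check, chunked validation, and partial-success summary from the list above. MAX_RECORDS, CHUNK_SIZE, the record rules, and the saveBatch callback are all assumptions; saveBatch stands for whatever persists one chunk, such as an insertMany() call.

```javascript
const MAX_RECORDS = 10000;
const CHUNK_SIZE = 500;

// Hypothetical per-record rules for this sketch.
function validateRecord(record) {
  if (record === null || typeof record !== "object") {
    return ["record must be an object"];
  }
  const problems = [];
  if (typeof record.name !== "string" || record.name.trim() === "") {
    problems.push("name is required");
  }
  if (typeof record.email !== "string" || !record.email.includes("@")) {
    problems.push("email is invalid");
  }
  return problems;
}

async function importRecords(records, saveBatch) {
  // Check the overall shape and size before touching individual records.
  if (!Array.isArray(records)) throw new TypeError("records must be an array");
  if (records.length > MAX_RECORDS) {
    throw new RangeError(`at most ${MAX_RECORDS} records per request`);
  }

  // "success" is a count here, matching the partial-success shape in the text.
  const summary = { success: 0, failed: 0, errors: [] };

  for (let start = 0; start < records.length; start += CHUNK_SIZE) {
    const valid = [];
    records.slice(start, start + CHUNK_SIZE).forEach((record, offset) => {
      const problems = validateRecord(record);
      if (problems.length === 0) {
        valid.push(record);
      } else {
        summary.failed += 1;
        summary.errors.push({ index: start + offset, problems });
      }
    });
    if (valid.length > 0) await saveBatch(valid); // one database call per chunk
    summary.success += valid.length;
  }
  return summary;
}

const saved = [];
importRecords(
  [{ name: "Ann", email: "ann@example.com" }, { name: "", email: "x" }, null],
  async (batch) => { saved.push(...batch); }
).then((summary) => console.log(JSON.stringify(summary)));
```

For a real 10,000-record import the errors array may itself need a cap, so the response does not grow as large as the request.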