Episode 4 — Generative AI Engineering / 4.6 — Schema Validation with Zod
4.6.d — Handling Invalid Responses
In one sentence: AI responses fail in predictable ways — extra text around JSON, wrong types, missing fields, unexpected structures — and each failure mode has a specific recovery strategy that ranges from JSON extraction and type coercion to default values and graceful degradation.
Navigation: ← 4.6.c Verifying AI Responses · 4.6.e — Retry Strategies →
1. Common AI Response Failures
AI models fail in predictable patterns. Understanding these patterns lets you build targeted recovery strategies.
┌─────────────────────────────────────────────────────────────────────┐
│ AI RESPONSE FAILURE TAXONOMY │
│ │
│ Category 1: NOT VALID JSON │
│ ──────────────────────── │
│ • Extra text before/after JSON (most common) │
│ • Markdown code fences around JSON │
│ • Truncated JSON (context limit or max_tokens hit) │
│ • Completely non-JSON response ("I'd be happy to help...") │
│ │
│ Category 2: VALID JSON, WRONG SHAPE │
│ ───────────────────────────────── │
│ • Missing required fields │
│ • Extra unexpected fields (usually harmless) │
│ • Wrong nesting level (flat vs nested) │
│ • Array instead of object (or vice versa) │
│ │
│ Category 3: RIGHT SHAPE, WRONG TYPES │
│ ──────────────────────────────────── │
│ • Number as string ("82" instead of 82) │
│ • Boolean as string ("true" instead of true) │
│ • Enum value not in allowed list │
│ • null where a value is expected │
│ │
│ Category 4: RIGHT SHAPE AND TYPES, WRONG VALUES │
│ ──────────────────────────────────────────────── │
│ • Number out of range (confidence: 95 instead of 0.95) │
│ • String too short/long │
│ • Array with wrong number of elements │
│ • Cross-field inconsistency (score 20 but label "excellent") │
└─────────────────────────────────────────────────────────────────────┘
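To choose a recovery strategy for Category 1 failures, it helps to know which sub-case you are looking at. The following is a minimal sketch, not part of the original lesson: `diagnoseParseFailure`, `parses`, and the label names are hypothetical, and the heuristics are deliberately simple (for instance, any response containing a code fence is labelled `code_fence` without checking what is inside it).

```typescript
// Hypothetical helper: labels why a raw response is not directly parseable.
type ParseDiagnosis =
  | 'clean_json'
  | 'code_fence'
  | 'extra_text'
  | 'truncated_or_malformed'
  | 'not_json';

function parses(candidate: string): boolean {
  try {
    JSON.parse(candidate);
    return true;
  } catch {
    return false;
  }
}

function diagnoseParseFailure(raw: string): ParseDiagnosis {
  const text = raw.trim();
  if (parses(text)) return 'clean_json';

  // Markdown fences are checked first because they are easy to recognize.
  if (text.includes('```')) return 'code_fence';

  // No opening brace or bracket at all: the model answered in prose.
  const first = text.search(/[{\[]/);
  if (first === -1) return 'not_json';

  // If the span from the first opener to the last closer parses,
  // the only problem was the surrounding text.
  const last = Math.max(text.lastIndexOf('}'), text.lastIndexOf(']'));
  if (last > first && parses(text.slice(first, last + 1))) return 'extra_text';

  // An opener exists but nothing parses: cut off or otherwise broken.
  return 'truncated_or_malformed';
}
```

The label can be stored alongside the raw response so that failure statistics show which Category 1 sub-case dominates.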
2. Extracting JSON from Text with Extra Content
This is the single most common AI failure: the model wraps its JSON in explanatory text.
Common patterns
Pattern 1: Text before JSON
────────────────────────────
"Sure! Here's the analysis:
{"sentiment": "positive", "confidence": 0.92}"
Pattern 2: Markdown code fences
────────────────────────────────
"```json
{"sentiment": "positive", "confidence": 0.92}
```"
Pattern 3: Text before AND after JSON
──────────────────────────────────────
"Based on my analysis, the result is:
{"sentiment": "positive", "confidence": 0.92}
I hope this helps!"
Pattern 4: Multiple JSON objects (take the first or last)
──────────────────────────────────────────────────────────
"Step 1 output: {"intermediate": true}
Final output: {"sentiment": "positive", "confidence": 0.92}"
JSON extraction utility
/**
* Extract JSON from a string that may contain extra text.
* Tries multiple strategies in order of reliability.
*/
function extractJSON(text: string): unknown {
// Strategy 1: Direct parse (fastest path — handles clean JSON)
try {
return JSON.parse(text);
} catch {
// Not clean JSON, try extraction strategies
}
// Strategy 2: Remove markdown code fences
const fenceMatch = text.match(/```(?:json)?\s*\n?([\s\S]*?)\n?\s*```/);
if (fenceMatch) {
try {
return JSON.parse(fenceMatch[1].trim());
} catch {
// Fenced content is not valid JSON either
}
}
// Strategy 3: Greedy match from the first { to the last } (or from the first [ to the last ])
const jsonMatch = text.match(/(\{[\s\S]*\}|\[[\s\S]*\])/);
if (jsonMatch) {
try {
return JSON.parse(jsonMatch[1]);
} catch {
// Matched a span, but it is not valid JSON (for example, two separate objects with prose between them)
}
}
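// Strategy 4: Try each non-greedy { ... } match, last one first (the final object is often the real output).
// Note: a non-greedy match stops at the first }, so it cannot capture objects that contain nested objects.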
const allObjects = [...text.matchAll(/\{[\s\S]*?\}/g)];
for (let i = allObjects.length - 1; i >= 0; i--) {
try {
return JSON.parse(allObjects[i][0]);
} catch {
continue;
}
}
// All strategies failed
throw new Error(`Could not extract JSON from response: ${text.substring(0, 200)}...`);
}
// Usage
const raw1 = 'Sure! Here is the result: {"sentiment": "positive", "score": 0.95}';
const parsed1 = extractJSON(raw1);
// { sentiment: 'positive', score: 0.95 }
const raw2 = '```json\n{"sentiment": "positive", "score": 0.95}\n```';
const parsed2 = extractJSON(raw2);
// { sentiment: 'positive', score: 0.95 }
More robust extraction with balanced brace matching
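The regex approach above has two blind spots. The greedy match in Strategy 3 spans from the first { to the last }, so it breaks when the text contains more than one object. The non-greedy match in Strategy 4 stops at the first }, so it cannot capture an object that contains nested objects. A scanner that tracks brace depth and string state avoids both problems: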
function extractJSONRobust(text: string): unknown {
// Try direct parse first
try {
return JSON.parse(text);
} catch {
// Continue to extraction
}
// Remove markdown code fences
const cleaned = text
.replace(/```json\s*\n?/g, '')
.replace(/```\s*\n?/g, '')
.trim();
try {
return JSON.parse(cleaned);
} catch {
// Continue
}
// Find balanced braces, starting from whichever opener ({ or [) appears first
const objectStart = cleaned.indexOf('{');
const arrayStart = cleaned.indexOf('[');
if (objectStart === -1 && arrayStart === -1) {
throw new Error('No JSON object or array found in response');
}
if (arrayStart !== -1 && (objectStart === -1 || arrayStart < objectStart)) {
return extractBalanced(cleaned, arrayStart, '[', ']');
}
return extractBalanced(cleaned, objectStart, '{', '}');
}
function extractBalanced(
text: string,
startIndex: number,
openChar: string,
closeChar: string,
): unknown {
let depth = 0;
let inString = false;
let escaped = false;
for (let i = startIndex; i < text.length; i++) {
const char = text[i];
if (escaped) {
escaped = false;
continue;
}
if (char === '\\') {
escaped = true;
continue;
}
if (char === '"') {
inString = !inString;
continue;
}
if (inString) continue;
if (char === openChar) depth++;
if (char === closeChar) depth--;
if (depth === 0) {
const jsonString = text.substring(startIndex, i + 1);
return JSON.parse(jsonString);
}
}
throw new Error('Unbalanced braces in JSON');
}
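Pattern 4 (several JSON objects in one response) is not fully covered by the scanner above, which returns only the first balanced block. As a hedged sketch, the same depth-tracking idea can collect every top-level object so the caller can pick the first or the last. `extractAllJSONObjects` is a hypothetical name, and the function assumes that any { in the surrounding prose starts a JSON object; a stray unmatched brace in the prose would confuse it.

```typescript
// Hypothetical helper: returns every top-level {...} block that parses as JSON.
function extractAllJSONObjects(text: string): unknown[] {
  const results: unknown[] = [];
  let depth = 0;
  let start = -1;
  let inString = false;
  let escaped = false;

  for (let i = 0; i < text.length; i++) {
    const char = text[i];

    // Outside any candidate object: ignore everything except an opening brace.
    if (depth === 0) {
      if (char === '{') {
        depth = 1;
        start = i;
        inString = false;
        escaped = false;
      }
      continue;
    }

    if (escaped) {
      escaped = false;
      continue;
    }
    if (inString) {
      if (char === '\\') escaped = true;
      else if (char === '"') inString = false;
      continue;
    }

    if (char === '"') inString = true;
    else if (char === '{') depth++;
    else if (char === '}') {
      depth--;
      if (depth === 0) {
        try {
          results.push(JSON.parse(text.slice(start, i + 1)));
        } catch {
          // Balanced braces but not valid JSON: skip this candidate.
        }
      }
    }
  }
  return results;
}
```

Taking the last element of the returned array corresponds to the "final output" choice in Pattern 4; taking the first corresponds to the behavior of extractJSONRobust.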
Integrating extraction into the validation pipeline
import { z } from 'zod';
async function validateAIOutput<T>(
rawContent: string,
schema: z.ZodSchema<T>,
): Promise<{ success: true; data: T } | { success: false; errors: string[] }> {
// Step 1: Extract JSON (handles extra text)
let parsed: unknown;
try {
parsed = extractJSON(rawContent);
} catch (err) {
return {
success: false,
errors: [`JSON extraction failed: ${(err as Error).message}`],
};
}
// Step 2: Validate with Zod
const result = schema.safeParse(parsed);
if (result.success) {
return { success: true, data: result.data };
}
return {
success: false,
errors: result.error.issues.map((i) => `${i.path.join('.')}: ${i.message}`),
};
}
3. Type Coercion Strategies
AI models frequently return the right value in the wrong type. Here are targeted strategies for each case.
String "82" to number 82
import { z } from 'zod';
// Option 1: z.coerce (simplest)
const Score1 = z.coerce.number();
Score1.parse('82'); // 82
Score1.parse(82); // 82
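// Option 2: Union with transform (explicit)
const Score2 = z.union([
z.number(),
z.string().transform((val, ctx) => {
const num = Number(val);
if (val.trim() === '' || Number.isNaN(num)) {
// Report the problem as a Zod issue instead of throwing, so safeParse() still returns a result
ctx.addIssue({ code: 'custom', message: `Cannot convert "${val}" to number` });
return z.NEVER;
}
return num;
}),
]);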
// Option 3: Preprocess (runs BEFORE schema validation)
const Score3 = z.preprocess(
(val) => (typeof val === 'string' ? Number(val) : val),
z.number().min(0).max(100),
);
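One caveat with Options 1 and 3: both rely on JavaScript's Number(), which converts an empty string to 0, and z.coerce.number() also converts null to 0. A missing score can therefore silently become a valid-looking 0. The sketch below shows a stricter conversion in plain TypeScript; `toStrictNumber` is a hypothetical helper, not a Zod API.

```typescript
// Hypothetical helper: converts numeric strings, refuses everything ambiguous.
function toStrictNumber(val: unknown): number | undefined {
  if (typeof val === 'number') return Number.isFinite(val) ? val : undefined;
  if (typeof val !== 'string') return undefined; // null, booleans, arrays, objects

  const trimmed = val.trim();
  if (trimmed === '') return undefined; // Number('') would give 0

  const num = Number(trimmed);
  return Number.isFinite(num) ? num : undefined; // rejects NaN and Infinity
}

toStrictNumber('82');   // 82
toStrictNumber(' 7.5 '); // 7.5
toStrictNumber('');     // undefined
toStrictNumber(null);   // undefined
toStrictNumber('82%');  // undefined
```

Under these assumptions, the helper could be passed as the first argument to z.preprocess in place of the inline arrow function in Option 3, so that unconvertible input reaches z.number() as undefined and is rejected.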
String "true"/"false" to boolean
// z.coerce.boolean() is DANGEROUS — any non-empty string becomes true
// "false" → true (because "false" is truthy in JS)
// Safe approach:
const SafeBoolean = z.union([
z.boolean(),
z.literal('true').transform(() => true),
z.literal('false').transform(() => false),
z.literal('yes').transform(() => true),
z.literal('no').transform(() => false),
z.literal(1).transform(() => true),
z.literal(0).transform(() => false),
]);
SafeBoolean.parse(true); // true
SafeBoolean.parse('false'); // false (correct!)
SafeBoolean.parse('yes'); // true
SafeBoolean.parse(0); // false
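The literals in SafeBoolean are case-sensitive and whitespace-sensitive, so values such as "True" or " yes " would be rejected. A possible way to widen the net is to normalize the value first. The sketch below is plain TypeScript; `parseLooseBoolean` and the word lists are hypothetical and should be adjusted to what your model actually emits.

```typescript
// Hypothetical word lists: extend or shrink to match observed model output.
const TRUE_WORDS = new Set(['true', 'yes', '1']);
const FALSE_WORDS = new Set(['false', 'no', '0']);

// Returns a boolean when the value is recognizable, otherwise the value unchanged
// so that a downstream boolean schema can reject it with a normal error.
function parseLooseBoolean(val: unknown): unknown {
  if (typeof val === 'boolean') return val;
  if (val === 1) return true;
  if (val === 0) return false;
  if (typeof val === 'string') {
    const word = val.trim().toLowerCase();
    if (TRUE_WORDS.has(word)) return true;
    if (FALSE_WORDS.has(word)) return false;
  }
  return val;
}

parseLooseBoolean('True');    // true
parseLooseBoolean(' FALSE '); // false
parseLooseBoolean('maybe');   // 'maybe' (left for the schema to reject)
```

Because unrecognized input is returned unchanged, the function fits the z.preprocess pattern: recognized values become booleans, and anything else still fails validation instead of being guessed.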
Confidence 95 to 0.95 (wrong scale)
const ConfidenceSchema = z.number().transform((val) => {
// If the number is > 1, assume it's a percentage and convert
if (val > 1 && val <= 100) {
return val / 100;
}
return val;
}).pipe(z.number().min(0).max(1));
ConfidenceSchema.parse(0.95); // 0.95
ConfidenceSchema.parse(95); // 0.95
ConfidenceSchema.parse(0.5); // 0.5
ConfidenceSchema.parse(50); // 0.5
ConfidenceSchema.parse(150); // ✗ throws: 150 is above 100, so it is not rescaled and fails max(1)
Comma-separated string to array
const TagsSchema = z.union([
z.array(z.string()),
z.string().transform((val) =>
val.split(',').map((t) => t.trim()).filter((t) => t.length > 0)
),
]);
TagsSchema.parse(['a', 'b', 'c']); // ['a', 'b', 'c']
TagsSchema.parse('machine learning, nlp, ai'); // ['machine learning', 'nlp', 'ai']
Building a flexible AI response schema
const FlexibleAISchema = z.object({
// Handle mixed-type confidence
confidence: z.preprocess(
(val) => {
if (typeof val === 'string') return parseFloat(val);
return val;
},
z.number().transform((n) => (n > 1 ? n / 100 : n)).pipe(z.number().min(0).max(1)),
),
// Handle boolean as string
is_reliable: z.preprocess(
(val) => {
if (val === 'true' || val === 'yes' || val === 1) return true;
if (val === 'false' || val === 'no' || val === 0) return false;
return val;
},
z.boolean(),
),
// Handle tags as string or array
categories: z.preprocess(
(val) => {
if (typeof val === 'string') return val.split(',').map((s) => s.trim());
return val;
},
z.array(z.string()),
),
});
4. Default Values for Missing Fields
When the AI omits a field, you can provide sensible defaults instead of failing.
Simple defaults
const AnalysisSchema = z.object({
sentiment: z.string(),
confidence: z.number().default(0),
language: z.string().default('unknown'),
tags: z.array(z.string()).default([]),
metadata: z.object({
model: z.string().default('unknown'),
version: z.string().default('1.0'),
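}).default({}), // entire nested object defaults to {}; Zod 3 then parses {} through the inner schema, which fills model and version (in Zod 4, .prefault({}) gives this behavior)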
});
// AI returns minimal response
const result = AnalysisSchema.parse({
sentiment: 'positive',
// everything else is missing
});
console.log(result);
// {
// sentiment: 'positive',
// confidence: 0,
// language: 'unknown',
// tags: [],
// metadata: { model: 'unknown', version: '1.0' }
// }
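Defaults applied by the schema are invisible in the parsed result: a confidence of 0 that the model reported looks the same as a confidence of 0 that was filled in. One way to keep that distinction is to record which fields were absent before parsing. The sketch below is plain TypeScript; `findDefaultedFields` is a hypothetical helper, and the field list is maintained by hand as an assumption of this example.

```typescript
// Hypothetical helper: lists the fields that a schema default would fill in.
// A field counts as missing only when it is undefined, which matches the
// condition under which Zod's .default() applies (null does not trigger it).
function findDefaultedFields(
  input: unknown,
  fieldsWithDefaults: readonly string[],
): string[] {
  if (typeof input !== 'object' || input === null || Array.isArray(input)) {
    return []; // not an object: the schema will reject it anyway
  }
  const record = input as Record<string, unknown>;
  return fieldsWithDefaults.filter((field) => record[field] === undefined);
}

const missing = findDefaultedFields(
  { sentiment: 'positive', confidence: 0.8 },
  ['confidence', 'language', 'tags', 'metadata'],
);
// missing is ['language', 'tags', 'metadata']
```

The returned list can be attached to the result as warnings, so downstream code knows which values came from the model and which were filled in.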
Conditional defaults based on other fields
const ResponseSchema = z.object({
answer: z.string(),
confidence: z.number().min(0).max(1),
source: z.string().optional(),
}).transform((data) => ({
...data,
// If no source provided and confidence is high, mark as "model knowledge"
source: data.source || (data.confidence > 0.9 ? 'model_knowledge' : 'unverified'),
}));
ResponseSchema.parse({ answer: 'Paris', confidence: 0.99 });
// { answer: 'Paris', confidence: 0.99, source: 'model_knowledge' }
ResponseSchema.parse({ answer: 'Maybe Lisbon', confidence: 0.4 });
// { answer: 'Maybe Lisbon', confidence: 0.4, source: 'unverified' }
5. Graceful Degradation vs Hard Failure
Not all validation failures deserve the same response. Design a strategy based on severity.
The degradation ladder
Level 1: FULL SUCCESS
→ All fields valid, all constraints met
→ Use the data as-is
Level 2: PARTIAL SUCCESS
→ Core fields valid, optional fields missing or invalid
→ Use validated fields, apply defaults for the rest
Level 3: RECOVERABLE FAILURE
→ JSON is valid, some required fields wrong
→ Attempt type coercion, extraction, or transformation
→ If that works, use the recovered data with a warning flag
Level 4: RETRY-WORTHY FAILURE
→ Response is structurally wrong but the model can fix it
→ Retry with error feedback (see 4.6.e)
Level 5: HARD FAILURE
→ Response is completely unusable
→ Return an error to the user or use a fallback
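The ladder can be read as a decision function over the outcomes of each validation attempt. The following sketch is one possible mapping, not the only one: `ValidationOutcome`, `LadderAction`, and `chooseAction` are hypothetical names, and the rule "retry whenever retries remain" is a simplifying assumption.

```typescript
type LadderAction = 'use' | 'use_with_defaults' | 'use_with_warning' | 'retry' | 'fail';

// Hypothetical summary of what each recovery attempt produced.
interface ValidationOutcome {
  jsonExtracted: boolean; // could any JSON be pulled out of the response?
  strictValid: boolean;   // passed the strict schema
  lenientValid: boolean;  // passed a schema with defaults for optional fields
  coercedValid: boolean;  // passed only after type coercion
  retriesLeft: number;    // remaining retry budget
}

function chooseAction(outcome: ValidationOutcome): LadderAction {
  if (outcome.jsonExtracted) {
    if (outcome.strictValid) return 'use';                // Level 1
    if (outcome.lenientValid) return 'use_with_defaults'; // Level 2
    if (outcome.coercedValid) return 'use_with_warning';  // Level 3
  }
  if (outcome.retriesLeft > 0) return 'retry';            // Level 4
  return 'fail';                                          // Level 5
}
```

Keeping the decision in one pure function makes the policy easy to unit test and to change, for example to fail immediately on certain error types instead of retrying.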
Implementation
import { z } from 'zod';
// Strict schema — what we ideally want
const StrictSchema = z.object({
category: z.enum(['bug', 'feature', 'question']),
severity: z.enum(['low', 'medium', 'high', 'critical']),
summary: z.string().min(10),
tags: z.array(z.string()).min(1),
confidence: z.number().min(0).max(1),
});
// Lenient schema — minimum viable data
const LenientSchema = z.object({
category: z.enum(['bug', 'feature', 'question']),
severity: z.enum(['low', 'medium', 'high', 'critical']).default('medium'),
summary: z.string().min(1), // shorter minimum
tags: z.array(z.string()).default([]),
confidence: z.number().min(0).max(1).default(0),
});
type Classification = z.infer<typeof StrictSchema>;
interface ClassificationResult {
data: Classification;
quality: 'full' | 'partial' | 'degraded';
warnings: string[];
}
function classifyWithDegradation(rawContent: string): ClassificationResult | null {
// Step 1: Extract JSON
let parsed: unknown;
try {
parsed = extractJSON(rawContent);
} catch {
return null; // Hard failure — can't even extract JSON
}
// Step 2: Try strict validation
const strict = StrictSchema.safeParse(parsed);
if (strict.success) {
return { data: strict.data, quality: 'full', warnings: [] };
}
// Step 3: Try lenient validation
const lenient = LenientSchema.safeParse(parsed);
if (lenient.success) {
const warnings = strict.error.issues.map(
(i) => `Degraded: ${i.path.join('.')}: ${i.message}`
);
return {
data: lenient.data as Classification,
quality: 'partial',
warnings,
};
}
// Step 4: Try with type coercion
const CoercedSchema = z.object({
category: z.string().toLowerCase().pipe(
z.enum(['bug', 'feature', 'question'])
),
severity: z.string().toLowerCase().pipe(
z.enum(['low', 'medium', 'high', 'critical'])
).default('medium'),
summary: z.string().default('No summary provided'),
tags: z.union([
z.array(z.string()),
z.string().transform((s) => s.split(',').map((t) => t.trim())),
]).default([]),
confidence: z.coerce.number().min(0).max(1).default(0),
});
const coerced = CoercedSchema.safeParse(parsed);
if (coerced.success) {
return {
data: coerced.data as Classification,
quality: 'degraded',
warnings: [
...strict.error.issues.map((i) => `Original: ${i.path.join('.')}: ${i.message}`),
'Data was recovered via type coercion',
],
};
}
return null; // All recovery strategies failed
}
6. Logging Invalid Responses for Debugging
Invalid AI responses are gold for debugging and improving your system. Log them properly.
What to log
interface AIValidationLog {
// Identity
request_id: string;
timestamp: string;
// Input context
prompt_hash: string; // hash of system prompt (don't log full prompt — too large)
model: string;
temperature: number;
// Raw output
raw_response: string; // FULL raw response — essential for debugging
raw_response_length: number;
// Validation outcome
json_parseable: boolean;
json_parse_error: string | null;
zod_valid: boolean;
zod_errors: Array<{
path: string;
code: string;
message: string;
received?: string;
expected?: string;
}>;
// Recovery
recovery_attempted: boolean;
recovery_strategy: string | null; // 'extraction' | 'coercion' | 'defaults' | 'retry'
recovery_successful: boolean;
// Performance
api_latency_ms: number;
validation_latency_ms: number;
tokens_used: { input: number; output: number };
}
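The zod_errors array above is a flattened view of the validator's issues. As a sketch of how that mapping might look, the function below converts issue-like objects into the logged shape. `IssueLike`, `LoggedError`, and `toLoggedErrors` are hypothetical names; `IssueLike` is a structural stand-in modelled on Zod 3 issues, and fields such as received and expected are only present on some issue types and library versions.

```typescript
// Structural stand-in for a validator issue (assumption: Zod 3-like shape).
interface IssueLike {
  path: ReadonlyArray<string | number>;
  code: string;
  message: string;
  received?: unknown;
  expected?: unknown;
}

interface LoggedError {
  path: string;
  code: string;
  message: string;
  received?: string;
  expected?: string;
}

function toLoggedErrors(issues: ReadonlyArray<IssueLike>): LoggedError[] {
  return issues.map((issue) => {
    const entry: LoggedError = {
      // An empty path means the error is on the root value; label it explicitly
      // so it does not show up as an empty string in aggregations.
      path: issue.path.length > 0 ? issue.path.join('.') : '(root)',
      code: issue.code,
      message: issue.message,
    };
    if (issue.received !== undefined) entry.received = String(issue.received);
    if (issue.expected !== undefined) entry.expected = String(issue.expected);
    return entry;
  });
}
```

With a result from safeParse, the call would be along the lines of toLoggedErrors(result.error.issues), assuming the issue objects are compatible with IssueLike.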
Aggregation queries you should build
1. What % of AI responses fail validation?
→ SELECT AVG(CASE WHEN zod_valid = false THEN 1.0 ELSE 0 END) AS fail_rate FROM ai_validation_logs
2. Which fields fail most often?
→ SELECT path, COUNT(*) FROM ai_validation_errors GROUP BY path ORDER BY COUNT(*) DESC
3. What's the most common error type?
→ SELECT code, COUNT(*) FROM ai_validation_errors GROUP BY code ORDER BY COUNT(*) DESC
4. Does a specific model version have higher failure rates?
→ SELECT model, AVG(CASE WHEN zod_valid = false THEN 1.0 ELSE 0 END) AS fail_rate
FROM ai_validation_logs GROUP BY model
5. Are failure rates increasing over time?
→ SELECT DATE(timestamp) AS day, AVG(CASE WHEN zod_valid = false THEN 1.0 ELSE 0 END) AS fail_rate
FROM ai_validation_logs GROUP BY DATE(timestamp) ORDER BY day
Simple in-memory log aggregator
class ValidationMetrics {
private logs: AIValidationLog[] = [];
record(log: AIValidationLog): void {
this.logs.push(log);
// Alert if failure rate spikes
const recentLogs = this.logs.slice(-100);
const failRate = recentLogs.filter((l) => !l.zod_valid).length / recentLogs.length;
if (failRate > 0.1 && recentLogs.length >= 50) {
console.warn(
`[ALERT] AI validation failure rate is ${(failRate * 100).toFixed(1)}% ` +
`(last ${recentLogs.length} requests)`
);
}
}
getTopFailingFields(limit = 10): Array<{ path: string; count: number }> {
const counts = new Map<string, number>();
for (const log of this.logs) {
for (const error of log.zod_errors) {
counts.set(error.path, (counts.get(error.path) || 0) + 1);
}
}
return [...counts.entries()]
.map(([path, count]) => ({ path, count }))
.sort((a, b) => b.count - a.count)
.slice(0, limit);
}
getFailureRate(): number {
if (this.logs.length === 0) return 0;
return this.logs.filter((l) => !l.zod_valid).length / this.logs.length;
}
getSummary(): string {
return [
`Total requests: ${this.logs.length}`,
`Failure rate: ${(this.getFailureRate() * 100).toFixed(1)}%`,
`JSON parse failures: ${this.logs.filter((l) => !l.json_parseable).length}`,
`Schema validation failures: ${this.logs.filter((l) => l.json_parseable && !l.zod_valid).length}`,
`Recovery attempts: ${this.logs.filter((l) => l.recovery_attempted).length}`,
`Recovery successes: ${this.logs.filter((l) => l.recovery_successful).length}`,
`Top failing fields: ${JSON.stringify(this.getTopFailingFields(5))}`,
].join('\n');
}
}
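The aggregator above does not cover query 5 (failure rate over time). A minimal sketch of that grouping in plain TypeScript follows. `LogLike` and `dailyFailureRates` are hypothetical names, and the code assumes timestamps are ISO 8601 strings in a single time zone so that the first ten characters identify the day.

```typescript
// Only the two fields needed for this aggregation.
interface LogLike {
  timestamp: string; // assumed ISO 8601, e.g. '2024-05-01T12:00:00Z'
  zod_valid: boolean;
}

function dailyFailureRates(
  logs: ReadonlyArray<LogLike>,
): Array<{ day: string; failRate: number }> {
  const byDay = new Map<string, { total: number; failed: number }>();

  for (const log of logs) {
    const day = log.timestamp.slice(0, 10); // 'YYYY-MM-DD' prefix
    const bucket = byDay.get(day) ?? { total: 0, failed: 0 };
    bucket.total += 1;
    if (!log.zod_valid) bucket.failed += 1;
    byDay.set(day, bucket);
  }

  // Sort chronologically; ISO dates sort correctly as plain strings.
  return Array.from(byDay.entries())
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([day, { total, failed }]) => ({ day, failRate: failed / total }));
}
```

A rising failRate across consecutive days is the signal to look for; comparing it before and after a prompt or model change shows whether the change helped.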
7. Handling Truncated JSON
When the AI response hits max_tokens, the JSON may be cut off mid-stream:
{"summary": "This is a long analysis of the document that covers multiple topics including
Detection and repair strategies
function repairTruncatedJSON(text: string): unknown {
// Try normal parse first
try {
return JSON.parse(text);
} catch {
// Continue to repair
}
// Extract JSON portion
let json = text;
const startIdx = json.indexOf('{');
if (startIdx > 0) {
json = json.substring(startIdx);
}
// Track unclosed braces/brackets in the order they were opened,
// so they can be closed innermost-first (e.g. {"a": [{"b": 1 needs }]} and not ]}})
const closers: string[] = [];
let inString = false;
let escaped = false;
for (const char of json) {
if (escaped) { escaped = false; continue; }
if (char === '\\') { escaped = true; continue; }
if (char === '"') { inString = !inString; continue; }
if (inString) continue;
if (char === '{') closers.push('}');
if (char === '[') closers.push(']');
if (char === '}' || char === ']') closers.pop();
}
// If we're inside a string, close it (dropping a dangling backslash first, which would escape the quote)
if (inString) {
if (escaped) json = json.slice(0, -1);
json += '"';
}
// Close open brackets and braces, innermost first
json += closers.reverse().join('');
try {
return JSON.parse(json);
} catch {
throw new Error('Could not repair truncated JSON');
}
}
// Example: truncated response
const truncated = '{"summary": "The market shows strong growth in Q4 with';
try {
const repaired = repairTruncatedJSON(truncated);
console.log(repaired);
// { summary: 'The market shows strong growth in Q4 with' }
} catch {
console.error('Repair failed');
}
Warning: Repaired JSON may have truncated values. Always validate with Zod after repair, and flag the result as potentially incomplete.
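Truncation does not have to be inferred from the JSON alone. Many APIs report why generation stopped: OpenAI's chat completions use finish_reason with the value "length", and Anthropic's Messages API uses stop_reason with the value "max_tokens". The sketch below assumes you have copied that field from the provider response into a small metadata object; `CompletionMeta`, `hitTokenLimit`, and `tagCompleteness` are hypothetical names, and you should confirm the field names against your SDK's documentation.

```typescript
// Hypothetical metadata object filled from the provider's response.
interface CompletionMeta {
  finish_reason?: string | null; // OpenAI-style chat completions
  stop_reason?: string | null;   // Anthropic-style messages
}

function hitTokenLimit(meta: CompletionMeta): boolean {
  return meta.finish_reason === 'length' || meta.stop_reason === 'max_tokens';
}

// Wraps validated data with a flag so callers can treat it as incomplete.
function tagCompleteness<T>(
  data: T,
  meta: CompletionMeta,
): { data: T; possiblyIncomplete: boolean } {
  return { data, possiblyIncomplete: hitTokenLimit(meta) };
}

hitTokenLimit({ finish_reason: 'length' });   // true
hitTokenLimit({ finish_reason: 'stop' });     // false
hitTokenLimit({ stop_reason: 'max_tokens' }); // true
```

Checking this flag before attempting repair also avoids "repairing" responses that are malformed for other reasons and were never truncated.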
8. Putting It All Together: The Defense-in-Depth Approach
import { z } from 'zod';
const AnalysisSchema = z.object({
sentiment: z.enum(['positive', 'negative', 'neutral']),
confidence: z.number().min(0).max(1),
summary: z.string().min(10),
topics: z.array(z.string()).min(1),
});
type Analysis = z.infer<typeof AnalysisSchema>;
interface DefenseResult {
data: Analysis | null;
stage_reached: 'direct_parse' | 'extraction' | 'coercion' | 'repair' | 'failed';
warnings: string[];
}
function defendedValidation(rawContent: string): DefenseResult {
const warnings: string[] = [];
// Layer 1: Direct JSON parse + strict validation
try {
const direct = JSON.parse(rawContent);
const result = AnalysisSchema.safeParse(direct);
if (result.success) {
return { data: result.data, stage_reached: 'direct_parse', warnings };
}
} catch {
warnings.push('Direct JSON parse failed, attempting extraction');
}
// Layer 2: Extract JSON from text
let extracted: unknown;
try {
extracted = extractJSON(rawContent);
const result = AnalysisSchema.safeParse(extracted);
if (result.success) {
warnings.push('JSON was extracted from surrounding text');
return { data: result.data, stage_reached: 'extraction', warnings };
}
} catch {
warnings.push('JSON extraction failed');
}
// Layer 3: Type coercion
if (extracted) {
const CoercedSchema = z.object({
sentiment: z.string().toLowerCase().pipe(
z.enum(['positive', 'negative', 'neutral'])
),
confidence: z.coerce.number().transform((n) => n > 1 ? n / 100 : n)
.pipe(z.number().min(0).max(1)),
summary: z.string().min(1), // not z.coerce.string(): String(undefined) would turn a missing summary into "undefined"
topics: z.union([
z.array(z.string()),
z.string().transform((s) => s.split(',').map((t) => t.trim())),
]).pipe(z.array(z.string()).min(1)),
});
const coerced = CoercedSchema.safeParse(extracted);
if (coerced.success) {
warnings.push('Data required type coercion');
return {
data: coerced.data as Analysis,
stage_reached: 'coercion',
warnings,
};
}
}
// Layer 4: Truncation repair
try {
const repaired = repairTruncatedJSON(rawContent);
const result = AnalysisSchema.safeParse(repaired);
if (result.success) {
warnings.push('JSON appeared truncated and was repaired');
return { data: result.data, stage_reached: 'repair', warnings };
}
} catch {
warnings.push('Truncation repair failed');
}
// All layers failed
return { data: null, stage_reached: 'failed', warnings };
}
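The four layers above share one shape: try something, return on success, record a warning on failure. If the number of layers grows, that shape can be factored into a small generic runner. The following is a sketch under that assumption, in plain TypeScript; `Layer`, `LayerResult`, `LayeredOutcome`, and `runLayers` are hypothetical names, and each layer is expected to do its own parsing and validation.

```typescript
type LayerResult<T> = { ok: true; data: T } | { ok: false; reason: string };

interface Layer<T> {
  name: string;
  run: (raw: string) => LayerResult<T>;
}

interface LayeredOutcome<T> {
  data: T | null;
  stage_reached: string; // name of the layer that succeeded, or 'failed'
  warnings: string[];
}

function runLayers<T>(raw: string, layers: ReadonlyArray<Layer<T>>): LayeredOutcome<T> {
  const warnings: string[] = [];
  for (const layer of layers) {
    let result: LayerResult<T>;
    try {
      result = layer.run(raw);
    } catch (err) {
      // A thrown error (e.g. from JSON.parse) counts as a failed layer.
      result = { ok: false, reason: err instanceof Error ? err.message : String(err) };
    }
    if (result.ok) {
      return { data: result.data, stage_reached: layer.name, warnings };
    }
    warnings.push(`${layer.name} failed: ${result.reason}`);
  }
  return { data: null, stage_reached: 'failed', warnings };
}
```

Each layer in defendedValidation would become one entry in the array, in the same order, which keeps the ordering of recovery strategies visible in one place.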
9. Key Takeaways
- AI failures are predictable: extra text around JSON, wrong types, missing fields, and truncation account for most cases. Build handlers for each.
- JSON extraction (removing surrounding text, markdown fences) should be your first recovery step — it's the most common failure mode.
- Type coercion (string "82" to number 82, string "true" to boolean true) handles the second most common failure mode. Use z.coerce or z.preprocess, but watch the edge cases: z.coerce.boolean() turns "false" into true, and z.coerce.number() turns an empty string into 0.
- Default values provide resilience for optional fields, but never default required business-critical fields silently — flag them as degraded.
- Graceful degradation is a ladder: try strict validation first, then lenient, then coercion, then defaults. Each step down should increment a warning counter.
- Log every failure with the full raw response, error details, and recovery outcome. This data drives prompt improvement and model evaluation.
- Truncation repair is a last resort. Repaired JSON may have truncated values that pass type checks but contain incomplete data.
Explain-It Challenge
- An AI returns the raw string below. Write the extraction code that handles it and explain why simple JSON.parse() fails.
Sure! Here's the analysis:\n\n```json\n{"score": 85}\n```\n\nHope that helps!
- Your confidence field keeps getting values like 85, 92, 7 (percentages) instead of 0.85, 0.92, 0.07 (decimals). Design a Zod schema that normalizes both formats and explain the edge case where a value of 1 is ambiguous.
- Your team debates whether to "fail fast" (reject any invalid AI response) or "degrade gracefully" (salvage what you can). List three scenarios where each approach is correct.