Episode 4 — Generative AI Engineering / 4.5 — Generating JSON Responses from LLMs
# 4.5.a — JSON Mode
In one sentence: JSON mode tells the LLM to output only valid JSON (no markdown, no explanation, no "here's the JSON:" prefix) by setting `response_format: { type: "json_object" }`, which guarantees syntactically valid JSON but does not guarantee that the structure matches your schema.
Navigation: ← 4.5 Overview · 4.5.b — Schema-Based Prompting →
## 1. The Problem: LLMs Love to Talk
By default, when you ask an LLM for JSON, you get something like this:
````text
Sure! Here's the JSON you requested:

```json
{
  "name": "Alice",
  "age": 30
}
```

Hope that helps! Let me know if you need anything else.
````
That response contains valid JSON — buried inside markdown code fences and wrapped in conversational text. Your `JSON.parse()` call will fail because the response isn't **pure JSON**. You could try to extract the JSON with regex, but that's fragile and error-prone.
**JSON mode solves this.** When enabled, the model's output is guaranteed to be a valid JSON string — nothing before it, nothing after it, no markdown formatting.
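To make the failure concrete, here is a minimal sketch that feeds a chatty reply and a pure-JSON reply to `JSON.parse()`. The `tryParse` helper and both sample strings are illustrative assumptions, not part of any SDK.

```javascript
// Hypothetical helper: returns the parsed value, or null when the text is not pure JSON.
// (A literal JSON `null` would also return null; that is acceptable for this demo.)
function tryParse(text) {
  try {
    return JSON.parse(text);
  } catch {
    return null;
  }
}

// A chatty reply: a valid JSON object surrounded by conversational text.
const chatty =
  "Sure! Here's the JSON you requested:\n" +
  '{ "name": "Alice", "age": 30 }\n' +
  'Hope that helps!';

// The same payload as pure JSON, which is what JSON mode is meant to produce.
const pure = '{ "name": "Alice", "age": 30 }';

const chattyResult = tryParse(chatty); // null: the wrapper text breaks parsing
const pureResult = tryParse(pure);     // parsed object with name and age
```

The payload is identical in both cases; only the surrounding text decides whether parsing succeeds.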
---
## 2. How JSON Mode Works (OpenAI)
OpenAI introduced JSON mode via the `response_format` parameter:
```javascript
import OpenAI from 'openai';

const openai = new OpenAI();

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },
  messages: [
    {
      role: 'system',
      content: 'You are a helpful assistant. Respond in JSON format with a "name" and "age" field.'
    },
    {
      role: 'user',
      content: 'Tell me about Alice who is 30 years old.'
    }
  ],
});

const data = JSON.parse(response.choices[0].message.content);
console.log(data);
// { name: "Alice", age: 30 }
```
### What `response_format: { type: "json_object" }` does

- Constrains the model to output only valid JSON tokens, so it cannot produce text that would break `JSON.parse()`.
- Eliminates wrapper text: no "Sure, here's the JSON:" preamble.
- Guarantees valid syntax (balanced braces, proper quoting, correct comma placement), provided the output is not cut off by the token limit.
- Does NOT enforce structure: the model might return `{ "user": "Alice", "years": 30 }` instead of `{ "name": "Alice", "age": 30 }` unless you tell it the schema.
### Critical requirement: You MUST mention JSON in your prompt
```javascript
// This will ERROR: no message mentions JSON
const bad = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },
  messages: [
    { role: 'user', content: 'Tell me about Paris.' } // No mention of JSON!
  ],
});
// OpenAI rejects the request with an error stating that the messages
// must contain the word "json" when JSON mode is enabled.
```
Fix: Always instruct the model to respond in JSON somewhere in the system or user message.
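A small pre-flight check can catch this mistake before the request leaves your code. The sketch below is hypothetical: `mentionsJson` and `assertJsonModeReady` are illustrative names, and the case-insensitive substring match is an assumption about what satisfies the API, not a documented rule.

```javascript
// Hypothetical check: does any string message mention "json" (case-insensitive)?
function mentionsJson(messages) {
  return messages.some(
    (m) => typeof m.content === 'string' && /json/i.test(m.content)
  );
}

// Throws early instead of waiting for the API to reject the request.
// Returns the request unchanged when it looks safe to send.
function assertJsonModeReady(request) {
  const usesJsonMode = request.response_format?.type === 'json_object';
  if (usesJsonMode && !mentionsJson(request.messages)) {
    throw new Error('JSON mode is enabled but no message mentions "json".');
  }
  return request;
}

// Example request that passes the check.
const checked = assertJsonModeReady({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },
  messages: [
    { role: 'system', content: 'You are a travel guide. Always respond in JSON format.' },
    { role: 'user', content: 'Tell me about Paris.' },
  ],
});
```

Wrapping the request object like this keeps the check next to the call site, so a missing instruction fails in development rather than in production.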
```javascript
// This works
const good = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },
  messages: [
    {
      role: 'system',
      content: 'You are a travel guide. Always respond in JSON format.'
    },
    { role: 'user', content: 'Tell me about Paris.' }
  ],
});
```
## 3. How Anthropic Handles JSON Output

Anthropic (Claude) does not have an identical `response_format` parameter. Instead, Claude offers several approaches:

### Approach 1: Prompt-based JSON instruction
```javascript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  system: 'You are a data extraction assistant. Always respond with ONLY valid JSON, no other text.',
  messages: [
    {
      role: 'user',
      content: 'Extract the name and age from: "Alice is 30 years old." Return JSON with "name" and "age" fields.'
    }
  ],
});

const data = JSON.parse(response.content[0].text);
console.log(data);
// { name: "Alice", age: 30 }
```
### Approach 2: Prefilling the assistant response

Claude supports prefilling, where you start the assistant's response yourself to steer it toward JSON output:
```javascript
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  system: 'You are a data extraction assistant.',
  messages: [
    {
      role: 'user',
      content: 'Extract the name and age from: "Alice is 30 years old." Return as JSON.'
    },
    {
      role: 'assistant',
      content: '{' // Prefill forces JSON output starting with {
    }
  ],
});

// Note: the response continues from where the prefill left off
const jsonString = '{' + response.content[0].text;
const data = JSON.parse(jsonString);
console.log(data);
// { name: "Alice", age: 30 }
```
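How prefilling works: By placing an opening brace `{` in the assistant turn, you tell Claude "your response has already started with `{`, so continue from there." Claude then tends to keep generating a JSON object because that is the natural completion. This is a strong nudge rather than a guarantee: the continuation can still be malformed or carry trailing text, so parse it defensively. Prefilling is a powerful technique that Anthropic's Messages API supports directly.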
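Because the API returns only the continuation, the prefill has to be glued back on before parsing. The helper below is a minimal sketch of that step; `parsePrefilled` is a hypothetical name, and the sample continuation string stands in for `response.content[0].text`.

```javascript
// Hypothetical helper: re-attach the prefill to Claude's continuation and parse it.
function parsePrefilled(prefill, continuation) {
  const candidate = prefill + continuation;
  try {
    return JSON.parse(candidate);
  } catch (err) {
    // Prefill makes JSON likely, not certain, so include the raw text for debugging.
    throw new Error(`Prefilled response was not valid JSON: ${candidate}`, { cause: err });
  }
}

// The continuation omits the opening brace because the prefill already supplied it.
const person = parsePrefilled('{', '"name": "Alice", "age": 30}');
```

Keeping the prefill string in one place (rather than repeating `'{'` in the request and again in the parsing code) avoids the two drifting apart.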
### Approach 3: Tool use for structured output

Claude also supports tool use (function calling), which provides schema-enforced structured output. We'll cover this in 4.5.c.
## 4. JSON Mode vs Free-Form with Parsing

Before JSON mode existed, developers had to extract JSON from free-form responses:

### The old way (fragile)
```javascript
// Ask the model normally
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    {
      role: 'system',
      content: 'Return your answer as a JSON object with "name" and "age" fields.'
    },
    { role: 'user', content: 'Info about Alice, age 30.' }
  ],
  // No response_format: free-form output
});

const raw = response.choices[0].message.content;
// raw might be: '```json\n{"name": "Alice", "age": 30}\n```'
// or: 'Here is the JSON:\n{"name": "Alice", "age": 30}'
// or: '{"name": "Alice", "age": 30}'

// Fragile extraction
function extractJSON(text) {
  // Try direct parse first
  try {
    return JSON.parse(text);
  } catch (e) {
    // Try to find JSON in code fence
    const match = text.match(/```(?:json)?\s*([\s\S]*?)```/);
    if (match) {
      return JSON.parse(match[1].trim());
    }
    // Try to find JSON object
    const objMatch = text.match(/\{[\s\S]*\}/);
    if (objMatch) {
      return JSON.parse(objMatch[0]);
    }
    throw new Error('Could not extract JSON from response');
  }
}

const data = extractJSON(raw);
```
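To see why this counts as fragile, consider the last-resort pattern in `extractJSON`. The sketch below reuses that same regular expression on an invented reply whose trailing prose happens to contain braces; the sample text is an assumption made up for illustration.

```javascript
// The same greedy pattern used as the final fallback in extractJSON.
const objectPattern = /\{[\s\S]*\}/;

// One JSON object followed by prose that also contains braces.
const reply = 'Here you go: {"name": "Alice", "age": 30}. Use {placeholders} as needed.';

// The greedy match runs from the first "{" to the LAST "}", swallowing the prose.
const matched = reply.match(objectPattern)[0];

let parseFailed = false;
try {
  JSON.parse(matched);
} catch {
  parseFailed = true; // the captured span is no longer a single JSON value
}
```

The reply contains a perfectly good object, yet the extraction fails because the pattern cannot tell JSON braces from braces in ordinary text.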
### The new way (JSON mode)
```javascript
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },
  messages: [
    {
      role: 'system',
      content: 'Return a JSON object with "name" and "age" fields.'
    },
    { role: 'user', content: 'Info about Alice, age 30.' }
  ],
});

// Guaranteed to be valid JSON, so no extraction is needed
const data = JSON.parse(response.choices[0].message.content);
```
### Comparison
| Aspect | Free-form + Parsing | JSON Mode |
|---|---|---|
| Valid JSON guaranteed? | No — model might wrap in markdown, add commentary | Yes — always valid syntax |
| Extra parsing code? | Yes — regex extraction, multiple fallbacks | No — direct JSON.parse() |
| Failure rate | 5-15% of responses need extraction | <0.1% syntax errors |
| Schema enforcement? | No | No (just syntax) |
| Works with all models? | Yes — any LLM | Only models that support it |
| Token efficiency | Worse — model wastes tokens on wrapper text | Better — pure JSON only |
## 5. JSON Mode vs Structured Outputs

OpenAI also offers Structured Outputs, a stricter version of JSON mode that enforces a specific JSON Schema:
```javascript
// JSON Mode: guarantees valid JSON, but not specific fields
const jsonMode = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },
  messages: [
    { role: 'system', content: 'Respond in JSON with name and age.' },
    { role: 'user', content: 'Alice is 30.' }
  ],
});
// Could return { "name": "Alice", "age": 30 }
// Could also return { "person": "Alice", "years_old": 30 } (valid JSON, wrong schema!)

// Structured Outputs: guarantees a specific JSON Schema
const structured = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'person_info',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          name: { type: 'string', description: "The person's name" },
          age: { type: 'integer', description: "The person's age" }
        },
        required: ['name', 'age'],
        additionalProperties: false
      }
    }
  },
  messages: [
    { role: 'system', content: 'Extract person information.' },
    { role: 'user', content: 'Alice is 30.' }
  ],
});
// GUARANTEED to return exactly { "name": "...", "age": ... }
// with correct types and no extra fields
```
### When to use which
| Feature | JSON Mode | Structured Outputs |
|---|---|---|
| Valid JSON | Yes | Yes |
| Schema enforcement | No — model chooses keys/structure | Yes — exact schema match |
| Type enforcement | No — age could be "30" or 30 | Yes — integer means integer |
| Required fields | No guarantee | Guaranteed |
| No extra fields | No guarantee | Guaranteed (with additionalProperties: false) |
| Flexibility | High — model decides structure | Low — locked to schema |
| Use case | Exploratory, simple tasks | Production pipelines, strict contracts |
| Prompt must mention JSON? | Yes | No (schema is sufficient) |
### Decision guide

```text
Do you need EXACT field names and types?
├── YES → Use Structured Outputs (json_schema)
└── NO
    ├── Do you need valid JSON syntax?
    │   ├── YES → Use JSON Mode (json_object)
    │   └── NO  → Use free-form with parsing
    └── Are you using function/tool calling?
        └── YES → Use tools (covered in 4.5.c)
```
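One way to read the decision guide is as a small function. The sketch below is hypothetical: `chooseOutputStrategy` and its return labels are illustrative names, and checking tool calling before plain JSON mode is an assumption, since the guide lists those two questions side by side without fixing their order.

```javascript
// Hypothetical helper encoding the decision guide above.
// Assumption: tool calling is checked before plain JSON mode.
function chooseOutputStrategy({ needsExactSchema, usesTools, needsValidJson }) {
  if (needsExactSchema) return 'structured_outputs'; // response_format type "json_schema"
  if (usesTools) return 'tools';                     // covered in 4.5.c
  if (needsValidJson) return 'json_mode';            // response_format type "json_object"
  return 'free_form';                                // parse the text yourself
}

// Example: a quick prototype that only needs parseable output.
const strategy = chooseOutputStrategy({
  needsExactSchema: false,
  usesTools: false,
  needsValidJson: true,
});
```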
## 6. Practical Example: Profile Data Extraction

Let's build a practical example that extracts structured user profile data using JSON mode:
```javascript
import OpenAI from 'openai';

const openai = new OpenAI();

async function extractProfile(bioText) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0, // Low randomness for more consistent (not strictly deterministic) extraction
    response_format: { type: 'json_object' },
    messages: [
      {
        role: 'system',
        content: `You are a profile data extractor. Given a user's bio text, extract their information into JSON format.
Return a JSON object with these fields:
- "name" (string): the person's name
- "age" (number): their age
- "interests" (array of strings): their hobbies and interests
- "location" (string or null): where they live, null if not mentioned
- "occupation" (string or null): their job, null if not mentioned`
      },
      {
        role: 'user',
        content: bioText
      }
    ],
  });

  return JSON.parse(response.choices[0].message.content);
}

// Usage
const profile = await extractProfile(
  "Hi! I'm Jordan, 28, living in Austin. I'm a software developer who " +
  "loves hiking, cooking, and playing guitar on weekends."
);
console.log(profile);
// {
//   name: "Jordan",
//   age: 28,
//   interests: ["hiking", "cooking", "playing guitar"],
//   location: "Austin",
//   occupation: "software developer"
// }
```
### Same example with Anthropic (Claude)
```javascript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

async function extractProfileClaude(bioText) {
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    temperature: 0,
    system: `You are a profile data extractor. Given a user's bio text, extract their information.
Return ONLY a valid JSON object with these fields:
- "name" (string): the person's name
- "age" (number): their age
- "interests" (array of strings): their hobbies and interests
- "location" (string or null): where they live, null if not mentioned
- "occupation" (string or null): their job, null if not mentioned
Do not include any text outside the JSON object.`,
    messages: [
      {
        role: 'user',
        content: bioText
      }
    ],
  });

  return JSON.parse(response.content[0].text);
}
```
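Since neither version enforces the field list in the prompt, it is worth checking the parsed object before using it. The validator below is a minimal hand-written sketch for the five fields described above; `validateProfile` is a hypothetical name, and a schema library could replace it in a real project.

```javascript
// Hypothetical validator for the profile shape described in the prompts above.
// Returns a list of problems; an empty list means the object matches.
function validateProfile(profile) {
  if (profile === null || typeof profile !== 'object' || Array.isArray(profile)) {
    return ['profile must be a JSON object'];
  }
  const problems = [];
  // Missing fields are undefined, which is neither a string nor null.
  const isStringOrNull = (v) => v === null || typeof v === 'string';

  if (typeof profile.name !== 'string') problems.push('name must be a string');
  if (typeof profile.age !== 'number' || !Number.isFinite(profile.age)) {
    problems.push('age must be a number');
  }
  if (
    !Array.isArray(profile.interests) ||
    !profile.interests.every((item) => typeof item === 'string')
  ) {
    problems.push('interests must be an array of strings');
  }
  if (!isStringOrNull(profile.location)) problems.push('location must be a string or null');
  if (!isStringOrNull(profile.occupation)) problems.push('occupation must be a string or null');
  return problems;
}

// A response with a stringified age and a missing field is valid JSON but fails validation.
const issues = validateProfile({
  name: 'Jordan',
  age: '28',
  interests: ['hiking'],
  location: 'Austin',
});
```

Returning a list of problems instead of throwing on the first one makes it easier to log everything that went wrong with a single response.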
## 7. Common Pitfalls

### Pitfall 1: Forgetting to mention JSON in the prompt
```javascript
// WRONG: will error with OpenAI
await openai.chat.completions.create({
  response_format: { type: 'json_object' },
  messages: [{ role: 'user', content: 'Tell me about dogs.' }],
  // ...
});
// Error: the messages must contain the word "json" when JSON mode is enabled
```
### Pitfall 2: Assuming JSON mode enforces your schema
```javascript
// JSON mode guarantees valid JSON, but NOT the right keys
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },
  messages: [
    {
      role: 'system',
      content: 'Return JSON with "first_name" and "last_name".'
    },
    { role: 'user', content: 'Alice Smith' }
  ],
});

const data = JSON.parse(response.choices[0].message.content);
// MIGHT return: { "first_name": "Alice", "last_name": "Smith" }
// MIGHT return: { "name": "Alice Smith" } (valid JSON, wrong keys!)
// MIGHT return: { "firstName": "Alice", "lastName": "Smith" } (camelCase instead)

// Always validate the structure!
if (!data.first_name || !data.last_name) {
  throw new Error('Response missing required fields');
}
```
### Pitfall 3: Not handling the `finish_reason`
```javascript
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },
  max_tokens: 50, // Too small for the JSON!
  messages: [
    {
      role: 'system',
      content: 'Return JSON with a detailed profile analysis.'
    },
    { role: 'user', content: 'Analyze this profile...' }
  ],
});

// Check finish_reason before parsing!
if (response.choices[0].finish_reason === 'length') {
  // Output was truncated, so the JSON is likely incomplete/invalid
  console.error('Response truncated: increase max_tokens');
} else {
  const data = JSON.parse(response.choices[0].message.content);
}
```
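The same check can be folded into a reusable wrapper so that no call site forgets it. This is a minimal sketch: `parseCompletion` is a hypothetical helper, and the mock object below only imitates the Chat Completions response shape used in this section.

```javascript
// Hypothetical wrapper: refuse to parse a truncated completion.
// `response` is assumed to have the Chat Completions shape used above.
function parseCompletion(response) {
  const choice = response.choices[0];
  if (choice.finish_reason === 'length') {
    throw new Error('Response truncated (finish_reason "length"): increase max_tokens');
  }
  return JSON.parse(choice.message.content);
}

// Mock of a complete response, for illustration only.
const completeResponse = {
  choices: [{ finish_reason: 'stop', message: { content: '{"summary": "ok"}' } }],
};
const parsed = parseCompletion(completeResponse);
```

Throwing instead of logging forces the caller to decide what to do with a truncated response, such as retrying with a larger `max_tokens`.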
### Pitfall 4: JSON mode with streaming

When streaming with JSON mode, the JSON is emitted token by token. You cannot parse it until the stream is complete:
```javascript
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },
  messages: [
    { role: 'system', content: 'Return a JSON object with user info.' },
    { role: 'user', content: 'Alice, 30' }
  ],
  stream: true,
});

let fullContent = '';
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content || '';
  fullContent += delta;
  // DON'T try to JSON.parse(fullContent) here: it's incomplete!
}

// Parse only after stream is complete
const data = JSON.parse(fullContent);
```
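The accumulate-then-parse pattern can be isolated from the network call, which also makes it easy to test. The class below is a hypothetical sketch: `JsonStreamBuffer` is an illustrative name, and the array of deltas stands in for the `delta.content` values a real stream would yield.

```javascript
// Hypothetical accumulator for streamed JSON-mode output.
class JsonStreamBuffer {
  constructor() {
    this.parts = [];
  }

  // Call once per chunk with chunk.choices[0]?.delta?.content (may be undefined).
  push(delta) {
    if (delta) this.parts.push(delta);
  }

  // Call only after the stream has ended; throws if the JSON is incomplete.
  finish() {
    return JSON.parse(this.parts.join(''));
  }
}

// Simulated deltas, including an empty chunk like the ones real streams can contain.
const buffer = new JsonStreamBuffer();
for (const delta of ['{"name": "Al', 'ice", ', undefined, '"age": 30}']) {
  buffer.push(delta);
}
const streamed = buffer.finish();
```

Collecting the pieces in an array and joining once at the end also avoids repeated string concatenation on long responses.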
## 8. JSON Mode Across Providers
| Provider | JSON Mode Support | How to Enable |
|---|---|---|
| OpenAI | Native | response_format: { type: "json_object" } |
| Anthropic | Via prompting + prefill | System prompt + assistant prefill with { |
| Google (Gemini) | Native | response_mime_type: "application/json" |
| Mistral | Native | response_format: { type: "json_object" } |
| Ollama / Local | Varies | format: "json" parameter |
| Azure OpenAI | Native | Same as OpenAI |
## 9. Key Takeaways
- JSON mode guarantees syntactically valid JSON output — no wrapper text, no markdown fences, no commentary.
- You MUST mention JSON in the prompt when using OpenAI's JSON mode — the API enforces this.
- JSON mode guarantees valid syntax but not correct structure — your schema might be ignored.
- Structured Outputs (`json_schema`) go further and enforce an exact JSON Schema; use them for strict production contracts.
- Always check `finish_reason`: a truncated response (`"length"`) produces invalid JSON.
- Anthropic uses prompt engineering and assistant prefilling instead of a dedicated JSON mode parameter.
- Always validate after parsing — JSON mode is necessary but not sufficient for production use.
## Explain-It Challenge
- A junior developer says "I enabled JSON mode so my output schema is guaranteed now." What's wrong with this assumption, and what would you add?
- Why does OpenAI require you to mention "JSON" in the prompt when JSON mode is enabled? What would happen if this requirement didn't exist?
- Explain the trade-off between JSON mode (flexible structure) and Structured Outputs (strict schema). When would you choose each?