Episode 4 — Generative AI Engineering / 4.5 — Generating JSON Responses from LLMs
4.5 — Generating JSON Responses from LLMs: Quick Revision
Compact cheat sheet. Print-friendly.
How to use this material (instructions)
- Skim before labs or interviews.
- Drill gaps — reopen README.md → 4.5.a…4.5.e.
- Practice — 4.5-Exercise-Questions.md.
- Polish answers — 4.5-Interview-Questions.md.
Core vocabulary
| Term | One-liner |
|---|---|
| JSON mode | response_format: { type: "json_object" } — guarantees valid JSON syntax |
| Structured Outputs | response_format: { type: "json_schema" } — guarantees valid JSON + exact schema |
| Schema-based prompting | Embedding JSON structure/types/examples in the prompt to guide model output |
| Function calling | Model returns function name + typed arguments instead of text (a.k.a. tool calling) |
| Tool use | Anthropic's term for function calling |
| tool_calls | Response field containing function name + arguments when model wants to call a tool |
| tool_choice | Controls whether model must/may/cannot call tools (auto, none, required, specific) |
| Assistant prefilling | Starting Claude's response with { to force JSON output |
| Validate-or-retry | Parse → validate → (pass or retry with error feedback) → max retries → fallback |
| Type coercion | Safely converting "30" → 30, "true" → true when the model returns wrong types |
JSON mode — the basics
```typescript
// OpenAI: JSON mode
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' }, // Guarantees valid JSON
  temperature: 0,
  messages: [
    { role: 'system', content: 'Return JSON with "name" and "age".' }, // MUST mention JSON
    { role: 'user', content: 'Alice is 30.' },
  ],
});
const data = JSON.parse(response.choices[0].message.content);
```
JSON mode guarantees: Valid syntax (parseable by JSON.parse())
JSON mode does NOT: Enforce field names, types, or structure
Structured Outputs: Guarantees syntax + exact schema match
Must mention "JSON": Yes (OpenAI requirement when using json_object)
Check finish_reason: "length" = truncated = invalid JSON
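The parse and truncation checks above can be wrapped in one small helper. A minimal sketch; `finishReason` stands in for whatever your SDK reports (e.g. `response.choices[0].finish_reason` in OpenAI's SDK):

```typescript
// Safe parse: returns null instead of throwing, and treats a
// truncated response (finish_reason === "length") as invalid.
function safeParseJson(text: string, finishReason?: string): unknown {
  if (finishReason === 'length') return null; // truncated → never valid JSON
  try {
    return JSON.parse(text);
  } catch {
    return null;
  }
}
```

A `null` result means "retry or fall back", never "crash the request handler".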
Anthropic / Claude approach
```typescript
// Prompt-based: instruct JSON-only output
system: 'Respond with ONLY valid JSON, no other text.'

// Prefill technique: force JSON start
messages: [
  { role: 'user', content: 'Extract data...' },
  { role: 'assistant', content: '{' }, // Prefill
]
// Then prepend '{' to the response: '{' + response.content[0].text
```
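The prepend step can be isolated in a helper. A minimal sketch, assuming a Claude-style response where `claudeText` plays the role of `response.content[0].text` (Claude continues from the prefilled brace, so the brace itself is not in the returned text):

```typescript
// Reconstruct the full JSON object after prefilling the assistant turn with '{'.
function completePrefilledJson(claudeText: string): unknown {
  return JSON.parse('{' + claudeText);
}

// Example: the model's continuation of the prefilled '{'.
const claudeText = '"name": "Alice", "age": 30}';
const data = completePrefilledJson(claudeText) as { name: string; age: number };
```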
Schema prompting strategies
1. EXACT STRUCTURE: { "name": "string", "age": number }
2. CONCRETE EXAMPLE: Input: "Alice, 30" → Output: { "name": "Alice", "age": 30 }
3. TYPESCRIPT TYPES: type User = { name: string; age: number; interests: string[] }
4. FIELD DESCRIPTIONS: "age" (integer): The person's age as a number, not a string
5. COMBINED: TypeScript type + example + rules = maximum reliability
BEST PRACTICE:
System message = schema + rules (fixed, reusable)
User message = data to process (variable per request)
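Put together, a reusable system prompt might look like the sketch below. The wording and the `User` type are illustrative, not from any SDK:

```typescript
// Fixed system prompt: TypeScript type + concrete example + rules (strategy 5).
const SYSTEM_PROMPT = [
  'You are an extraction engine. Respond with ONLY valid JSON.',
  'Output must match this TypeScript type:',
  'type User = { name: string; age: number; interests: string[] }',
  'Example: Input: "Alice, 30, hiking" → Output: {"name":"Alice","age":30,"interests":["hiking"]}',
  'Rules: "age" is an integer, not a string. Use null for unknown fields, never omit them.',
].join('\n');

// Per-request user message carries only the variable data:
const messages = [
  { role: 'system', content: SYSTEM_PROMPT },
  { role: 'user', content: 'Bob is 42 and likes chess.' },
];
```

Keeping the schema in the system message means it can be cached and reused across every request.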
Function calling vs JSON mode
| | JSON mode | Function calling |
|---|---|---|
| Returns | message.content | message.tool_calls |
| Purpose | return data | trigger an action |
| Schema | prompt-guided | defined in tool parameters |
| Types | not enforced | enforced by schema |
| Follow-up | none | send function result back |
```typescript
// Function calling flow
tools: [{ type: 'function', function: { name, description, parameters } }]
tool_choice: 'auto' | 'none' | 'required' | { type: 'function', function: { name } }

// Response
message.tool_calls[0].function.name      // Function name (string)
message.tool_calls[0].function.arguments // JSON string — must JSON.parse()
message.tool_calls[0].id                 // ID for sending result back
```
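The response fields above can be exercised against a mocked response object (no API call here; the shape follows OpenAI's chat completions format, and `get_weather` is a made-up tool):

```typescript
// Mocked OpenAI-style message: note that arguments arrive as a JSON *string*.
const message = {
  tool_calls: [{
    id: 'call_123',
    type: 'function',
    function: { name: 'get_weather', arguments: '{"city":"Paris","unit":"celsius"}' },
  }],
};

const call = message.tool_calls[0];
const args = JSON.parse(call.function.arguments); // parse: it is a string, not an object

// After executing the function, send the result back with role "tool"
// and the matching tool_call_id so the model can finish its answer:
const toolResult = { role: 'tool', tool_call_id: call.id, content: '{"temp": 18}' };
```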
Validation layers
Layer 1: SAFE PARSE
try { JSON.parse(text) } catch { handle error }
Check finish_reason !== "length" (truncation)
Layer 2: REQUIRED FIELDS
Check all expected keys exist in parsed object
Layer 3: TYPE CHECK
typeof data.age === 'number'
Array.isArray(data.items)
typeof data.name === 'string'
Layer 4: RANGE / CONSTRAINT
0 <= score <= 100
array.length >= 2 && array.length <= 5
value in ['high', 'medium', 'low']
Layer 5: CLEAN
Round floats to integers where needed
Trim arrays to max length
Strip extra fields
Apply defaults for missing optional fields
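The five layers can be combined into one function. A sketch against a hypothetical `{ score, tags }` schema; names and limits are illustrative:

```typescript
type Clean = { score: number; tags: string[] };

function validateAndClean(text: string, finishReason?: string): Clean | null {
  // Layer 1: safe parse + truncation check
  if (finishReason === 'length') return null;
  let data: any;
  try { data = JSON.parse(text); } catch { return null; }
  // Layer 2: required fields
  if (!('score' in data) || !('tags' in data)) return null;
  // Layer 3: type check
  if (typeof data.score !== 'number' || !Array.isArray(data.tags)) return null;
  // Layer 4: range / constraint
  if (data.score < 0 || data.score > 100) return null;
  // Layer 5: clean — round, trim to max length, strip extra fields
  return {
    score: Math.round(data.score),
    tags: data.tags.slice(0, 5).map(String),
  };
}
```

Returning a fresh object in layer 5 doubles as the allowlist: unexpected fields never survive.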
Validate-or-retry pattern
```
for (attempt = 1..3):
  response = callAPI(messages)
  parsed = JSON.parse(response)
  valid = validate(parsed)
  if valid: return clean(parsed)
  else: messages.push(errorFeedback)  ← FEED ERRORS BACK
if all fail: return defaultResponse
```
KEY: Don't just retry blindly — tell the model WHAT was wrong
"Your JSON had errors: score must be 0-100, got 150. Please fix."
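A runnable sketch of the loop, with `callAPI` stubbed to return an out-of-range score once and a valid one on retry; in real use it would wrap your model call:

```typescript
type Msg = { role: string; content: string };

// Stubbed model: fails validation on attempt 1, succeeds on attempt 2.
let attemptCount = 0;
async function callAPI(_messages: Msg[]): Promise<string> {
  attemptCount++;
  return attemptCount === 1 ? '{"score": 150}' : '{"score": 90}';
}

// Returns a specific error message, or null when the data is valid.
function validate(data: any): string | null {
  if (typeof data?.score !== 'number') return 'score must be a number';
  if (data.score < 0 || data.score > 100) return `score must be 0-100, got ${data.score}`;
  return null;
}

async function extractWithRetry(messages: Msg[], maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const text = await callAPI(messages);
    let parsed: any = null;
    try { parsed = JSON.parse(text); } catch { /* fall through to retry */ }
    const error = parsed === null ? 'response was not valid JSON' : validate(parsed);
    if (error === null) return parsed;
    // KEY STEP: feed the specific error back instead of retrying blindly.
    messages = [...messages,
      { role: 'assistant', content: text },
      { role: 'user', content: `Your JSON had errors: ${error}. Please fix.` },
    ];
  }
  return { score: 50 }; // fallback default after max retries
}
```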
Type coercion helpers
// Safe coercion (when rejection is too strict)
"30" → 30 (string to number)
"true" → true (string to boolean)
85.7 → 86 (float to integer via Math.round)
150 → 100 (clamp to valid range)
"HIGH" → "high" (normalize enum)
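These coercions can be written as small helpers; names are illustrative:

```typescript
// Lenient coercion: use when hard rejection is stricter than you need.
function coerceNumber(v: unknown): number | null {
  if (typeof v === 'number') return v;
  if (typeof v === 'string' && v.trim() !== '' && !isNaN(Number(v))) return Number(v);
  return null; // not coercible
}

function coerceBoolean(v: unknown): boolean | null {
  if (typeof v === 'boolean') return v;
  if (v === 'true') return true;
  if (v === 'false') return false;
  return null;
}

// Clamp out-of-range numbers instead of rejecting the whole response.
const clamp = (n: number, lo: number, hi: number) => Math.min(hi, Math.max(lo, n));

// Normalize case before matching against an enum allowlist.
const normalizeEnum = (v: string, allowed: string[]): string | null => {
  const lower = v.trim().toLowerCase();
  return allowed.includes(lower) ? lower : null;
};
```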
The compatibility analysis pattern
System prompt:
1. Role: "You are a compatibility analyzer"
2. Schema: { compatibility_score, strengths, weaknesses, suggested_openers }
3. Rules: score 0-100, strengths 2-5 items, be specific not generic
4. Limits: "Return ONLY JSON, do not invent details"
API call:
model: 'gpt-4o'
temperature: 0
max_tokens: 1024
response_format: { type: 'json_object' }
Validation:
score: integer, 0-100
strengths: array of strings, 2-5 items
weaknesses: array of strings, 0-4 items
suggested_openers: array of strings, 2-3 items
Cost: ~$0.004 per analysis (~750-950 tokens)
Provider comparison
| Feature | OpenAI | Anthropic | Google Gemini |
|---|---|---|---|
| JSON mode | json_object | Prompt + prefill | response_mime_type |
| Structured Outputs | json_schema | Not available | Available |
| Function calling | tools + tool_choice | tools + tool_choice | tools |
| Arguments format | JSON string (parse it) | Parsed object | JSON string |
| Force specific tool | tool_choice: { function } | tool_choice: { tool } | Supported |
Decision guide
Need to EXECUTE actions? → Function calling
Need STRICT schema? → Structured Outputs (or forced tool_choice)
Need valid JSON, flexible? → JSON mode + schema prompt
Need multi-provider support? → JSON mode + prompt + validation (works everywhere)
Simple prototype? → JSON mode (lowest setup)
Validation by risk level
LOW (UI suggestions): Parse + basic type check + default on failure
MED (database writes): Parse + full schema + defaults + 1 retry
HIGH (financial/medical): Parse + strict schema + range check + 3 retries + human fallback
Cost estimation
GPT-4o:
Input: $2.50 / 1M tokens
Output: $10.00 / 1M tokens
Typical structured extraction:
System prompt: ~300-500 tokens (fixed)
User message: ~100-300 tokens (variable)
Model output: ~200-400 tokens
Total: ~600-1200 tokens
Cost: ~$0.003-0.006 per call
10K calls/day: ~$30-60/day
100K calls/day: ~$300-600/day (before caching)
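The arithmetic above as a tiny calculator, with the GPT-4o prices listed hard-coded (adjust per model and check current pricing):

```typescript
// Cost per call at $2.50 / 1M input tokens and $10.00 / 1M output tokens.
function costPerCall(inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1_000_000) * 2.5 + (outputTokens / 1_000_000) * 10.0;
}

// Typical extraction: ~700 input tokens (system + user), ~300 output.
const perCall = costPerCall(700, 300); // ≈ $0.00475
const perDay10k = perCall * 10_000;    // ≈ $47.50/day at 10K calls
```

Output tokens dominate at 4x the input price, which is one more reason to keep schemas terse.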
Common gotchas
| Gotcha | Fix |
|---|---|
| Forgot "JSON" in prompt (OpenAI) | API returns error — mention JSON in system or user message |
| finish_reason: "length" | Increase max_tokens — truncated JSON is invalid |
| Model returns wrong field names | Use TypeScript types or exact template in prompt |
| arguments is a string not object | JSON.parse(toolCall.function.arguments) |
| Retry loop doesn't improve | Feed specific validation errors back to the model |
| Score clusters around 50-70 | Add score range guide in prompt (0-20, 21-40, etc.) |
| Claude wraps JSON in text | Use prefill technique: { role: 'assistant', content: '{' } |
| Extra fields in response | Strip with allowlist of expected fields |
| null vs missing field | Instruct model: "use null for missing, never omit fields" |
End of 4.5 quick revision.