Episode 4 — Generative AI Engineering / 4.5 — Generating JSON Responses from LLMs

4.5 — Generating JSON Responses from LLMs: Quick Revision

Compact cheat sheet. Print-friendly.

How to use this material (instructions)

  1. Skim before labs or interviews.
  2. Drill gaps — reopen README.md, sections 4.5.a–4.5.e.
  3. Practice with 4.5-Exercise-Questions.md.
  4. Polish answers with 4.5-Interview-Questions.md.

Core vocabulary

| Term | One-liner |
|---|---|
| JSON mode | `response_format: { type: "json_object" }` — guarantees valid JSON syntax |
| Structured Outputs | `response_format: { type: "json_schema" }` — guarantees valid JSON + exact schema |
| Schema-based prompting | Embedding JSON structure/types/examples in the prompt to guide model output |
| Function calling | Model returns function name + typed arguments instead of text (a.k.a. tool calling) |
| Tool use | Anthropic's term for function calling |
| `tool_calls` | Response field containing function name + arguments when the model wants to call a tool |
| `tool_choice` | Controls whether the model must/may/cannot call tools (`auto`, `none`, `required`, specific function) |
| Assistant prefilling | Starting Claude's response with `{` to force JSON output |
| Validate-or-retry | Parse → validate → (pass or retry with error feedback) → max retries → fallback |
| Type coercion | Safely converting `"30"` → `30`, `"true"` → `true` when the model returns wrong types |

JSON mode — the basics

// OpenAI: JSON mode
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  response_format: { type: 'json_object' },  // Guarantees valid JSON
  temperature: 0,
  messages: [
    { role: 'system', content: 'Return JSON with "name" and "age".' },  // MUST mention JSON
    { role: 'user', content: 'Alice is 30.' }
  ],
});
const data = JSON.parse(response.choices[0].message.content);
JSON mode guarantees:     Valid syntax (parseable by JSON.parse())
JSON mode does NOT:       Enforce field names, types, or structure
Structured Outputs:       Guarantees syntax + exact schema match
Must mention "JSON":      Yes (OpenAI requirement when using json_object)
Check finish_reason:      "length" = truncated = invalid JSON
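The parse step plus truncation check can be sketched as a small helper (an illustrative sketch, assuming the OpenAI response shape shown above; `safeParseJson` is a hypothetical name, not part of the SDK):

```javascript
// Parse a chat completion's content, treating truncation as a hard failure.
// `completion` is assumed to have the OpenAI response shape used above.
function safeParseJson(completion) {
  const choice = completion.choices[0];
  if (choice.finish_reason === 'length') {
    return { ok: false, error: 'Truncated output — raise max_tokens' };
  }
  try {
    return { ok: true, data: JSON.parse(choice.message.content) };
  } catch (err) {
    return { ok: false, error: `Invalid JSON: ${err.message}` };
  }
}
```

Returning `{ ok, error }` instead of throwing makes it easy to feed the error message into a retry later.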

Anthropic / Claude approach

// Prompt-based: instruct JSON-only output
system: 'Respond with ONLY valid JSON, no other text.'

// Prefill technique: force JSON start
messages: [
  { role: 'user', content: 'Extract data...' },
  { role: 'assistant', content: '{' }   // Prefill
]
// Then prepend '{' to response: '{' + response.content[0].text
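The re-attach step in one helper (`parsePrefilled` is an illustrative name for this sketch):

```javascript
// Claude continues from the prefilled '{', so its returned text is missing
// the opening brace. Re-attach it before parsing.
function parsePrefilled(responseText) {
  return JSON.parse('{' + responseText);
}
```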

Schema prompting strategies

1. EXACT STRUCTURE:    { "name": "string", "age": number }
2. CONCRETE EXAMPLE:   Input: "Alice, 30" → Output: { "name": "Alice", "age": 30 }
3. TYPESCRIPT TYPES:   type User = { name: string; age: number; interests: string[] }
4. FIELD DESCRIPTIONS: "age" (integer): The person's age as a number, not a string
5. COMBINED:           TypeScript type + example + rules = maximum reliability
BEST PRACTICE:
  System message = schema + rules (fixed, reusable)
  User message   = data to process (variable per request)
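The split above can be sketched as a small message builder (hypothetical helper; the schema and rules text are examples):

```javascript
// Fixed system prompt: schema + rules, reused across every request.
const SYSTEM_PROMPT = [
  'Return ONLY valid JSON matching: { "name": "string", "age": number }.',
  'Rules: "age" must be an integer, never a string.',
].join('\n');

// Variable user message: just the data to process.
function buildMessages(rawText) {
  return [
    { role: 'system', content: SYSTEM_PROMPT },
    { role: 'user', content: rawText },
  ];
}
```

Keeping the system prompt constant also makes it eligible for prompt caching on providers that support it.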

Function calling vs JSON mode

JSON MODE:                         FUNCTION CALLING:
  Returns: message.content           Returns: message.tool_calls
  Purpose: return data               Purpose: trigger an action
  Schema:  prompt-guided             Schema:  defined in tool parameters
  Types:   not enforced              Types:   enforced by schema
  Follow-up: none                    Follow-up: send function result back
// Function calling flow
tools: [{ type: 'function', function: { name, description, parameters } }]
tool_choice: 'auto' | 'none' | 'required' | { type: 'function', function: { name } }

// Response
message.tool_calls[0].function.name       // Function name (string)
message.tool_calls[0].function.arguments  // JSON string — must JSON.parse()
message.tool_calls[0].id                  // ID for sending result back
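Reading a tool call out of a response might look like this (a sketch; `extractToolCall` is an illustrative helper, and the message object in the test is a hand-built mock of the shape above):

```javascript
// Extract name + parsed arguments from the first tool call, if any.
function extractToolCall(message) {
  const call = message.tool_calls && message.tool_calls[0];
  if (!call) return null; // Model answered in plain text instead
  return {
    id: call.id,
    name: call.function.name,
    args: JSON.parse(call.function.arguments), // arguments is a JSON *string*
  };
}
```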

Validation layers

Layer 1: SAFE PARSE
  try { JSON.parse(text) } catch { handle error }
  Check finish_reason !== "length" (truncation)

Layer 2: REQUIRED FIELDS
  Check all expected keys exist in parsed object

Layer 3: TYPE CHECK
  typeof data.age === 'number'
  Array.isArray(data.items)
  typeof data.name === 'string'

Layer 4: RANGE / CONSTRAINT
  0 <= score <= 100
  array.length >= 2 && array.length <= 5
  value in ['high', 'medium', 'low']

Layer 5: CLEAN
  Round floats to integers where needed
  Trim arrays to max length
  Strip extra fields
  Apply defaults for missing optional fields
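Layers 2–5 combined into one validator for a `{ name, age }` payload (an illustrative sketch; the field names, ranges, and cleaning rules are examples):

```javascript
// Validate + clean a parsed { name, age } object.
// Returns { valid, errors, data } so errors can be fed back on retry.
function validateUser(data) {
  const errors = [];
  // Layer 2: required fields
  for (const key of ['name', 'age']) {
    if (!(key in data)) errors.push(`Missing field: ${key}`);
  }
  // Layer 3: types
  if ('name' in data && typeof data.name !== 'string') errors.push('name must be a string');
  if ('age' in data && typeof data.age !== 'number') errors.push('age must be a number');
  // Layer 4: range
  if (typeof data.age === 'number' && (data.age < 0 || data.age > 150)) {
    errors.push(`age out of range: ${data.age}`);
  }
  if (errors.length > 0) return { valid: false, errors, data: null };
  // Layer 5: clean — round floats, trim strings, strip extra fields via allowlist
  return { valid: true, errors: [], data: { name: data.name.trim(), age: Math.round(data.age) } };
}
```

Rebuilding the object in layer 5 doubles as the allowlist: any extra fields the model invented are silently dropped.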

Validate-or-retry pattern

for (attempt = 1..3):
  response = callAPI(messages)
  parsed   = JSON.parse(response)
  valid    = validate(parsed)
  
  if valid:  return clean(parsed)
  else:      messages.push(errorFeedback)   ← FEED ERRORS BACK

if all fail: return defaultResponse
KEY: Don't just retry blindly — tell the model WHAT was wrong
     "Your JSON had errors: score must be 0-100, got 150. Please fix."

Type coercion helpers

// Safe coercion (when rejection is too strict)
"30"30     (string to number)
"true"true   (string to boolean)
85.786     (float to integer via Math.round)
150100    (clamp to valid range)
"HIGH""high" (normalize enum)

The compatibility analysis pattern

System prompt:
  1. Role:    "You are a compatibility analyzer"
  2. Schema:  { compatibility_score, strengths, weaknesses, suggested_openers }
  3. Rules:   score 0-100, strengths 2-5 items, be specific not generic
  4. Limits:  "Return ONLY JSON, do not invent details"

API call:
  model: 'gpt-4o'
  temperature: 0
  max_tokens: 1024
  response_format: { type: 'json_object' }

Validation:
  score: integer, 0-100
  strengths: array of strings, 2-5 items
  weaknesses: array of strings, 0-4 items
  suggested_openers: array of strings, 2-3 items

Cost: ~$0.004 per analysis (~750-950 tokens)

Provider comparison

| Feature | OpenAI | Anthropic | Google Gemini |
|---|---|---|---|
| JSON mode | `json_object` | Prompt + prefill | `response_mime_type` |
| Structured Outputs | `json_schema` | Not available | Available |
| Function calling | `tools` + `tool_choice` | `tools` + `tool_choice` | `tools` |
| Arguments format | JSON string (parse it) | Parsed object | JSON string |
| Force specific tool | `tool_choice: { function }` | `tool_choice: { tool }` | Supported |

Decision guide

Need to EXECUTE actions?      → Function calling
Need STRICT schema?           → Structured Outputs (or forced tool_choice)
Need valid JSON, flexible?    → JSON mode + schema prompt
Need multi-provider support?  → JSON mode + prompt + validation (works everywhere)
Simple prototype?             → JSON mode (lowest setup)

Validation by risk level

LOW  (UI suggestions):     Parse + basic type check + default on failure
MED  (database writes):    Parse + full schema + defaults + 1 retry
HIGH (financial/medical):  Parse + strict schema + range check + 3 retries + human fallback

Cost estimation

GPT-4o:
  Input:  $2.50 / 1M tokens
  Output: $10.00 / 1M tokens

Typical structured extraction:
  System prompt:  ~300-500 tokens  (fixed)
  User message:   ~100-300 tokens  (variable)
  Model output:   ~200-400 tokens
  Total:          ~600-1200 tokens
  Cost:           ~$0.003-0.006 per call

10K calls/day:  ~$30-60/day
100K calls/day: ~$300-600/day (before caching)
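The arithmetic behind these figures, as a tiny calculator (rates are the GPT-4o prices quoted above):

```javascript
// Cost in USD for one call at GPT-4o rates ($2.50 / $10.00 per 1M tokens).
function callCostUSD(inputTokens, outputTokens) {
  return (inputTokens * 2.50 + outputTokens * 10.00) / 1_000_000;
}

// Example: 600 input + 300 output tokens -> $0.0045/call, so 10K calls/day -> $45/day.
```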

Common gotchas

| Gotcha | Fix |
|---|---|
| Forgot "JSON" in prompt (OpenAI) | API returns an error — mention JSON in the system or user message |
| `finish_reason: "length"` | Increase `max_tokens` — truncated JSON is invalid |
| Model returns wrong field names | Use TypeScript types or an exact template in the prompt |
| `arguments` is a string, not an object | `JSON.parse(toolCall.function.arguments)` |
| Retry loop doesn't improve | Feed specific validation errors back to the model |
| Scores cluster around 50-70 | Add a score range guide in the prompt (0-20, 21-40, etc.) |
| Claude wraps JSON in text | Use the prefill technique: `{ role: 'assistant', content: '{' }` |
| Extra fields in response | Strip with an allowlist of expected fields |
| `null` vs missing field | Instruct the model: "use null for missing fields, never omit them" |

End of 4.5 quick revision.