Episode 4 — Generative AI Engineering / 4.4 — Structured Output in AI Systems

4.4.c — Common Applications

In one sentence: Structured output transforms LLMs from conversational tools into production data engines — powering resume parsing, product metadata generation, content moderation, scoring systems, email classification, sentiment analysis, and document extraction with consistent, machine-readable responses.

Navigation: ← 4.4.b — How Structured Responses Help · 4.4.d — Designing Output Schemas →


1. Resume Parsing — Structured Candidate Data

One of the most common real-world applications of structured LLM output is extracting standardized data from resumes, which come in wildly different formats.

The problem

Resume formats in the wild:
  - PDF with two-column layout
  - Word document with tables
  - Plain text email body
  - LinkedIn export
  - Creative design PDF with icons and graphics

Each resume organizes information differently:
  - Some put education first, others put experience first
  - Some list skills in a sidebar, others inline
  - Some include objectives, others don't
  - Date formats: "2020-2023", "Jan 2020 - Present", "2020 to current"

The structured output solution

async function parseResume(resumeText) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: `You are a resume parser. Extract structured data from the given resume text.
Respond with ONLY a JSON object in this exact format:
{
  "candidate": {
    "fullName": "string",
    "email": "string or null",
    "phone": "string or null",
    "location": "string or null",
    "linkedIn": "string or null",
    "portfolio": "string or null"
  },
  "summary": "string — 1-2 sentence professional summary",
  "experience": [
    {
      "title": "string — job title",
      "company": "string",
      "location": "string or null",
      "startDate": "YYYY-MM or null",
      "endDate": "YYYY-MM or 'present'",
      "highlights": ["string — key achievement or responsibility"]
    }
  ],
  "education": [
    {
      "degree": "string",
      "institution": "string",
      "graduationYear": number or null,
      "gpa": number or null
    }
  ],
  "skills": {
    "languages": ["string"],
    "frameworks": ["string"],
    "tools": ["string"],
    "soft": ["string"]
  },
  "certifications": ["string"],
  "totalYearsExperience": number
}`
      },
      { role: 'user', content: resumeText },
    ],
  });

  const parsed = JSON.parse(response.choices[0].message.content);
  return parsed;
}

// Usage
const candidateData = await parseResume(rawResumeText);

// Now you can do anything with structured data:
console.log(candidateData.candidate.fullName);          // "Jane Smith"
console.log(candidateData.skills.languages);             // ["JavaScript", "Python", "Go"]
console.log(candidateData.totalYearsExperience);         // 7
console.log(candidateData.experience[0].company);        // "Google"

// Store in database
await db.candidates.create({
  name: candidateData.candidate.fullName,
  email: candidateData.candidate.email,
  yearsExp: candidateData.totalYearsExperience,
  skills: candidateData.skills,
  rawData: candidateData,
});

// Filter candidates programmatically
// (`candidates` is assumed to be an array of objects returned by parseResume)
const qualifiedCandidates = candidates.filter(c =>
  c.totalYearsExperience >= 5 &&
  c.skills.languages.includes('JavaScript') &&
  c.education.some(e => e.degree.includes('Computer Science'))
);
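
The parser above trusts the model to follow the date formats requested in the prompt. A minimal sketch of a shape check that could run before the database write; `validateParsedResume` is a hypothetical helper that is not part of the original code, and it only covers a few fields from the schema.

```javascript
// Hypothetical helper: checks a parsed resume against the formats requested in the prompt.
const YEAR_MONTH = /^\d{4}-(0[1-9]|1[0-2])$/;

function validateParsedResume(resume) {
  const errors = [];

  const fullName = resume?.candidate?.fullName;
  if (typeof fullName !== 'string' || fullName.trim() === '') {
    errors.push('candidate.fullName must be a non-empty string');
  }

  if (!Array.isArray(resume?.experience)) {
    errors.push('experience must be an array');
  } else {
    resume.experience.forEach((job, i) => {
      // startDate: "YYYY-MM" or null
      if (job.startDate !== null && !YEAR_MONTH.test(job.startDate)) {
        errors.push(`experience[${i}].startDate must be YYYY-MM or null`);
      }
      // endDate: "YYYY-MM" or the literal string "present"
      if (job.endDate !== 'present' && !YEAR_MONTH.test(job.endDate)) {
        errors.push(`experience[${i}].endDate must be YYYY-MM or "present"`);
      }
    });
  }

  const years = resume?.totalYearsExperience;
  if (!Number.isFinite(years) || years < 0) {
    errors.push('totalYearsExperience must be a non-negative number');
  }

  return { valid: errors.length === 0, errors };
}
```

A caller could reject or re-request the parse when `valid` is false, rather than storing dates such as "Jan 2020" that later break sorting and filtering.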

2. Product Metadata Generation — Title, Description, Tags, Category

E-commerce platforms need consistent product metadata. LLMs can generate this from raw product descriptions, supplier data, or even images.

Schema

const productMetadataSchema = {
  title: "string — SEO-optimized product title (50-80 chars)",
  shortDescription: "string — 1-2 sentence summary (under 160 chars)",
  longDescription: "string — detailed product description (200-500 words)",
  category: "string — primary category",
  subcategory: "string — specific subcategory",
  tags: ["string — searchable keyword tags (5-15 tags)"],
  attributes: {
    brand: "string",
    material: "string or null",
    color: "string or null",
    size: "string or null",
    weight: "string or null",
  },
  seoKeywords: ["string — keywords for search optimization"],
  targetAudience: "string — who this product is for",
};

Implementation

async function generateProductMetadata(rawDescription, supplierData) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0.3, // Slight creativity for descriptions
    messages: [
      {
        role: 'system',
        content: `You are a product catalog specialist. Generate structured product metadata from the given information.
Respond with ONLY a JSON object:
{
  "title": "string (50-80 chars, SEO-optimized)",
  "shortDescription": "string (under 160 chars)",
  "longDescription": "string (200-500 words, compelling copy)",
  "category": "string",
  "subcategory": "string",
  "tags": ["string (5-15 relevant tags)"],
  "attributes": {
    "brand": "string",
    "material": "string or null",
    "color": "string or null",
    "size": "string or null",
    "weight": "string or null"
  },
  "seoKeywords": ["string (5-10 keywords)"],
  "targetAudience": "string"
}`
      },
      {
        role: 'user',
        content: `Raw description: ${rawDescription}\n\nSupplier data: ${JSON.stringify(supplierData)}`,
      },
    ],
  });

  const metadata = JSON.parse(response.choices[0].message.content);
  
  // Validate critical fields (these hard limits are looser than the 50-80 char and 5-15 tag targets in the prompt)
  if (!metadata.title || metadata.title.length > 100) {
    throw new Error('Invalid title');
  }
  if (!metadata.tags || metadata.tags.length < 3) {
    throw new Error('Insufficient tags');
  }
  
  return metadata;
}

// Usage — batch processing entire catalog
async function processCatalog(products) {
  const results = [];
  for (const product of products) {
    try {
      const metadata = await generateProductMetadata(
        product.rawDescription,
        product.supplierData
      );
      results.push({ id: product.id, metadata, status: 'success' });
    } catch (error) {
      results.push({ id: product.id, error: error.message, status: 'failed' });
    }
  }
  
  console.log(`Processed: ${results.filter(r => r.status === 'success').length}/${products.length}`);
  return results;
}
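
The hard validation in `generateProductMetadata` only rejects clearly broken output. A minimal sketch of a softer check that reports which of the prompt's stated targets (title 50-80 chars, short description under 160 chars, 5-15 tags) a result misses; `checkMetadataTargets` is a hypothetical helper, and treating these as warnings rather than errors is an assumption.

```javascript
// Hypothetical helper: lists the soft targets from the prompt that a metadata object misses.
function checkMetadataTargets(metadata) {
  const warnings = [];

  const titleLength = (metadata.title ?? '').length;
  if (titleLength < 50 || titleLength > 80) {
    warnings.push(`title is ${titleLength} chars (target 50-80)`);
  }

  const shortLength = (metadata.shortDescription ?? '').length;
  if (shortLength === 0 || shortLength >= 160) {
    warnings.push(`shortDescription is ${shortLength} chars (target under 160)`);
  }

  const tagCount = Array.isArray(metadata.tags) ? metadata.tags.length : 0;
  if (tagCount < 5 || tagCount > 15) {
    warnings.push(`tags has ${tagCount} entries (target 5-15)`);
  }

  return warnings;
}
```

Warnings like these could be stored next to each catalog result so that borderline items are reviewed without failing the whole batch.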

3. Content Moderation Systems — Flagged, Reason, Severity

Content moderation requires fast, structured decisions that can be routed through automated workflows.

Schema

const moderationSchema = {
  flagged: "boolean — whether the content violates policies",
  severity: "number — 1 (minor) to 5 (critical)",
  categories: ["string — which policy categories are violated"],
  reason: "string — brief explanation of why content was flagged",
  confidence: "number — 0.0 to 1.0",
  suggestedAction: "'approve' | 'review' | 'remove' | 'ban'",
  excerpts: ["string — specific problematic excerpts from the content"],
};

Implementation

async function moderateContent(content, contentType = 'comment') {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: `You are a content moderation system. Analyze the given ${contentType} against community guidelines.

Policy categories: harassment, hate_speech, violence, sexual_content, spam, misinformation, self_harm, illegal_activity

Respond with ONLY a JSON object:
{
  "flagged": boolean,
  "severity": number (1-5, where 1=minor, 3=moderate, 5=critical),
  "categories": ["string — violated categories, empty array if none"],
  "reason": "string — brief explanation",
  "confidence": number (0.0-1.0),
  "suggestedAction": "approve" | "review" | "remove" | "ban",
  "excerpts": ["string — problematic text excerpts, empty if clean"]
}

Severity guide:
1 = borderline/minor (e.g., mild rudeness)
2 = low (e.g., mildly inappropriate language)
3 = moderate (e.g., targeted insults)
4 = high (e.g., hate speech, threats)
5 = critical (e.g., illegal content, imminent danger)`
      },
      { role: 'user', content: content },
    ],
  });

  const result = JSON.parse(response.choices[0].message.content);
  
  // Return the structured result so callers (such as the pipeline below) can route on it
  return result;
}

// Automated moderation pipeline
async function moderationPipeline(content) {
  const result = await moderateContent(content);
  
  if (!result.flagged) {
    // Auto-approve clean content
    await publishContent(content);
    return { action: 'approved', automated: true };
  }
  
  if (result.severity >= 4 && result.confidence > 0.9) {
    // Auto-remove high-severity, high-confidence violations
    await removeContent(content);
    await logModeration(content, result, 'auto_removed');
    return { action: 'removed', automated: true };
  }
  
  if (result.severity >= 3 || result.confidence < 0.8) {
    // Send to human review for moderate severity or low confidence
    await queueForReview(content, result);
    await logModeration(content, result, 'queued_for_review');
    return { action: 'queued', automated: false };
  }
  
  // Low severity — approve with a note
  await publishContent(content);
  await logModeration(content, result, 'approved_with_flag');
  return { action: 'approved_with_note', automated: true };
}
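
The routing thresholds are easier to test when they are separated from side effects such as publishing or removing content. A minimal sketch of a pure decision function that mirrors the rules in `moderationPipeline`; the name `decideModerationAction` is hypothetical, and sending malformed results to human review is an assumption, not something the original pipeline does.

```javascript
// Hypothetical pure version of the routing rules in moderationPipeline,
// so the thresholds can be unit-tested without touching any content.
function decideModerationAction(result) {
  // Assumption: a result without a boolean `flagged` is malformed and goes to a human.
  if (typeof result.flagged !== 'boolean') return 'queued';
  if (!result.flagged) return 'approved';

  const severityOk =
    Number.isInteger(result.severity) && result.severity >= 1 && result.severity <= 5;
  const confidenceOk =
    typeof result.confidence === 'number' && result.confidence >= 0 && result.confidence <= 1;
  // Assumption: out-of-range numbers on flagged content are also reviewed by a human.
  if (!severityOk || !confidenceOk) return 'queued';

  if (result.severity >= 4 && result.confidence > 0.9) return 'removed';
  if (result.severity >= 3 || result.confidence < 0.8) return 'queued';
  return 'approved_with_note';
}
```

The returned strings match the `action` values used by the pipeline, so the pipeline could switch on this function's result and keep only the side effects in each branch.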

Dashboard analytics from structured data

// Because the output is structured, analytics reduce to simple filters and counts.
// Assumes each log record stores the moderation result fields plus an `automated` flag,
// and that groupByCategory and average are helpers defined elsewhere.
async function getModerationStats(timeRange) {
  const results = await db.moderationLogs.find({
    createdAt: { $gte: timeRange.start, $lte: timeRange.end },
  });

  const total = results.length;
  const flaggedCount = results.filter(r => r.flagged).length;
  // Guard the ratios so an empty time range yields 0 instead of NaN (0 / 0)
  const ratio = (count) => (total === 0 ? 0 : count / total);
  const countAtSeverity = (level) => results.filter(r => r.severity === level).length;

  return {
    totalReviewed: total,
    flaggedCount,
    flagRate: ratio(flaggedCount),
    bySeverity: {
      1: countAtSeverity(1),
      2: countAtSeverity(2),
      3: countAtSeverity(3),
      4: countAtSeverity(4),
      5: countAtSeverity(5),
    },
    byCategory: groupByCategory(results),
    avgConfidence: average(results.map(r => r.confidence)),
    autoActionRate: ratio(results.filter(r => r.automated).length),
  };
}

4. Compatibility/Scoring Engines — Score, Strengths, Weaknesses

From job-candidate matching to product recommendations to compatibility scoring, structured output enables complex evaluation systems.

Schema

const scoringSchema = {
  overallScore: "number — 0 to 100",
  breakdown: [
    {
      criterion: "string — what was evaluated",
      score: "number — 0 to 100",
      weight: "number — how important this criterion is (0.0-1.0)",
      reasoning: "string — why this score was given",
    },
  ],
  strengths: ["string — key strong points"],
  weaknesses: ["string — key areas of concern"],
  recommendation: "'strong_yes' | 'yes' | 'maybe' | 'no' | 'strong_no'",
  summary: "string — 2-3 sentence overall assessment",
};

Implementation: Job-candidate matching

async function scoreCandidate(jobDescription, candidateProfile) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: `You are a recruitment scoring engine. Evaluate how well a candidate matches a job description.

Respond with ONLY a JSON object:
{
  "overallScore": number (0-100),
  "breakdown": [
    {
      "criterion": "string",
      "score": number (0-100),
      "weight": number (0.0-1.0),
      "reasoning": "string (1-2 sentences)"
    }
  ],
  "strengths": ["string (3-5 key strengths)"],
  "weaknesses": ["string (1-4 concerns or gaps)"],
  "recommendation": "strong_yes" | "yes" | "maybe" | "no" | "strong_no",
  "summary": "string (2-3 sentence assessment)"
}

Evaluation criteria (evaluate each one):
1. Technical skills match (weight: 0.30)
2. Experience level (weight: 0.25)
3. Education fit (weight: 0.15)
4. Industry relevance (weight: 0.15)
5. Leadership/soft skills (weight: 0.15)`
      },
      {
        role: 'user',
        content: `Job Description:\n${jobDescription}\n\nCandidate Profile:\n${JSON.stringify(candidateProfile, null, 2)}`,
      },
    ],
  });

  return JSON.parse(response.choices[0].message.content);
}

// Usage — rank candidates
async function rankCandidates(jobDescription, candidates) {
  const scored = await Promise.all(
    candidates.map(async (candidate) => {
      const score = await scoreCandidate(jobDescription, candidate);
      return { candidate, score };
    })
  );
  
  // Sort by overall score
  scored.sort((a, b) => b.score.overallScore - a.score.overallScore);
  
  // Filter by recommendation
  const recommended = scored.filter(s =>
    ['strong_yes', 'yes'].includes(s.score.recommendation)
  );
  
  // Generate report
  return {
    totalCandidates: candidates.length,
    recommended: recommended.length,
    rankings: scored.map((s, i) => ({
      rank: i + 1,
      name: s.candidate.fullName ?? s.candidate.candidate?.fullName, // parseResume output nests the name under `candidate`
      overallScore: s.score.overallScore,
      recommendation: s.score.recommendation,
      topStrength: s.score.strengths[0],
      topConcern: s.score.weaknesses[0] || 'None',
    })),
  };
}
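
The prompt asks the model for both a weighted breakdown and an `overallScore`, but nothing checks that the two agree. A minimal sketch that recomputes the weighted average from the breakdown and compares it with the reported score; the function names and the 5-point tolerance are assumptions for illustration.

```javascript
// Hypothetical helpers: recompute the overall score from the breakdown and compare.
function weightedOverallScore(breakdown) {
  const totalWeight = breakdown.reduce((sum, item) => sum + item.weight, 0);
  if (totalWeight <= 0) return null; // nothing to average

  const weightedSum = breakdown.reduce((sum, item) => sum + item.score * item.weight, 0);
  // Dividing by totalWeight keeps the result on the 0-100 scale even if weights do not sum to 1.
  return weightedSum / totalWeight;
}

function scoreIsConsistent(result, tolerance = 5) {
  const expected = weightedOverallScore(result.breakdown);
  return expected !== null && Math.abs(expected - result.overallScore) <= tolerance;
}
```

With the five criteria weights from the prompt (0.30, 0.25, 0.15, 0.15, 0.15) and scores of 80, 60, 100, 40 and 70, the weighted average is 24 + 15 + 15 + 6 + 10.5 = 70.5. A result that fails `scoreIsConsistent` could be retried or flagged before it influences a ranking.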

5. Email Classification — Intent, Urgency, Suggested Action

Automated email triage uses structured output to route messages to the right team with the right priority.

Schema

const emailClassificationSchema = {
  intent: "'inquiry' | 'complaint' | 'support_request' | 'feedback' | 'billing' | 'cancellation' | 'partnership' | 'spam' | 'other'",
  urgency: "'critical' | 'high' | 'medium' | 'low'",
  suggestedAction: "string — what should be done",
  department: "'sales' | 'support' | 'billing' | 'engineering' | 'management' | 'spam_filter'",
  summary: "string — 1-sentence summary of the email",
  customerSentiment: "'angry' | 'frustrated' | 'neutral' | 'happy' | 'grateful'",
  requiresHumanReview: "boolean",
  suggestedResponseTemplate: "string — template name or null",
  extractedEntities: {
    orderNumber: "string or null",
    accountId: "string or null",
    productMentioned: "string or null",
    amountMentioned: "number or null",
  },
};

Implementation

async function classifyEmail(emailSubject, emailBody, senderInfo) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: `You are an email classification system for a SaaS company.
Classify the incoming email and extract key information.

Respond with ONLY a JSON object:
{
  "intent": "inquiry" | "complaint" | "support_request" | "feedback" | "billing" | "cancellation" | "partnership" | "spam" | "other",
  "urgency": "critical" | "high" | "medium" | "low",
  "suggestedAction": "string",
  "department": "sales" | "support" | "billing" | "engineering" | "management" | "spam_filter",
  "summary": "string (1 sentence)",
  "customerSentiment": "angry" | "frustrated" | "neutral" | "happy" | "grateful",
  "requiresHumanReview": boolean,
  "suggestedResponseTemplate": "string or null",
  "extractedEntities": {
    "orderNumber": "string or null",
    "accountId": "string or null",
    "productMentioned": "string or null",
    "amountMentioned": number or null
  }
}

Urgency guide:
- critical: service down, security breach, legal threat
- high: billing dispute, cancellation request, angry customer
- medium: support request, feature inquiry, general complaint
- low: feedback, partnership inquiry, general question`
      },
      {
        role: 'user',
        content: `From: ${senderInfo}\nSubject: ${emailSubject}\n\n${emailBody}`,
      },
    ],
  });

  return JSON.parse(response.choices[0].message.content);
}

// Automated routing pipeline
async function routeEmail(email) {
  const classification = await classifyEmail(
    email.subject,
    email.body,
    email.from
  );
  
  // Auto-route to department
  await assignToDepartment(email.id, classification.department);
  
  // Set priority based on urgency
  const priorityMap = { critical: 1, high: 2, medium: 3, low: 4 };
  await setPriority(email.id, priorityMap[classification.urgency]);
  
  // Auto-respond with template if applicable
  if (classification.suggestedResponseTemplate && !classification.requiresHumanReview) {
    await sendAutoResponse(email.id, classification.suggestedResponseTemplate);
  }
  
  // Alert on critical issues
  if (classification.urgency === 'critical') {
    await sendSlackAlert('#critical-support', {
      summary: classification.summary,
      sentiment: classification.customerSentiment,
      from: email.from,
    });
  }
  
  // Enrich the ticket with extracted entities
  if (classification.extractedEntities?.orderNumber) {
    await linkToOrder(email.id, classification.extractedEntities.orderNumber);
  }
  
  return classification;
}
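
`routeEmail` indexes `priorityMap` with whatever urgency the model returns, so an unexpected value such as "urgent" would produce an undefined priority. A minimal sketch of a normalization step that could run between classification and routing; `normalizeClassification` is hypothetical, and the fallbacks (`medium` urgency, `support` department, forced human review) are assumptions rather than rules from the original.

```javascript
// Hypothetical helper: coerce unexpected enum values to safe defaults before routing.
const URGENCIES = ['critical', 'high', 'medium', 'low'];
const DEPARTMENTS = ['sales', 'support', 'billing', 'engineering', 'management', 'spam_filter'];

function normalizeClassification(classification) {
  const normalized = { ...classification };
  const problems = [];

  if (!URGENCIES.includes(normalized.urgency)) {
    problems.push(`unknown urgency: ${normalized.urgency}`);
    normalized.urgency = 'medium'; // assumed fallback
  }

  if (!DEPARTMENTS.includes(normalized.department)) {
    problems.push(`unknown department: ${normalized.department}`);
    normalized.department = 'support'; // assumed fallback
  }

  // Any repaired field means the model strayed from the schema, so a person should look.
  if (problems.length > 0) {
    normalized.requiresHumanReview = true;
  }

  // Guarantee the nested object exists so later property reads cannot throw.
  normalized.extractedEntities = normalized.extractedEntities ?? {};

  return { classification: normalized, problems };
}
```

Logging the `problems` array gives a cheap signal of how often the model drifts from the allowed values.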

6. Sentiment Analysis — Sentiment, Confidence, Aspects

Beyond simple positive/negative classification, structured sentiment analysis captures nuanced, multi-dimensional opinions.

Schema

const sentimentSchema = {
  overallSentiment: "'very_positive' | 'positive' | 'neutral' | 'negative' | 'very_negative' | 'mixed'",
  sentimentScore: "number — -1.0 (most negative) to 1.0 (most positive)",
  confidence: "number — 0.0 to 1.0",
  aspects: [
    {
      topic: "string — what aspect is being discussed",
      sentiment: "'positive' | 'negative' | 'neutral'",
      score: "number — -1.0 to 1.0",
      mention: "string — the relevant quote from the text",
    },
  ],
  emotions: {
    joy: "number 0-1",
    anger: "number 0-1",
    sadness: "number 0-1",
    surprise: "number 0-1",
    fear: "number 0-1",
    trust: "number 0-1",
  },
  summary: "string — 1-sentence sentiment summary",
};

Implementation

async function analyzeSentiment(text, context = 'product_review') {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: `You are a sentiment analysis engine for ${context}.
Perform detailed, aspect-based sentiment analysis on the given text.

Respond with ONLY a JSON object:
{
  "overallSentiment": "very_positive" | "positive" | "neutral" | "negative" | "very_negative" | "mixed",
  "sentimentScore": number (-1.0 to 1.0),
  "confidence": number (0.0 to 1.0),
  "aspects": [
    {
      "topic": "string",
      "sentiment": "positive" | "negative" | "neutral",
      "score": number (-1.0 to 1.0),
      "mention": "string (relevant quote)"
    }
  ],
  "emotions": {
    "joy": number (0-1),
    "anger": number (0-1),
    "sadness": number (0-1),
    "surprise": number (0-1),
    "fear": number (0-1),
    "trust": number (0-1)
  },
  "summary": "string (1-sentence summary)"
}`
      },
      { role: 'user', content: text },
    ],
  });

  return JSON.parse(response.choices[0].message.content);
}

// Batch analysis for analytics dashboard
async function analyzeReviewBatch(reviews) {
  const results = await Promise.all(
    reviews.map(review => analyzeSentiment(review.text, 'product_review'))
  );
  
  // Aggregate structured data for dashboard
  const avgScore = results.reduce((sum, r) => sum + r.sentimentScore, 0) / results.length;
  
  // Find most discussed aspects
  const allAspects = results.flatMap(r => r.aspects);
  const aspectCounts = {};
  allAspects.forEach(a => {
    if (!aspectCounts[a.topic]) {
      aspectCounts[a.topic] = { count: 0, totalScore: 0 };
    }
    aspectCounts[a.topic].count++;
    aspectCounts[a.topic].totalScore += a.score;
  });
  
  const topAspects = Object.entries(aspectCounts)
    .map(([topic, data]) => ({
      topic,
      mentions: data.count,
      avgScore: data.totalScore / data.count,
    }))
    .sort((a, b) => b.mentions - a.mentions);
  
  return {
    totalReviews: reviews.length,
    averageSentimentScore: avgScore,
    sentimentDistribution: {
      very_positive: results.filter(r => r.overallSentiment === 'very_positive').length,
      positive: results.filter(r => r.overallSentiment === 'positive').length,
      neutral: results.filter(r => r.overallSentiment === 'neutral').length,
      negative: results.filter(r => r.overallSentiment === 'negative').length,
      very_negative: results.filter(r => r.overallSentiment === 'very_negative').length,
      mixed: results.filter(r => r.overallSentiment === 'mixed').length,
    },
    topAspects: topAspects.slice(0, 10),
    avgEmotions: {
      joy: average(results.map(r => r.emotions.joy)),
      anger: average(results.map(r => r.emotions.anger)),
      trust: average(results.map(r => r.emotions.trust)),
    },
  };
}

function average(numbers) {
  if (numbers.length === 0) return 0; // avoid NaN (0 / 0) on empty input
  return numbers.reduce((sum, n) => sum + n, 0) / numbers.length;
}
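
The aspect aggregation above keys on the raw `topic` string, so "Battery Life" and "battery life" would be counted as separate aspects. A minimal sketch of the same aggregation with a normalization step; `normalizeTopic` and `aggregateAspects` are hypothetical names, and lowercasing plus whitespace collapsing is only one possible normalization.

```javascript
// Hypothetical helpers: aggregate aspects under a normalized topic key.
function normalizeTopic(topic) {
  return String(topic).trim().toLowerCase().replace(/\s+/g, ' ');
}

function aggregateAspects(aspects) {
  // A Map avoids collisions with built-in object keys such as "constructor".
  const byTopic = new Map();

  for (const aspect of aspects) {
    const key = normalizeTopic(aspect.topic);
    const entry = byTopic.get(key) ?? { topic: key, mentions: 0, totalScore: 0 };
    entry.mentions += 1;
    entry.totalScore += aspect.score;
    byTopic.set(key, entry);
  }

  return [...byTopic.values()]
    .map(({ topic, mentions, totalScore }) => ({
      topic,
      mentions,
      avgScore: totalScore / mentions,
    }))
    .sort((a, b) => b.mentions - a.mentions);
}
```

In `analyzeReviewBatch`, `topAspects` could then be computed as `aggregateAspects(allAspects).slice(0, 10)`. Synonyms such as "battery" and "battery life" would still need a separate mapping.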

7. Data Extraction from Documents — Invoices, Receipts, Contracts

Extracting structured data from unstructured documents is one of the highest-value applications of structured LLM output.

Invoice extraction

async function extractInvoiceData(invoiceText) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: `You are an invoice data extraction system. Extract all relevant information from the given invoice text.

Respond with ONLY a JSON object:
{
  "invoiceNumber": "string",
  "invoiceDate": "YYYY-MM-DD",
  "dueDate": "YYYY-MM-DD or null",
  "vendor": {
    "name": "string",
    "address": "string or null",
    "taxId": "string or null"
  },
  "customer": {
    "name": "string",
    "address": "string or null",
    "accountNumber": "string or null"
  },
  "lineItems": [
    {
      "description": "string",
      "quantity": number,
      "unitPrice": number,
      "totalPrice": number
    }
  ],
  "subtotal": number,
  "taxRate": number or null,
  "taxAmount": number or null,
  "discount": number or null,
  "totalAmount": number,
  "currency": "string (ISO 4217 code, e.g., USD, EUR)",
  "paymentTerms": "string or null",
  "notes": "string or null"
}

Important:
- All monetary values should be plain numbers (no currency symbols)
- Dates in YYYY-MM-DD format
- Use null for missing fields, never guess`
      },
      { role: 'user', content: invoiceText },
    ],
  });

  const invoice = JSON.parse(response.choices[0].message.content);
  
  // Validate totals
  const calculatedSubtotal = invoice.lineItems.reduce(
    (sum, item) => sum + item.totalPrice, 0
  );
  
  if (Math.abs(calculatedSubtotal - invoice.subtotal) > 0.01) {
    console.warn(
      `Subtotal mismatch: calculated ${calculatedSubtotal}, extracted ${invoice.subtotal}`
    );
  }
  
  return invoice;
}

// Usage
const invoiceData = await extractInvoiceData(rawInvoiceText);

// Integrate with accounting system
await accountingSystem.createPayable({
  vendorName: invoiceData.vendor.name,
  invoiceNumber: invoiceData.invoiceNumber,
  amount: invoiceData.totalAmount,
  currency: invoiceData.currency,
  dueDate: invoiceData.dueDate,
  lineItems: invoiceData.lineItems,
});
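
The subtotal check inside `extractInvoiceData` only logs a warning. A minimal sketch of a fuller arithmetic check that works in integer cents to sidestep floating-point drift; `checkInvoiceArithmetic` is a hypothetical helper, and it assumes `discount` is a positive amount subtracted from the subtotal and `taxAmount` is added to it, which may not hold for every invoice layout.

```javascript
// Hypothetical helper: cross-check extracted invoice amounts in integer cents.
const toCents = (amount) => Math.round(amount * 100);

function checkInvoiceArithmetic(invoice) {
  const issues = [];

  invoice.lineItems.forEach((item, i) => {
    if (toCents(item.quantity * item.unitPrice) !== toCents(item.totalPrice)) {
      issues.push(`lineItems[${i}]: quantity x unitPrice does not equal totalPrice`);
    }
  });

  const lineTotalCents = invoice.lineItems.reduce(
    (sum, item) => sum + toCents(item.totalPrice), 0
  );
  if (lineTotalCents !== toCents(invoice.subtotal)) {
    issues.push('subtotal does not equal the sum of line items');
  }

  // Assumption: total = subtotal + tax - discount, with null treated as 0.
  const expectedTotalCents =
    toCents(invoice.subtotal) + toCents(invoice.taxAmount ?? 0) - toCents(invoice.discount ?? 0);
  if (expectedTotalCents !== toCents(invoice.totalAmount)) {
    issues.push('totalAmount does not equal subtotal + tax - discount');
  }

  return issues;
}
```

An invoice with a non-empty `issues` list could be held for manual review instead of being sent to `accountingSystem.createPayable`.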

Receipt extraction

async function extractReceiptData(receiptText) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: `Extract data from this receipt. Respond with ONLY JSON:
{
  "merchant": {
    "name": "string",
    "address": "string or null",
    "phone": "string or null"
  },
  "transactionDate": "YYYY-MM-DD",
  "transactionTime": "HH:MM or null",
  "items": [
    {
      "name": "string",
      "quantity": number,
      "price": number
    }
  ],
  "subtotal": number,
  "tax": number or null,
  "tip": number or null,
  "total": number,
  "paymentMethod": "cash" | "credit" | "debit" | "other" | null,
  "lastFourDigits": "string or null",
  "category": "food" | "groceries" | "transport" | "entertainment" | "shopping" | "utilities" | "health" | "other"
}`
      },
      { role: 'user', content: receiptText },
    ],
  });

  return JSON.parse(response.choices[0].message.content);
}

// Expense management system
async function processExpenseReceipt(receiptText, employeeId) {
  const receipt = await extractReceiptData(receiptText);
  
  await db.expenses.create({
    employeeId,
    merchant: receipt.merchant.name,
    amount: receipt.total,
    category: receipt.category,
    date: receipt.transactionDate,
    items: receipt.items,
    paymentMethod: receipt.paymentMethod,
    status: receipt.total > 500 ? 'needs_approval' : 'auto_approved',
  });
  
  return receipt;
}
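
`processExpenseReceipt` stores `transactionDate` exactly as the model returned it. A minimal sketch of a date check that accepts only real calendar dates in YYYY-MM-DD form and rejects dates in the future; `parseIsoDate` and `checkReceiptDate` are hypothetical helpers, and treating future dates as errors is an assumption about the expense policy.

```javascript
// Hypothetical helpers: validate the YYYY-MM-DD dates requested in the prompt.
function parseIsoDate(value) {
  if (typeof value !== 'string' || !/^\d{4}-\d{2}-\d{2}$/.test(value)) return null;

  const date = new Date(`${value}T00:00:00Z`);
  if (Number.isNaN(date.getTime())) return null;

  // Round-trip check: rejects impossible dates (e.g. 2024-02-30) that an engine might roll over.
  return date.toISOString().slice(0, 10) === value ? date : null;
}

function checkReceiptDate(receipt, now = new Date()) {
  const date = parseIsoDate(receipt.transactionDate);
  if (date === null) return 'invalid_date';
  if (date.getTime() > now.getTime()) return 'future_date';
  return 'ok';
}
```

Any status other than `'ok'` could set the expense to `needs_approval` regardless of the amount.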

Contract key terms extraction

async function extractContractTerms(contractText) {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: `Extract key terms from this contract. Respond with ONLY JSON:
{
  "contractType": "string (e.g., NDA, SaaS Agreement, Employment, Service Agreement)",
  "parties": [
    {
      "name": "string",
      "role": "string (e.g., Provider, Client, Employer, Contractor)"
    }
  ],
  "effectiveDate": "YYYY-MM-DD or null",
  "expirationDate": "YYYY-MM-DD or null",
  "autoRenewal": boolean,
  "renewalTerms": "string or null",
  "terminationClause": {
    "noticePeriodDays": number or null,
    "terminationForCause": "string or null",
    "terminationForConvenience": "string or null"
  },
  "financialTerms": {
    "totalValue": number or null,
    "paymentSchedule": "string or null",
    "currency": "string or null",
    "latePaymentPenalty": "string or null"
  },
  "keyObligations": [
    {
      "party": "string",
      "obligation": "string"
    }
  ],
  "liabilityLimit": "string or null",
  "governingLaw": "string — jurisdiction",
  "confidentialityClause": boolean,
  "nonCompeteClause": boolean,
  "nonCompeteDetails": "string or null",
  "riskFlags": ["string — potential issues or unusual terms"]
}

Important: Flag any unusual, aggressive, or potentially problematic terms in riskFlags.`
      },
      { role: 'user', content: contractText },
    ],
  });

  return JSON.parse(response.choices[0].message.content);
}

// Usage in a legal review pipeline
async function reviewContract(contractText) {
  const terms = await extractContractTerms(contractText);
  
  const alerts = [];
  
  // Automated checks on structured data
  if (terms.nonCompeteClause && terms.nonCompeteDetails) {
    alerts.push({ level: 'warning', message: 'Non-compete clause detected', details: terms.nonCompeteDetails });
  }
  
  if (terms.autoRenewal) {
    alerts.push({ level: 'info', message: 'Auto-renewal enabled', details: terms.renewalTerms });
  }
  
  if (terms.financialTerms.totalValue && terms.financialTerms.totalValue > 100000) {
    alerts.push({ level: 'warning', message: 'High-value contract requires VP approval' });
  }
  
  if (terms.riskFlags.length > 0) {
    alerts.push({ level: 'critical', message: 'Risk flags detected', details: terms.riskFlags });
  }
  
  return { terms, alerts, requiresLegalReview: alerts.some(a => a.level === 'critical') };
}
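
Structured dates and numbers also allow derived values that free text cannot support. A minimal sketch that computes the last day on which termination notice can be given, from `expirationDate` and `terminationClause.noticePeriodDays`; `noticeDeadline` is a hypothetical helper, and it assumes the notice period is counted in calendar days back from the expiration date, which a real contract may define differently.

```javascript
// Hypothetical helper: latest notice date = expirationDate minus noticePeriodDays (calendar days).
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function noticeDeadline(terms) {
  const expirationDate = terms.expirationDate;
  const noticeDays = terms.terminationClause?.noticePeriodDays;

  if (typeof expirationDate !== 'string') return null;
  if (!Number.isInteger(noticeDays) || noticeDays < 0) return null;

  // Parse as UTC midnight so daylight-saving shifts cannot move the result by a day.
  const expiresAt = Date.parse(`${expirationDate}T00:00:00Z`);
  if (Number.isNaN(expiresAt)) return null;

  return new Date(expiresAt - noticeDays * MS_PER_DAY).toISOString().slice(0, 10);
}
```

For a contract expiring on 2025-12-31 with a 30-day notice period this returns "2025-12-01". In `reviewContract`, an alert could be raised when auto-renewal is enabled and this date is close.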

8. Pattern: Building Any Structured Output Application

All the applications above follow the same pattern. Here is the generalized approach:

// The universal structured output pattern
async function structuredLLMCall({
  systemPrompt,    // Describes the task + JSON schema
  userInput,       // The data to process
  model = 'gpt-4o',
  temperature = 0,
  validate,        // Optional validation function
  maxRetries = 2,
}) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const response = await openai.chat.completions.create({
        model,
        temperature,
        messages: [
          { role: 'system', content: systemPrompt },
          { role: 'user', content: userInput },
        ],
      });

      let content = response.choices[0].message.content.trim();
      
      // Handle markdown code fences
      if (content.startsWith('```')) {
        content = content.replace(/^```(?:json)?\n?/, '').replace(/\n?```$/, '');
      }
      
      const parsed = JSON.parse(content);
      
      // Run custom validation if provided
      if (validate) {
        const validationResult = validate(parsed);
        if (!validationResult.valid) {
          throw new Error(`Validation failed: ${validationResult.errors.join(', ')}`);
        }
      }
      
      return parsed;
      
    } catch (error) {
      if (attempt === maxRetries) throw error;
      console.warn(`Attempt ${attempt + 1} failed: ${error.message}. Retrying...`);
      await new Promise(r => setTimeout(r, 1000 * Math.pow(2, attempt)));
    }
  }
}

// Example: Use the pattern for any application
const sentimentResult = await structuredLLMCall({
  systemPrompt: 'Analyze sentiment. Respond with JSON: {"sentiment": "positive"|"negative"|"neutral", "confidence": number}',
  userInput: 'I love this product!',
  validate: (data) => {
    const errors = [];
    if (!['positive', 'negative', 'neutral'].includes(data.sentiment)) {
      errors.push('Missing or invalid sentiment');
    }
    // Check the type, not truthiness, so a confidence of 0 is accepted
    if (typeof data.confidence !== 'number') {
      errors.push('Missing or non-numeric confidence');
    }
    return { valid: errors.length === 0, errors };
  },
});
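
Hand-written `validate` callbacks tend to repeat the same enum and type checks. A minimal sketch of a small factory that builds a callback in the `{ valid, errors }` shape that `structuredLLMCall` expects; `makeValidator` and its spec format (an array means "one of these values", a string means "typeof must equal this") are assumptions for illustration, not a replacement for a full schema library.

```javascript
// Hypothetical helper: build a validate() callback from a flat field spec.
function makeValidator(spec) {
  return (data) => {
    if (data === null || typeof data !== 'object') {
      return { valid: false, errors: ['response is not a JSON object'] };
    }

    const errors = [];
    for (const [field, rule] of Object.entries(spec)) {
      const value = data[field];
      if (Array.isArray(rule)) {
        // Array rule: the value must be one of the listed options.
        if (!rule.includes(value)) {
          errors.push(`${field} must be one of: ${rule.join(', ')}`);
        }
      } else if (typeof value !== rule) {
        // String rule: the value's typeof must match.
        errors.push(`${field} must be of type ${rule}`);
      }
    }

    return { valid: errors.length === 0, errors };
  };
}
```

The sentiment example could then pass `validate: makeValidator({ sentiment: ['positive', 'negative', 'neutral'], confidence: 'number' })`. Nested objects and arrays are out of scope for this sketch.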

9. Key Takeaways

  1. Resume parsing transforms wildly inconsistent resume formats into standardized candidate objects that enable programmatic filtering, scoring, and database storage.
  2. Product metadata generation produces SEO-optimized, consistently structured catalog data at scale — title, description, tags, category, and attributes.
  3. Content moderation uses structured output (flagged, severity, categories, confidence) to power automated routing pipelines, with human review for moderate-severity or low-confidence cases.
  4. Scoring engines output numerical scores with breakdowns, strengths, and weaknesses — enabling ranking, filtering, and reporting.
  5. Email classification extracts intent, urgency, and entities to auto-route messages, set priorities, and trigger appropriate responses.
  6. Sentiment analysis goes beyond positive/negative to capture aspect-level sentiment, emotions, and confidence — all as structured, aggregatable data.
  7. Document extraction (invoices, receipts, contracts) pulls key fields into structured objects that integrate directly with accounting, expense, and legal review systems.
  8. All applications follow the same pattern: define a schema, instruct the LLM, parse JSON, validate, and process.

Explain-It Challenge

  1. Your company processes 10,000 support emails per day. Explain how structured email classification saves time versus having humans read and categorize every email.
  2. A product manager asks "Can the AI grade job candidates?" Design the scoring schema you'd use and explain why each field matters.
  3. An accounting team says they spend 2 hours per day manually entering invoice data. Walk through how structured invoice extraction automates this and what safeguards you'd include.
