Episode 4 — Generative AI Engineering / 4.3 — Prompt Engineering Fundamentals

4.3.a — Writing Clear Instructions

In one sentence: The single biggest improvement you can make to any LLM prompt is writing specific, unambiguous instructions — tell the model exactly what role to play, what steps to follow, what to include, what to avoid, and what format to use.

Navigation: ← 4.3 Overview · 4.3.b — Few-Shot Examples →


1. Why Vague Prompts Produce Bad Results

When you give a vague prompt, the model has to guess what you want. It fills in the blanks with the most statistically likely interpretation — which is often a generic, surface-level response.

VAGUE PROMPT:
  "Write about JavaScript."

MODEL'S PROBLEM:
  - Write WHAT about JavaScript? An essay? A tutorial? A poem?
  - For WHOM? Beginners? Experts? Hiring managers?
  - How LONG? 50 words? 5000 words?
  - What ASPECT? History? Syntax? Frameworks? Performance?
  - What TONE? Formal? Casual? Technical?

RESULT: A generic, unfocused 300-word overview that helps nobody.

The model is not "bad" — it is underspecified. Think of it like delegating work to a human: "write about JavaScript" would produce confusion from a colleague too. The more specific your instructions, the more useful the output.

SPECIFIC PROMPT:
  "You are a senior JavaScript developer writing documentation for
   junior engineers. Explain closures in JavaScript in exactly 3 sections:
   1. What a closure is (2 sentences)
   2. A simple code example with comments
   3. One common use case in real applications
   Keep the total length under 200 words. Use casual but technical tone."

RESULT: Precisely what you asked for — focused, structured, right length.

2. The Anatomy of a Clear Instruction

Every well-engineered prompt addresses these six dimensions:

┌───────────────────────────────────────────────────────────────────┐
│                 SIX DIMENSIONS OF A CLEAR PROMPT                  │
│                                                                   │
│  1. ROLE        Who is the model?           (persona/expertise)   │
│  2. TASK        What should it do?          (specific action verb)│
│  3. CONTEXT     What background is needed?  (constraints, scope)  │
│  4. FORMAT      What shape is the output?   (JSON, list, table)   │
│  5. TONE        How should it sound?        (formal, casual)      │
│  6. BOUNDARIES  What should it NOT do?      (limits, exclusions)  │
└───────────────────────────────────────────────────────────────────┘

You don't always need all six, but the more you include, the more predictable the output.
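The six dimensions can also be assembled mechanically, which keeps prompts consistent across a codebase. A minimal sketch, where the `buildPrompt` helper and its field names are illustrative, not from any library:

```javascript
// Assemble a prompt from the six dimensions. All field names here
// (role, task, context, format, tone, boundaries) are illustrative.
function buildPrompt({ role, task, context, format, tone, boundaries }) {
  const parts = [
    role && `You are ${role}.`,
    task && `Task: ${task}`,
    context && `Context: ${context}`,
    format && `Format: ${format}`,
    tone && `Tone: ${tone}`,
    boundaries && `Do not: ${boundaries}`,
  ];
  // Dimensions that weren't provided are simply omitted
  return parts.filter(Boolean).join('\n');
}

const prompt = buildPrompt({
  role: 'a senior JavaScript developer writing docs for junior engineers',
  task: 'explain closures in JavaScript',
  format: 'three numbered sections, under 200 words total',
  tone: 'casual but technical',
  boundaries: 'cover frameworks or browser history',
});
```

A helper like this also makes it obvious, at a glance, which dimensions a given prompt is missing.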


3. Specificity, Constraints, and Examples

Be specific about the task

BAD:  "Summarize this article."
GOOD: "Summarize this article in exactly 3 bullet points. Each bullet 
       should be one sentence. Focus on the main argument, the key 
       evidence, and the conclusion."

Add constraints to narrow the output

Constraints eliminate ambiguity. They tell the model what the boundaries are:

const systemPrompt = `You are a product review analyzer.

Constraints:
- Respond in JSON format only
- Rate sentiment as "positive", "negative", or "neutral" (no other values)
- Keep the summary under 20 words
- If the review is in a non-English language, still respond in English
- Do not include any text outside the JSON object`;
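Constraints like these only pay off if you enforce them on the way back out. A minimal sketch of a validator for the analyzer's contract; the reply shape ({ sentiment, summary }) is an assumption for illustration:

```javascript
// Validate a model reply against the analyzer's constraints.
// The expected shape ({ sentiment, summary }) is an assumption here.
const ALLOWED = ['positive', 'negative', 'neutral'];

function parseReviewAnalysis(reply) {
  let data;
  try {
    data = JSON.parse(reply); // constraint: JSON only, no extra text
  } catch {
    return { ok: false, error: 'not valid JSON' };
  }
  if (!ALLOWED.includes(data.sentiment)) {
    return { ok: false, error: `bad sentiment: ${data.sentiment}` };
  }
  const words = String(data.summary ?? '').split(/\s+/).filter(Boolean);
  if (words.length > 20) {
    return { ok: false, error: 'summary over 20 words' };
  }
  return { ok: true, data };
}
```

If validation fails, you can retry the call, optionally feeding the error message back to the model.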

Include examples when the task is complex

Sometimes describing what you want is harder than showing what you want. A single example can be worth paragraphs of instructions:

Classify the following customer message by department.

Example:
  Message: "My order hasn't arrived in 2 weeks"
  Department: Shipping

Now classify:
  Message: "The button on the app doesn't work"
  Department:
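A prompt like this can also be generated from data, so adding or changing examples doesn't mean hand-editing strings. A small sketch, where the department labels and helper name are illustrative:

```javascript
// Build a one-shot (or few-shot) classification prompt from example pairs.
function buildClassificationPrompt(examples, message) {
  const shots = examples
    .map(e => `  Message: "${e.message}"\n  Department: ${e.department}`)
    .join('\n\n');
  return [
    'Classify the following customer message by department.',
    '',
    'Example:',
    shots,
    '',
    'Now classify:',
    `  Message: "${message}"`,
    '  Department:', // left open for the model to complete
  ].join('\n');
}

const prompt = buildClassificationPrompt(
  [{ message: "My order hasn't arrived in 2 weeks", department: 'Shipping' }],
  "The button on the app doesn't work"
);
```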

4. Persona and Role Assignment

Assigning a role or persona to the model is one of the most effective techniques. It frames the model's vocabulary, depth, and perspective.

Why personas work

When you say "You are a senior database engineer," the model draws on patterns associated with that expertise level. It uses more technical vocabulary, provides deeper explanations, and makes fewer beginner-friendly simplifications.

// Without persona — generic response
const genericMessages = [
  { role: 'user', content: 'Explain database indexing.' }
];
// Output: Surface-level explanation anyone could write

// With persona — expert-level response
const expertMessages = [
  {
    role: 'system',
    content: 'You are a senior database engineer with 15 years of experience at a Fortune 500 company. You explain concepts with precision, always include performance implications, and reference real-world scenarios from production systems.'
  },
  { role: 'user', content: 'Explain database indexing.' }
];
// Output: Deep, practical explanation with query plans, B-tree details,
//         and production war stories

Common persona patterns

| Persona | Effect |
| --- | --- |
| "You are a senior [X] engineer" | Deep technical detail, production focus |
| "You are a teacher explaining to a 10-year-old" | Simple language, analogies, no jargon |
| "You are a code reviewer at Google" | Critical, detailed, focused on best practices |
| "You are a technical writer" | Clear, structured, well-formatted documentation |
| "You are a security auditor" | Focuses on vulnerabilities, edge cases, risks |
| "You are a helpful customer support agent for [Company]" | Friendly, solution-oriented, brand-appropriate |

Combining persona with constraints

const systemPrompt = `You are a senior JavaScript code reviewer at a 
large tech company. Your reviews are:
- Concise (max 3 issues per review)
- Constructive (suggest fixes, not just problems)
- Prioritized (critical issues first, style issues last)
- Referenced (cite specific line numbers)

Never say "looks good to me" unless there are genuinely zero issues.
Never comment on naming conventions unless they cause confusion.`;

5. Step-by-Step Instruction Format

When the task has multiple parts, break it into numbered steps. This gives the model a clear execution plan and makes the output more predictable.

Why numbered steps work

A long paragraph buries individual requirements, so the model may miss some or merge them incorrectly. Numbered steps turn the same requirements into an explicit checklist the model can follow and you can verify against the output.

BAD (paragraph form):
"Analyze this code snippet. Tell me if there are any bugs, what the 
time complexity is, whether it follows best practices, and suggest 
improvements. Also explain what the code does."

GOOD (numbered steps):
"Analyze this code snippet by completing these steps in order:

1. Explain what the code does in one sentence
2. Identify any bugs (list each bug with the line number)
3. State the time complexity and space complexity
4. List any best practice violations
5. Suggest specific improvements (show the corrected code)"

Implementation in JavaScript

const systemPrompt = `You are a code analysis assistant.

When given a code snippet, follow these steps EXACTLY:

Step 1: SUMMARY
  Write one sentence describing what the code does.

Step 2: BUGS
  List every bug you find. For each bug:
  - Quote the problematic line
  - Explain what's wrong
  - Show the fix
  If no bugs found, write "No bugs found."

Step 3: COMPLEXITY
  State the time and space complexity using Big O notation.

Step 4: IMPROVEMENTS
  Suggest up to 3 improvements. For each:
  - What to change
  - Why it's better
  - Show the improved code

Do NOT skip any step. Do NOT combine steps.`;

import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  temperature: 0, // deterministic output for analysis tasks
  messages: [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: 'Analyze this code:\n\n```js\nfunction findMax(arr) {\n  let max = 0;\n  for (let i = 0; i <= arr.length; i++) {\n    if (arr[i] > max) max = arr[i];\n  }\n  return max;\n}\n```' }
  ],
});

6. Do vs Don't Phrasing

A critical pattern: tell the model what TO do, not just what NOT to do. "Don't" instructions are weaker because they tell the model to avoid something without specifying the alternative.

Why "Don't" is weaker

When you say "Don't use technical jargon," the model knows what to avoid but not what to use instead. It has to guess the alternative. When you say "Use simple language a high school student would understand," the model knows exactly what level to target.

WEAK (Don't phrasing):
  "Don't be verbose."
  "Don't use technical jargon."
  "Don't make up information."
  "Don't format as a paragraph."

STRONG (Do phrasing):
  "Keep each response under 50 words."
  "Use simple language a high school student would understand."
  "If you're unsure about a fact, say 'I'm not certain about this.'"
  "Format your response as a numbered list."

The best approach: Combine Do AND Don't

const systemPrompt = `You are a customer support agent for TechCorp.

DO:
- Answer questions about our products using the provided documentation
- Use a friendly, professional tone
- Include specific page/section references when citing documentation
- Suggest contacting human support for issues you can't resolve

DON'T:
- Make up product features or pricing that isn't in the documentation
- Promise refunds, discounts, or policy exceptions
- Discuss competitor products
- Share internal company information`;

Common "Do" replacements for "Don't" instructions

| Don't (weaker) | Do (stronger) |
| --- | --- |
| Don't be vague | Be specific — include numbers, names, and dates |
| Don't be long | Keep your response under 100 words |
| Don't use jargon | Write at an 8th-grade reading level |
| Don't guess | If uncertain, say "I'm not sure" and explain why |
| Don't hallucinate | Base every claim on the provided documents; cite your source |
| Don't be rude | Maintain a warm, professional tone throughout |
| Don't include extra text | Respond ONLY with the JSON object, no explanation |

7. Output Format Specification

Telling the model what shape the answer should take is just as important as telling it what content to produce. Without format instructions, the model defaults to free-form prose — which is hard to parse programmatically and inconsistent across calls.

// Without format specification
const response = await llm.chat('What are the top 3 JavaScript frameworks?');
// Might return: a paragraph, a numbered list, a table, or a mix

// With format specification
const response = await llm.chat(
  `List the top 3 JavaScript frameworks.
   For each, provide:
   - Name
   - Primary use case (one sentence)
   - GitHub stars (approximate)
   
   Format as a Markdown table with columns: Name | Use Case | Stars`
);
// Returns: a consistent Markdown table every time

Common format specifications

LISTS:
  "Respond with a numbered list. One item per line. No explanations."

JSON:
  "Respond with a JSON object matching this schema:
   { "name": string, "score": number, "tags": string[] }"

TABLES:
  "Format as a Markdown table with columns: Feature | Description | Status"

STRUCTURED SECTIONS:
  "Use these exact headers: ## Summary, ## Details, ## Recommendation"

SINGLE VALUE:
  "Respond with ONLY the number. No text, no explanation, just the number."

8. Before and After: Bad Prompts Made Good

Example 1: Summarization

BEFORE (bad):
  "Summarize this."

AFTER (good):
  "Summarize this article for a weekly engineering newsletter.
   Write exactly 3 bullet points:
   - Bullet 1: The main finding or announcement
   - Bullet 2: Why it matters for web developers
   - Bullet 3: One actionable takeaway
   Each bullet should be one sentence, under 25 words."

Example 2: Code generation

BEFORE (bad):
  "Write a function to validate email."

AFTER (good):
  "Write a JavaScript function called validateEmail that:
   - Takes a single string parameter called email
   - Returns a boolean (true if valid, false if invalid)
   - Uses a regex pattern (not a library)
   - Handles these edge cases: empty string, missing @, missing domain
   - Includes a JSDoc comment with @param and @returns
   - Includes 3 test cases as comments at the bottom
   Do not use any external libraries."
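For reference, one plausible implementation the improved prompt could produce. The regex below is a deliberately simple pattern of the author's choosing, not full RFC 5322 validation:

```javascript
/**
 * Checks whether a string looks like a valid email address.
 * Uses a simple regex, not full RFC 5322 validation.
 * @param {string} email - The string to validate.
 * @returns {boolean} true if valid, false if invalid.
 */
function validateEmail(email) {
  // one or more non-space/non-@ chars, an @, a domain, a dot, a TLD
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
}

// Test cases:
// validateEmail('user@example.com')  -> true
// validateEmail('')                  -> false (empty string)
// validateEmail('userexample.com')   -> false (missing @)
// validateEmail('user@domain')       -> false (missing domain dot)
```

Having a reference implementation like this on hand makes it easy to check whether the model's output actually satisfies each listed edge case.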

Example 3: Data extraction

BEFORE (bad):
  "Get the important info from this receipt."

AFTER (good):
  "Extract the following fields from this receipt text.
   Return a JSON object with exactly these keys:
   {
     "store_name": string,
     "date": string (YYYY-MM-DD format),
     "total": number (in dollars, no currency symbol),
     "items": [{ "name": string, "price": number }]
   }
   If a field cannot be found, use null.
   Do not include any text outside the JSON object."
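On the consuming side, the promised shape lets your code validate the reply before trusting it. A minimal sketch, assuming the schema from the prompt above:

```javascript
// Validate a receipt-extraction reply against the promised schema.
// Per the prompt's contract, missing fields may be null.
function parseReceipt(reply) {
  const data = JSON.parse(reply); // throws if the model added extra text
  const isNullOr = (v, type) => v === null || typeof v === type;
  if (!isNullOr(data.store_name, 'string')) throw new Error('bad store_name');
  if (!isNullOr(data.total, 'number')) throw new Error('bad total');
  if (data.date !== null && !/^\d{4}-\d{2}-\d{2}$/.test(data.date)) {
    throw new Error('bad date');
  }
  if (data.items !== null && !Array.isArray(data.items)) {
    throw new Error('bad items');
  }
  return data;
}

const receipt = parseReceipt(
  '{"store_name":"Corner Mart","date":"2025-03-15","total":12.5,' +
  '"items":[{"name":"Milk","price":3.99}]}'
);
```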

Example 4: Analysis

BEFORE (bad):
  "Is this code good?"

AFTER (good):
  "You are a senior JavaScript developer conducting a code review.
   
   Evaluate this code on these criteria (rate each 1-5):
   1. Readability — Is the code easy to understand?
   2. Performance — Are there any performance concerns?
   3. Error handling — Are edge cases covered?
   4. Security — Are there any vulnerabilities?
   
   For each criterion:
   - Give the rating (1-5)
   - Explain your rating in one sentence
   - If rating < 4, provide a specific fix
   
   End with an overall verdict: APPROVE, REQUEST_CHANGES, or REJECT."

Example 5: Content creation

BEFORE (bad):
  "Write a blog post about React hooks."

AFTER (good):
  "You are a technical blogger writing for intermediate React developers.
   
   Write a blog post titled 'Understanding useEffect: The Mental Model 
   You Need' that follows this structure:
   
   1. Hook (opening paragraph, max 3 sentences) — start with a common
      mistake developers make with useEffect
   2. The wrong mental model (1 paragraph) — explain what people think
      useEffect does vs what it actually does
   3. The correct mental model (2 paragraphs with a code example)
   4. Three practical rules (numbered list with code for each)
   5. Conclusion (2 sentences)
   
   Total length: 600-800 words.
   Tone: conversational but technically precise.
   Include exactly 3 code snippets (JSX).
   Do not cover useLayoutEffect or custom hooks."

9. The System Prompt vs User Prompt Split

In the OpenAI Chat Completions API (and most other LLM APIs), messages have roles: system, user, and assistant. Knowing what goes where is a fundamental prompt engineering skill.

const messages = [
  {
    role: 'system',
    content: `[WHO the model is + HOW it should behave]
    
    You are a data extraction assistant. You:
    - Always respond in valid JSON
    - Never add explanatory text outside the JSON
    - Use null for missing fields
    - Use ISO 8601 for dates`
  },
  {
    role: 'user',
    content: `[WHAT to do with WHAT input]
    
    Extract the event details from this text:
    "Join us for TechConf 2025 on March 15th at the Convention Center. 
     Tickets are $299 for general admission."`
  }
];

What goes where

| System Prompt | User Prompt |
| --- | --- |
| Persona / role | Specific task for this request |
| Global rules and constraints | Input data to process |
| Output format specification | Request-specific instructions |
| Do / Don't rules | Follow-up questions |
| Tone and style guidelines | Context for this specific query |

Key rule: The system prompt is constant across requests. The user prompt changes per request. Put stable instructions in system, variable content in user.
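That split maps naturally onto a small helper: the system prompt is defined once as a constant, and each request supplies only the user content. A sketch, where the `EXTRACTOR_SYSTEM` constant and helper name are illustrative:

```javascript
// Stable instructions live in one constant; only user content varies.
const EXTRACTOR_SYSTEM = `You are a data extraction assistant. You:
- Always respond in valid JSON
- Never add explanatory text outside the JSON
- Use null for missing fields
- Use ISO 8601 for dates`;

function buildMessages(userContent) {
  return [
    { role: 'system', content: EXTRACTOR_SYSTEM },
    { role: 'user', content: userContent },
  ];
}

// Each request reuses the same system prompt with different input:
const msgs = buildMessages('Extract the event details from this text: ...');
```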


10. Iterative Prompt Refinement

Great prompts are not written in one attempt. They are iterated. Here is a practical workflow:

STEP 1: Write a basic prompt
  → Run it 5 times
  → Identify what's wrong with the outputs

STEP 2: Fix the most common problem
  → Is the output too long? Add length constraints.
  → Is the format wrong? Specify the exact format.
  → Is the tone wrong? Add persona/tone instructions.
  → Is it hallucinating? Add "base your answer only on..." 

STEP 3: Run again 5 times
  → Check if the fix worked
  → Identify the next most common problem

STEP 4: Repeat until output quality is consistent

SAVE: Version your prompts (store them in code, not in your head)

Prompt versioning in code

// prompts/extractReceipt.js
export const EXTRACT_RECEIPT_V1 = `Extract data from this receipt.`;

export const EXTRACT_RECEIPT_V2 = `Extract data from this receipt as JSON.
Return: { store, date, total, items }`;

export const EXTRACT_RECEIPT_V3 = `You are a receipt data extraction system.
Extract the following fields from the receipt text.
Return a JSON object with exactly these keys:
{
  "store_name": string,
  "date": string (YYYY-MM-DD),
  "total": number,
  "items": [{ "name": string, "price": number }]
}
If a field cannot be found, use null.
Respond ONLY with the JSON object.`;

// Use the latest version
export const EXTRACT_RECEIPT = EXTRACT_RECEIPT_V3;
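The "run it 5 times" check in the workflow above is easy to automate once you can score a single output. A sketch that measures format consistency across runs; the scoring helper and predicate below are illustrative, and the pass/fail rule is whatever your prompt promises:

```javascript
// Score how often a batch of model outputs satisfies a format check.
// `outputs` would come from running the same prompt several times.
function consistencyScore(outputs, isValid) {
  const passes = outputs.filter(isValid).length;
  return passes / outputs.length;
}

// Example predicate: output must be a bare JSON object, nothing else
const isJsonObject = (text) => {
  try {
    const v = JSON.parse(text);
    return v !== null && typeof v === 'object';
  } catch {
    return false;
  }
};

const runs = ['{"a":1}', 'Sure! {"a":1}', '{"a":2}', 'null', '{"a":3}'];
const score = consistencyScore(runs, isJsonObject);
// A score below 1.0 means the format instructions need tightening.
```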

11. Common Mistakes to Avoid

| Mistake | Why It's a Problem | Fix |
| --- | --- | --- |
| Too vague | Model guesses what you want | Add specificity on all 6 dimensions |
| Too long | Wastes tokens, key instructions get lost in the middle | Front-load important instructions, trim redundancy |
| Contradictory instructions | Model picks one randomly or produces confused output | Review for conflicts; have one person own the prompt |
| No format specification | Output shape changes every call | Always specify format explicitly |
| Only "Don't" rules | Model doesn't know what to do instead | Pair every "Don't" with a "Do" |
| Assuming the model remembers | Each API call is independent (no memory) | Include all context in every call |
| Not testing with edge cases | Prompt works for happy path, breaks for weird inputs | Test with empty input, very long input, non-English, adversarial input |

12. Key Takeaways

  1. Vague prompts produce vague outputs — the model can only be as specific as your instructions.
  2. Six dimensions of a clear prompt: Role, Task, Context, Format, Tone, Boundaries.
  3. Persona assignment frames the model's expertise level and vocabulary.
  4. Numbered steps create a checklist the model follows reliably.
  5. "Do" is stronger than "Don't" — tell the model what to do, not just what to avoid.
  6. Always specify output format — especially when code will parse the response.
  7. Iterate prompts — run 5 times, fix the most common failure, repeat.
  8. Version your prompts in code — treat them as important as any other source code.

Explain-It Challenge

  1. A junior developer's prompt returns inconsistent results every time. Their prompt is: "Help me with my code." Rewrite it using all six dimensions of a clear prompt.
  2. Explain why "Don't use complicated words" is a weaker instruction than "Write at an 8th-grade reading level" — what's the model actually doing differently?
  3. Your team's system prompt is 3,000 tokens long and performance is getting worse, not better. What could be happening and how would you fix it?

Navigation: ← 4.3 Overview · 4.3.b — Few-Shot Examples →