Episode 4 — Generative AI Engineering / 4.1 — How LLMs Actually Work

4.1.d — Hallucination

In one sentence: LLMs hallucinate because they are next-token prediction machines that generate statistically plausible text, not factually verified answers — they don't "know" anything, they predict what text looks like it should come next.

Navigation: ← 4.1.c — Sampling & Temperature · 4.1.e — Deterministic vs Probabilistic →


1. What Is Hallucination?

Hallucination is when an LLM generates text that is factually wrong, fabricated, or not supported by the input — but presents it with the same confidence as correct information. The model doesn't flag uncertainty. It doesn't say "I'm not sure." It states the wrong answer as if it were fact.

User: "Who wrote the book 'The Quantum Garden' published in 2019?"

Model response (hallucinated):
  "The Quantum Garden was written by Derek Künsken, published by 
   Solaris Books in March 2019. It's the second novel in the
   Quantum Evolution series."

Reality check:
  ✓ Derek Künsken — correct author
  ✓ Solaris Books — correct publisher
  ✗ March 2019 — actually October 2019
  ✓ Quantum Evolution series — correct

The model got MOST things right but fabricated the month.
It presented the wrong date with the same confidence as the correct facts.

2. Why Hallucination Happens

The fundamental reason: prediction, not retrieval

An LLM is a statistical text completion engine. It was trained on billions of documents and learned patterns like:

  • "The capital of France is" → "Paris" (correct, because this pattern appeared millions of times)
  • "The inventor of the lightbulb is" → "Thomas Edison" (commonly stated, historically debatable)
  • "The 47th president of the United States is" → ??? (may generate a plausible-sounding but wrong answer)

The model doesn't have a database of facts it looks up. It has patterns of text it learned. When the patterns are strong (extremely common facts), the output is usually correct. When the patterns are weak (rare facts, recent events, niche topics), the model generates what looks right rather than what is right.

WHAT PEOPLE THINK:                    WHAT ACTUALLY HAPPENS:
┌───────────────────┐                 ┌────────────────────────┐
│ User asks question│                 │ User provides tokens   │
│         ↓         │                 │           ↓            │
│ Model looks up    │      VS         │ Model predicts the     │
│ the answer in     │                 │ most likely next       │
│ its knowledge     │                 │ tokens based on        │
│ database          │                 │ statistical patterns   │
│         ↓         │                 │           ↓            │
│ Returns the fact  │                 │ Returns plausible text │
└───────────────────┘                 │ (which MAY be a fact)  │
                                      └────────────────────────┘
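The right-hand side of this contrast can be made concrete with a deliberately tiny, hypothetical sketch: a bigram frequency table standing in for billions of learned parameters. The "model" below always emits the statistically most common continuation it has seen, with no notion of whether that continuation is true — the corpus, words, and counts are all invented for illustration.

```typescript
// Toy "language model": a bigram frequency table built from a tiny corpus.
// Real LLMs learn vastly richer patterns, but the failure mode is the same:
// the output is whatever continuation was most COMMON, not most TRUE.
const corpus = [
  "the capital of france is paris",
  "the capital of france is paris",
  "the capital of france is nice", // a wrong claim that appeared once
];

// Count how often each word follows each context word.
const counts = new Map<string, Map<string, number>>();
for (const doc of corpus) {
  const words = doc.split(" ");
  for (let i = 0; i < words.length - 1; i++) {
    const next = counts.get(words[i]) ?? new Map<string, number>();
    next.set(words[i + 1], (next.get(words[i + 1]) ?? 0) + 1);
    counts.set(words[i], next);
  }
}

// "Generation" = pick the most frequent continuation. No lookup, no truth check.
function predictNext(word: string): string | undefined {
  const next = counts.get(word);
  if (!next) return undefined;
  return [...next.entries()].sort((a, b) => b[1] - a[1])[0][0];
}

console.log(predictNext("is")); // "paris" — correct only because it was frequent
```

If the wrong claim had appeared three times instead of once, `predictNext("is")` would confidently return "nice" — same mechanism, different statistics.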

Specific causes of hallucination

  • Training data gaps: the fact wasn't in the training data, or was rare.
    Example: asking about events after the training cutoff.
  • Conflicting sources: training data contains contradictory information.
    Example: different dates for the same historical event.
  • Pattern completion: the model fills in plausible-sounding details.
    Example: generating a fake but realistic-looking citation.
  • Overconfidence: models are trained to be helpful, not uncertain.
    Example: never saying "I don't know."
  • Instruction following: if you ask "what is X?", the model generates an answer even when it shouldn't.
    Example: "What color is the president's dog?" (the dog may not exist).
  • Context confusion: in long prompts, the model mixes up details from different parts.
    Example: attributing one person's quote to another.

3. Types of Hallucination

Factual hallucination

The model states something verifiably false about the real world:

"Albert Einstein won the Nobel Prize in Physics in 1921 for 
his work on the theory of relativity."

Reality: Einstein won in 1921, but for the photoelectric effect, 
NOT relativity. The model mixed up two related facts.

Fabricated citations

The model invents sources that don't exist:

"According to Smith et al. (2019) in the Journal of Machine Learning
Research, transformer models achieve 97.3% accuracy on the GLUE benchmark."

Reality: This paper, these authors, and this specific finding are 
COMPLETELY MADE UP. The journal exists, but the citation is fictional.

Intrinsic hallucination (contradicts the input)

The model contradicts information you provided in the prompt:

Prompt: "The meeting is scheduled for Tuesday at 3pm."
Model:  "I've noted that your meeting is on Wednesday at 3pm."

The model changed "Tuesday" to "Wednesday" despite the input being clear.

Extrinsic hallucination (adds unsupported claims)

The model adds information not present in the source material:

Prompt: "Summarize this article: [article about climate change in Arctic]"
Model:  "The article discusses Arctic ice loss and notes that 
         Antarctica is experiencing similar patterns."

The article said nothing about Antarctica. The model added
a plausible but unsupported claim.
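Extrinsic hallucinations can sometimes be caught mechanically. The sketch below is a crude grounding check, not a production technique: it flags capitalized terms in a summary that never appear in the source text. (The skip-list, regexes, and example strings are all illustrative; real systems use named-entity recognition and entailment models.)

```typescript
// Crude extrinsic-hallucination check: flag capitalized terms in a summary
// that never appear in the source. Even this catches the "Antarctica" case.
function unsupportedTerms(source: string, summary: string): string[] {
  const sourceWords = new Set(source.toLowerCase().match(/[a-z]+/g) ?? []);
  // Common sentence starters, not entities (illustrative, not exhaustive)
  const skip = new Set(["The", "A", "An", "It", "This", "That"]);
  const candidates = summary.match(/\b[A-Z][a-z]+\b/g) ?? [];
  return [...new Set(candidates)].filter(
    (t) => !skip.has(t) && !sourceWords.has(t.toLowerCase())
  );
}

const source = "Arctic sea ice declined sharply this decade.";
const summary =
  "The article discusses Arctic ice loss and notes that Antarctica shows similar patterns.";
console.log(unsupportedTerms(source, summary)); // [ "Antarctica" ]
```

"Arctic" passes because it appears in the source; "Antarctica" is flagged because it was added by the model.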

4. Why Hallucination Is Dangerous in Production

┌──────────────────────────────────────────────────────────────┐
│                 HALLUCINATION RISK BY DOMAIN                 │
│                                                              │
│  LOW RISK                              HIGH RISK             │
│  ├── Creative writing                  ├── Medical advice    │
│  ├── Brainstorming                     ├── Legal documents   │
│  ├── Marketing copy                    ├── Financial data    │
│  └── Casual conversation               ├── Code generation   │
│                                        ├── Academic citations│
│  (Wrong is "creative")                 └── Safety-critical   │
│                                        (Wrong is dangerous)  │
└──────────────────────────────────────────────────────────────┘

Real-world incidents:

  • Lawyers cited fake cases: In 2023, a lawyer used ChatGPT to research case law. The model generated citations to cases that didn't exist. The lawyer submitted them to court and was sanctioned by the judge.
  • Medical misinformation: Chatbots have generated medically dangerous advice presented as factual.
  • Code hallucination: Models generate function calls to APIs that don't exist, or use library methods with wrong signatures.

5. Strategies to Reduce Hallucination

Strategy 1: Use RAG (Retrieval-Augmented Generation)

Instead of relying on the model's training data, retrieve relevant documents and include them in the prompt. The model generates answers based on the provided text, not its memory.

// Without RAG: model relies on training data (may hallucinate)
const response = await llm.chat("What is our company's refund policy?");

// With RAG: model answers based on actual documents
const relevantDocs = await vectorDB.search("refund policy");
const response = await llm.chat(
  `Based ONLY on the following documents, answer the question.
   If the answer is not in the documents, say "I don't know."
   
   Documents:
   ${relevantDocs.map(d => d.content).join('\n\n')}
   
   Question: What is our company's refund policy?`
);

Strategy 2: Temperature 0 for factual tasks

// Lower temperature = more deterministic = less creative hallucination
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  temperature: 0,  // Pick the highest-probability token every time
  messages: [{ role: 'user', content: 'Extract the date from: ...' }],
});

Strategy 3: Instruct the model to say "I don't know"

System prompt:
"You are a helpful assistant. If you are not certain about a fact,
say 'I'm not sure about this' rather than guessing. Never fabricate
citations, statistics, or specific dates unless you are highly confident."

Strategy 4: Ask for sources and verify

User: "What were Apple's Q3 2024 revenue figures?"
System: "Provide your answer and cite the specific source. 
         If you cannot cite a real source, say so."

Strategy 5: Multi-step verification

// Step 1: Generate the answer
const answer = await llm.chat("What year was Python created?");

// Step 2: Ask the model to verify its own answer
const verification = await llm.chat(
  `You said: "${answer}".
   Rate your confidence (1-10) and explain what you're uncertain about.`
);

// Step 3: Parse the self-reported score; if confidence is low, flag for
// human review. (flagForHumanReview is whatever escalation path your
// application provides. Note that self-reported confidence is a heuristic,
// not a guarantee.)
const score = Number(verification.match(/\b([1-9]|10)\b/)?.[1] ?? 0);
if (score < 7) {
  await flagForHumanReview(answer, verification);
}
Strategy 6: Constrain output format

Structured outputs (JSON with specific fields) reduce hallucination because the model must fill specific slots rather than free-form narration.
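One way to apply this, sketched under assumptions: the model has been asked to return JSON with exactly the fields `day` and `time` (both names are illustrative), and the application validates the shape and values before trusting anything. Many provider APIs can also enforce a JSON schema server-side.

```typescript
// Constrain the model to fill specific slots, then validate before trusting.
// Field names here are illustrative; use whatever your task requires.
interface ExtractedMeeting {
  day: string;
  time: string;
}

const ALLOWED_DAYS = new Set([
  "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday",
]);

function parseMeeting(raw: string): ExtractedMeeting | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // model didn't return valid JSON at all
  }
  const obj = data as Record<string, unknown>;
  if (typeof obj?.day !== "string" || typeof obj?.time !== "string") return null;
  if (!ALLOWED_DAYS.has(obj.day)) return null; // reject invented values
  return { day: obj.day, time: obj.time };
}

console.log(parseMeeting('{"day": "Tuesday", "time": "3pm"}')); // accepted
console.log(parseMeeting('{"day": "Someday", "time": "3pm"}')); // null — rejected
```

The key design choice: anything that fails validation is rejected outright rather than "repaired", so a hallucinated value can never silently flow downstream.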


6. Hallucination vs Creativity

An important nuance: hallucination is the same mechanism as creativity. When you want the model to write a poem, you WANT it to generate novel text that isn't a factual recounting. "Hallucination" is just what we call it when factual accuracy was expected but the model generated plausible fiction instead.

CREATIVE TASK:    "Write a story about a robot"  → Novel text = GOOD
FACTUAL TASK:     "What is Python's GIL?"        → Novel text = BAD (hallucination)
EXTRACTION TASK:  "Extract names from this text"  → Novel text = BAD (hallucination)

The skill of AI engineering is knowing which tasks benefit from creativity and which demand grounding.
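One way this judgment shows up in code — as an illustrative convention, not a fixed rule — is choosing sampling settings per task type:

```typescript
// Illustrative convention (not a fixed rule): match the temperature
// to how much novelty the task should tolerate.
type TaskKind = "creative" | "factual" | "extraction";

function temperatureFor(task: TaskKind): number {
  switch (task) {
    case "creative":
      return 0.9; // novelty is the point
    case "factual":
      return 0.2; // stay close to the highest-probability tokens
    case "extraction":
      return 0;   // deterministic slot-filling; novelty = hallucination
  }
}

console.log(temperatureFor("extraction")); // 0
```

The exact numbers are assumptions; the point is that the decision is made per task, not set once globally.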


7. Key Takeaways

  1. Hallucination is inherent to how LLMs work — they predict plausible text, not verified facts.
  2. Models present wrong information with the same confidence as correct information.
  3. RAG is the primary strategy to ground model outputs in real data.
  4. Temperature 0 reduces creative hallucination but doesn't eliminate factual errors.
  5. Never trust LLM output for facts without verification, especially in high-stakes domains.
  6. The same mechanism that causes hallucination also enables creativity — it's about matching the right task to the right settings.

Explain-It Challenge

  1. A colleague says "the AI lied to me." Explain why "lying" is the wrong framing and what actually happened.
  2. You're building a customer support chatbot. How would you minimize hallucination about company policies?
  3. Why can't you solve hallucination by simply "training the model on more data"?

Navigation: ← 4.1.c — Sampling & Temperature · 4.1.e — Deterministic vs Probabilistic →