Hallucination detection methods in LLMs : a systematic literature review
Bestetti, Elisa (2025)
Master's Programme in Computer Science
Faculty of Information Technology and Communication Sciences
Date of approval
2025-12-21
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tuni-2025121211556
Abstract
Large Language Models (LLMs) are increasingly used across diverse applications, from healthcare and education to legal services and journalism. As their use expands into domains where factual accuracy is critical, detecting and mitigating hallucinations (instances where LLMs generate plausible but incorrect or unfounded information) has become essential. This thesis conducts a systematic literature review of hallucination detection methods in LLMs, restricting the scope to methods that cannot rely on task-specific grounding documents, such as those available in RAG, summarization or translation.
The review synthesizes 50 peer-reviewed studies published between 2023 and 2025, classifying detection strategies according to hallucination type (factuality or faithfulness) and technical approach (white-box or black-box). Findings reveal a predominance of factuality-focused methods and black-box techniques, reflecting practical constraints in accessing proprietary model internals. Prominent approaches include LLM-as-a-judge, knowledge graph techniques and fact-checking with external knowledge. Evaluation practices remain heterogeneous, relying on diverse benchmarks (most commonly HaluEval and SelfCheckGPT) and metrics, which complicates comparison between methods.
