Holmes: Automated Fact Check with Large Language Models
Journal:
arXiv
Published Date:
May 6, 2025
Abstract
The rise of Internet connectivity has accelerated the spread of
disinformation, threatening societal trust, decision-making, and national
security. Disinformation has evolved from simple text to complex multimodal
forms combining images and text, challenging existing detection methods.
Traditional deep learning models struggle to capture the complexity of
multimodal disinformation. Inspired by advances in AI, this study explores
using Large Language Models (LLMs) for automated disinformation detection. The
empirical study shows that (1) LLMs alone cannot reliably assess the
truthfulness of claims; (2) providing relevant evidence significantly improves
their performance; (3) however, LLMs cannot autonomously search for accurate
evidence. To address this, we propose Holmes, an end-to-end framework featuring
a novel evidence retrieval method that assists LLMs in collecting high-quality
evidence. Our approach uses (1) LLM-powered summarization to extract key
information from open sources and (2) a new algorithm and metrics to evaluate
evidence quality. Holmes enables LLMs to verify claims and generate
justifications effectively. Experiments show Holmes achieves 88.3% accuracy on
two open-source datasets and 90.2% in real-time verification tasks. Notably,
our improved evidence retrieval boosts fact-checking accuracy by 30.8% over
existing methods