TruthLens: Explainable DeepFake Detection for Face Manipulated and Fully Synthetic Data
Journal: arXiv
Published Date: Mar 20, 2025
Abstract
Detecting DeepFakes has become a crucial research area as the widespread use
of AI image generators enables the effortless creation of face-manipulated and
fully synthetic content. Yet existing methods are often limited to binary
classification (real vs. fake) and lack interpretability. To address these
challenges, we propose TruthLens, a novel and highly generalizable framework
for DeepFake detection that not only determines whether an image is real or
fake but also provides detailed textual reasoning for its predictions. Unlike
traditional methods, TruthLens effectively handles both face-manipulated
DeepFakes and fully AI-generated content while addressing fine-grained queries
such as "Do the eyes/nose/mouth look real or fake?"
The architecture of TruthLens combines the global contextual understanding of
multimodal large language models like PaliGemma2 with the localized feature
extraction capabilities of vision-only models like DINOv2. This hybrid design
leverages the complementary strengths of both models, enabling robust detection
of subtle manipulations while maintaining interpretability. Extensive
experiments on diverse datasets demonstrate that TruthLens outperforms
state-of-the-art methods in detection accuracy (by 2-14%) and explainability,
in both in-domain and cross-data settings, generalizing effectively across
traditional and emerging manipulation techniques.
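
As a rough illustration of the hybrid design described above, the sketch below shows one plausible way to combine global tokens from a multimodal model with local patch features from a vision-only model: project the local features to the global token width, then concatenate along the token axis. This is an assumption made for illustration, not the fusion scheme confirmed by the abstract. The function names (`linear_project`, `fuse_features`), the token counts, the feature widths, and the random stand-in features are all hypothetical; no real PaliGemma2 or DINOv2 weights are loaded.

```python
import random


def linear_project(tokens, weight):
    """Map each token of length d_in to length d_out with a d_in x d_out matrix."""
    d_out = len(weight[0])
    return [
        [sum(tok[i] * weight[i][j] for i in range(len(tok))) for j in range(d_out)]
        for tok in tokens
    ]


def fuse_features(global_tokens, local_tokens, weight):
    """Project local tokens to the global width, then concatenate along the token axis.

    Assumed fusion rule for illustration only; other schemes (interleaving,
    cross-attention, channel-wise concatenation) are equally plausible.
    """
    projected = linear_project(local_tokens, weight)
    return global_tokens + projected


rng = random.Random(0)

# Hypothetical sizes, chosen small so the example runs instantly.
D_GLOBAL, D_LOCAL = 8, 6   # feature widths of the two encoders
N_GLOBAL, N_LOCAL = 4, 9   # number of tokens each encoder emits

# Random stand-ins for encoder outputs (no real model is called here).
global_tokens = [[rng.gauss(0, 1) for _ in range(D_GLOBAL)] for _ in range(N_GLOBAL)]
local_tokens = [[rng.gauss(0, 1) for _ in range(D_LOCAL)] for _ in range(N_LOCAL)]

# Stand-in for a learned projection from the local width to the global width.
weight = [[rng.gauss(0, 0.1) for _ in range(D_GLOBAL)] for _ in range(D_LOCAL)]

fused = fuse_features(global_tokens, local_tokens, weight)
print(len(fused), len(fused[0]))  # prints "13 8"
```

Under this assumed scheme, the language model would attend over 13 tokens of width 8: the 4 global tokens unchanged, followed by the 9 projected local tokens. Concatenating along the token axis keeps both feature sets intact, at the cost of a longer input sequence.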