Semantic Consistency-Based Uncertainty Quantification for Factuality in Radiology Report Generation
Journal:
arXiv
Published Date:
Dec 5, 2024
Abstract
Radiology report generation (RRG) has shown great potential in assisting
radiologists by automating the labor-intensive task of report writing. While
recent advancements have improved the quality and coherence of generated
reports, ensuring their factual correctness remains a critical challenge.
Although generative medical Vision Large Language Models (VLLMs) have been
proposed to address this issue, these models are prone to hallucinations and
can produce inaccurate diagnostic information. To address these concerns, we
introduce a novel Semantic Consistency-Based Uncertainty Quantification
framework that provides both report-level and sentence-level uncertainties.
Unlike existing approaches, our method does not require modifications to the
underlying model or access to its inner state, such as output token logits,
thus serving as a plug-and-play module that can be seamlessly integrated with
state-of-the-art models. Extensive experiments demonstrate the efficacy of our
method in detecting hallucinations and enhancing the factual accuracy of
automatically generated radiology reports. Abstaining on the 20% of reports
with the highest uncertainty improves factuality scores by 10% for the
Radialog model on the MIMIC-CXR dataset. Furthermore, sentence-level
uncertainty flags the lowest-precision sentence in each report with an 82.9%
success rate. Our implementation is
open-source and available at https://github.com/BU-DEPEND-Lab/SCUQ-RRG.
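
The abstract does not specify how semantic consistency is scored, so the following is only a minimal sketch of the general black-box idea: sample several reports for the same image, measure how well they agree with the primary report, treat low agreement as high uncertainty, and abstain on the most uncertain fraction. The function names (`jaccard_similarity`, `report_uncertainty`, `abstain_mask`) and the token-overlap similarity are assumptions made for illustration and are not taken from the paper or its repository; a factuality-aware report similarity metric would presumably take the place of the stand-in.

```python
from typing import Callable, List, Sequence


def jaccard_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in similarity: token-set overlap between two reports."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty reports are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def report_uncertainty(
    reference: str,
    samples: Sequence[str],
    similarity: Callable[[str, str], float] = jaccard_similarity,
) -> float:
    """Uncertainty as 1 minus the mean similarity between the primary report
    and additional reports sampled from the same model for the same image.
    Only generated text is used; no logits or model internals are needed."""
    if not samples:
        raise ValueError("at least one sampled report is required")
    mean_similarity = sum(similarity(reference, s) for s in samples) / len(samples)
    return 1.0 - mean_similarity


def abstain_mask(uncertainties: Sequence[float], reject_fraction: float) -> List[bool]:
    """Mark the most uncertain fraction of reports for abstention (True = reject)."""
    n_reject = int(round(reject_fraction * len(uncertainties)))
    # Indices ordered from most to least uncertain.
    order = sorted(range(len(uncertainties)), key=lambda i: uncertainties[i], reverse=True)
    rejected = set(order[:n_reject])
    return [i in rejected for i in range(len(uncertainties))]


if __name__ == "__main__":
    primary = "no pleural effusion"
    sampled = ["no pleural effusion", "large pleural effusion"]
    print(report_uncertainty(primary, sampled))
    print(abstain_mask([0.1, 0.9, 0.3, 0.2, 0.5], reject_fraction=0.2))
```

Because the scoring function only consumes generated text, the same sketch would apply to any report generator that can be sampled more than once, which is the plug-and-play property the abstract describes. Token overlap is a weak proxy for clinical meaning (it scores "no pleural effusion" and "large pleural effusion" as fairly similar), which is why the `similarity` argument is left replaceable.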