Abn-BLIP: Abnormality-aligned Bootstrapping Language-Image Pre-training for Pulmonary Embolism Diagnosis and Report Generation from CTPA
Journal: arXiv
Published Date: Mar 3, 2025
Abstract
Medical imaging plays a pivotal role in modern healthcare, with computed
tomography pulmonary angiography (CTPA) being a critical tool for diagnosing
pulmonary embolism and other thoracic conditions. However, the complexity of
interpreting CTPA scans and generating accurate radiology reports remains a
significant challenge. This paper introduces Abn-BLIP (Abnormality-aligned
Bootstrapping Language-Image Pre-training), a diagnosis model designed
to align abnormal findings so as to improve the accuracy and comprehensiveness of
radiology reports. By leveraging learnable queries and cross-modal attention
mechanisms, our model demonstrates superior performance in detecting
abnormalities, reducing missed findings, and generating structured reports
compared to existing methods. Our experiments show that Abn-BLIP outperforms
state-of-the-art medical vision-language models and 3D report generation
methods in both accuracy and clinical relevance. These results highlight the
potential of integrating multimodal learning strategies for improving radiology
reporting. The source code is available at https://github.com/zzs95/abn-blip.