Chest X-ray Foundation Model with Global and Local Representations Integration
Journal: arXiv
Published Date: Feb 7, 2025
Abstract
Chest X-ray (CXR) is the most frequently ordered imaging test, supporting
diverse clinical tasks from thoracic disease detection to postoperative
monitoring. However, task-specific classification models are limited in scope,
require costly labeled data, and lack generalizability to out-of-distribution
datasets. To address these challenges, we introduce CheXFound, a
self-supervised vision foundation model that learns robust CXR representations
and generalizes effectively across a wide range of downstream tasks. We
pretrain CheXFound on a curated CXR-1M dataset, comprising over one million
unique CXRs from publicly available sources. For downstream adaptation, we
propose a Global and Local Representations Integration (GLoRI) module that
combines disease-specific local features with global image features to
improve multilabel classification performance. Our experimental results
show that CheXFound outperforms state-of-the-art models in classifying 40
disease findings across different prevalence levels on the CXR-LT 24 dataset
and exhibits superior label efficiency on downstream tasks with limited
training data. Additionally, CheXFound achieves significant improvements on new
tasks with out-of-distribution datasets, including opportunistic cardiovascular
disease risk estimation and mortality prediction. These results highlight
CheXFound's strong generalization capabilities, enabling diverse adaptations
with improved label efficiency. The project source code is publicly available
at https://github.com/RPIDIAL/CheXFound.
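
To make the GLoRI idea concrete, the following is a minimal NumPy sketch of one plausible reading of "incorporating disease-specific local features with global image features": each finding gets its own query vector that attends over the patch tokens to pool a local feature, which is then concatenated with a shared global feature and scored by a per-finding linear classifier. The function name `glori_like_head`, the single-head dot-product attention, the concatenation-based fusion, and all shapes and random inputs are assumptions made for illustration. They are not taken from the abstract and may differ from the released CheXFound code.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def glori_like_head(global_feat, patch_tokens, queries, weights, bias):
    """Hypothetical global + local fusion head for multilabel classification.

    global_feat:  (D,)    global image feature (e.g. a class token)
    patch_tokens: (N, D)  local patch features from a frozen encoder
    queries:      (K, D)  one query vector per disease finding (assumed learnable)
    weights:      (K, 2D) per-finding linear classifier weights
    bias:         (K,)    per-finding classifier bias
    Returns per-finding probabilities (K,) and attention maps (K, N).
    """
    d = patch_tokens.shape[1]
    # Each finding attends over all patches (scaled dot-product attention).
    attn = softmax(queries @ patch_tokens.T / np.sqrt(d), axis=-1)  # (K, N)
    # Disease-specific local feature: attention-weighted sum of patch tokens.
    local = attn @ patch_tokens  # (K, D)
    # Fuse by concatenating the shared global feature with each local feature.
    global_rep = np.broadcast_to(global_feat, local.shape)  # (K, D)
    fused = np.concatenate([global_rep, local], axis=1)  # (K, 2D)
    # Independent sigmoid per finding, since labels are not mutually exclusive.
    logits = np.sum(fused * weights, axis=1) + bias  # (K,)
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs, attn


# Toy usage with random stand-ins for encoder outputs and head parameters.
rng = np.random.default_rng(0)
D, N, K = 16, 49, 40  # feature dim, number of patches, number of findings
global_feat = rng.normal(size=D)
patch_tokens = rng.normal(size=(N, D))
queries = rng.normal(size=(K, D))
weights = 0.1 * rng.normal(size=(K, 2 * D))
bias = np.zeros(K)

probs, attn = glori_like_head(global_feat, patch_tokens, queries, weights, bias)
print(probs.shape, attn.shape)
```

In this sketch the attention maps in `attn` indicate which patches contributed to each finding's local feature, and the sigmoid outputs are independent per finding, which matches the multilabel setting described in the abstract.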