Federated Foundation Model for GI Endoscopy Images
Journal: arXiv
Published Date: May 30, 2025
Abstract
Gastrointestinal (GI) endoscopy is essential in identifying GI tract
abnormalities in order to detect diseases in their early stages and improve
patient outcomes. Although deep learning has shown success in supporting GI
diagnostics and decision-making, these models require curated datasets with
labels that are expensive to acquire. Foundation models offer a promising
solution by learning general-purpose representations that can be fine-tuned
for specific tasks, overcoming data scarcity. Developing foundation models for
medical imaging holds significant potential, but the sensitive and protected
nature of medical data presents unique challenges. Foundation model training
typically requires extensive datasets, and while hospitals generate large
volumes of data, privacy restrictions prevent direct data sharing, making
foundation model training infeasible in most scenarios. In this work, we
propose a federated learning (FL) framework for training foundation models for
gastroendoscopy imaging, enabling data to remain within local hospital environments while
contributing to a shared model. We explore several established FL algorithms,
assessing their suitability for training foundation models without relying on
task-specific labels, conducting experiments in both homogeneous and
heterogeneous settings. We evaluate the trained foundation model on three
critical downstream tasks (classification, detection, and segmentation) and
demonstrate that it achieves improved performance across all tasks,
highlighting the effectiveness of our approach in a federated,
privacy-preserving setting.
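
To make the idea of "contributing to a shared model" concrete, the sketch below shows the server-side aggregation step of federated averaging (FedAvg), a widely used FL algorithm. The abstract does not name which algorithms were evaluated, so FedAvg is an assumption here, and the function name `fedavg`, the parameter name `encoder.w`, and the hospital sample counts are all hypothetical. Each hospital trains locally and sends only model parameters; the server averages them weighted by local dataset size.

```python
from typing import Dict, List

# A model is represented as a mapping from parameter name to a flat list of
# floats. This is a simplification for illustration, not the paper's format.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Return the sample-size-weighted average of the clients' parameters."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Average each scalar parameter across clients, weighted by data size.
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated


# Hypothetical example: two hospitals with 100 and 300 local images.
hospital_a = {"encoder.w": [1.0, 2.0]}
hospital_b = {"encoder.w": [3.0, 6.0]}
global_weights = fedavg([hospital_a, hospital_b], [100, 300])
```

In this sketch only parameters leave each hospital, never images. Because the aggregation step is independent of the local training objective, it does not require task-specific labels, so it can be paired with a label-free pretraining loss on each client.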