Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis
Journal: arXiv
Published Date: Jun 5, 2025
Abstract
Pathology foundation models (PFMs) have emerged as powerful tools for
analyzing whole slide images (WSIs). However, adapting these pretrained PFMs
for specific clinical tasks presents considerable challenges, primarily due to
the availability of only weak (WSI-level) labels for gigapixel images,
necessitating a multiple instance learning (MIL) paradigm for effective WSI
analysis. This paper proposes a novel approach for single-GPU Task
Adaptation of PFMs (TAPFM) that uses vision transformer
(ViT) attention for MIL aggregation while optimizing both feature
representations and attention weights. The proposed approach maintains separate
computational graphs for the MIL aggregator and the PFM to create stable training
dynamics that align with downstream task objectives during end-to-end
adaptation. Evaluated on mutation prediction tasks for bladder cancer and lung
adenocarcinoma across institutional and TCGA cohorts, TAPFM consistently
outperforms conventional approaches, with H-Optimus-0 (TAPFM) outperforming the
benchmarks. TAPFM effectively handles multi-label classification of actionable
mutations as well. Thus, TAPFM makes adaptation of powerful pretrained PFMs
practical on standard hardware for various clinical applications.
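
To make the attention-based MIL aggregation described above concrete, the following is a minimal sketch, not the authors' implementation. It assumes each tile of a WSI has already been reduced to a feature vector and a scalar attention score; in TAPFM these would come from the PFM's ViT attention, whereas here they are hypothetical hand-written inputs. The helper names (`attention_mil_pool`, `slide_probability`) and the logistic classifier head are illustrative assumptions, and the sketch covers only the forward aggregation step, not the separate computational graphs used during training.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(tile_features, attention_scores):
    """Aggregate tile features into one slide-level feature vector.

    tile_features: list of equal-length feature vectors, one per tile
        (assumed to come from a pretrained encoder).
    attention_scores: one unnormalized score per tile (hypothetical input).
    Returns the attention-weighted sum and the normalized weights.
    """
    if len(tile_features) != len(attention_scores):
        raise ValueError("need exactly one attention score per tile")
    weights = softmax(attention_scores)
    dim = len(tile_features[0])
    slide_feature = [0.0] * dim
    for w, feat in zip(weights, tile_features):
        for j in range(dim):
            slide_feature[j] += w * feat[j]
    return slide_feature, weights


def slide_probability(slide_feature, clf_weights, clf_bias):
    """Hypothetical logistic head mapping the slide feature to a probability."""
    logit = sum(w * x for w, x in zip(clf_weights, slide_feature)) + clf_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy example: three tiles with 2-dimensional features and equal scores.
tiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 0.0]
slide, weights = attention_mil_pool(tiles, scores)
prob = slide_probability(slide, clf_weights=[0.0, 0.0], clf_bias=0.0)
```

Because only the slide-level probability is compared against the WSI-level label, the tile-level weights are learned indirectly, which is the sense in which the labels are weak. For the multi-label setting mentioned above, one such head per mutation could be applied to the same slide feature; that extension is likewise an assumption of this sketch.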