SimMIL: A Universal Weakly Supervised Pre-Training Framework for Multi-Instance Learning in Whole Slide Pathology Images

Journal: arXiv

Published Date: May 10, 2025

Abstract

Various multi-instance learning (MIL) based approaches have been developed and successfully applied to whole-slide pathological images (WSI). Existing MIL methods emphasize the importance of feature aggregators but largely neglect instance-level representation learning. They assume that a pre-trained feature extractor is available and can be used directly or fine-tuned, which is not always the case. This paper proposes to pre-train the feature extractor for MIL via a weakly supervised scheme, i.e., propagating the weak bag-level labels to the corresponding instances for supervised learning. To learn effective features for MIL, we further delve into several key components, including strong data augmentation, a non-linear prediction head, and a robust loss function. We conduct experiments on common large-scale WSI datasets and find that the scheme achieves better performance than other pre-training schemes (e.g., ImageNet pre-training and self-supervised learning) on different downstream tasks. We further show the compatibility and scalability of the proposed scheme by using it to fine-tune pathology-specific models and to pre-train on multiple merged datasets. To our knowledge, this is the first work focusing on representation learning for MIL.
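
The label-propagation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy slide and patch identifiers, and the choice of loss are all assumptions made for illustration. In particular, the abstract does not say which robust loss is used; the generalized cross-entropy below is a stand-in, picked because it is bounded and therefore less sensitive to the noisy instance labels that propagation produces.

```python
import math
from typing import Dict, List, Tuple


def propagate_bag_labels(
    bags: Dict[str, List[str]], bag_labels: Dict[str, int]
) -> List[Tuple[str, int]]:
    """Give every instance (patch) the label of the bag (slide) it belongs to."""
    pairs = []
    for bag_id, instances in bags.items():
        label = bag_labels[bag_id]
        for instance in instances:
            pairs.append((instance, label))
    return pairs


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generalized_cross_entropy(logits: List[float], label: int, q: float = 0.7) -> float:
    """Stand-in robust loss (an assumption, not taken from the paper).

    Computes (1 - p_label ** q) / q, which is bounded above by 1 / q,
    so a confidently wrong prediction on a mislabeled instance cannot
    dominate the total loss the way plain cross-entropy can.
    """
    p = softmax(logits)[label]
    return (1.0 - p ** q) / q


# Hypothetical toy data: two slides (bags) with patch identifiers (instances).
bags = {"slide_a": ["a_0", "a_1"], "slide_b": ["b_0"]}
bag_labels = {"slide_a": 1, "slide_b": 0}

# Each (patch, label) pair would feed an ordinary supervised training loop.
instance_pairs = propagate_bag_labels(bags, bag_labels)
```

In a full pipeline, each propagated pair would pass through strong data augmentation, the feature extractor, and a non-linear prediction head before the loss is computed; those pieces are omitted here.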

Authors

Yicheng Song
Tiancheng Lin
Die Peng
Su Yang
Yi Xu

External Resources

View on arXiv (http://arxiv.org/abs/2505.06710v1)
