WeakSupCon: Weakly Supervised Contrastive Learning for Encoder Pre-training
Journal: arXiv
Published Date: Mar 6, 2025
Abstract
Weakly supervised multiple instance learning (MIL) is a challenging task
given that only bag-level labels are provided, while each bag typically
contains multiple instances. This topic has been extensively studied in
histopathological image analysis, where labels are usually available only at
the whole slide image (WSI) level, while each whole slide image can be divided
into thousands of small image patches for training. The dominant MIL approaches
take fixed patch features as inputs to address computational constraints and
ensure model stability. These features are commonly generated by encoders
pre-trained on ImageNet, by foundation encoders pre-trained on large datasets,
or by encoders trained with self-supervised learning on local datasets. While the self-supervised
encoder pre-training on the same dataset as the downstream MIL tasks helps mitigate
domain shift and generate better features, the bag-level labels are not
used during this process, so the features of patches from different
categories may cluster together, reducing classification performance on MIL
tasks. Recently, pre-training with supervised contrastive learning (SupCon) has
demonstrated superior performance compared to self-supervised contrastive
learning and even end-to-end training on traditional image classification
tasks. In this paper, we propose a novel encoder pre-training method for
downstream MIL tasks called Weakly Supervised Contrastive Learning (WeakSupCon)
that utilizes bag-level labels. In our method, we employ multi-task learning
and define distinct contrastive learning losses for samples with different bag
labels. Our experiments demonstrate that the features generated using
WeakSupCon significantly enhance MIL classification performance compared to
self-supervised approaches across three datasets.
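
The abstract does not spell out the loss definitions, so the following is only a minimal sketch of one plausible reading of "distinct contrastive learning losses for samples with different bag labels". It assumes the standard binary MIL setting, in which every patch from a negative bag is negative while patches from a positive bag have unknown labels. Under that assumption, negative-bag patches can all be treated as mutual positives (SupCon-style), and positive-bag patches can fall back to a self-supervised objective in which only another augmented view of the same patch counts as a positive. The function names, the `patch_ids` argument, the `pos_weight` factor, and the choice to contrast only within each group are hypothetical and not taken from the paper. The code is dependency-free Python for readability rather than speed.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _normalize(v):
    # Assumes a non-zero vector.
    norm = math.sqrt(_dot(v, v))
    return [a / norm for a in v]


def contrastive_loss(embeddings, positives, temperature=0.1):
    """Generic InfoNCE/SupCon-style loss over L2-normalized embeddings.

    positives[i] is the set of indices treated as positives for anchor i.
    Anchors without any positive are skipped.
    """
    z = [_normalize(e) for e in embeddings]
    n = len(z)
    total, count = 0.0, 0
    for i in range(n):
        if not positives[i]:
            continue
        # Scaled cosine similarity between anchor i and every other sample.
        logits = {j: _dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        # Average negative log-probability over the anchor's positives.
        total += -sum(logits[j] - log_denom for j in positives[i]) / len(positives[i])
        count += 1
    return total / count if count else 0.0


def weak_sup_con_sketch(embeddings, patch_ids, bag_labels,
                        temperature=0.1, pos_weight=1.0):
    """Hypothetical two-task loss driven by bag-level labels (0 or 1).

    embeddings[i] is one augmented view of patch patch_ids[i], taken from
    a bag whose label is bag_labels[i].
    """
    neg = [i for i, y in enumerate(bag_labels) if y == 0]
    pos = [i for i, y in enumerate(bag_labels) if y == 1]

    # Task 1 (negative bags): every other negative-bag patch is a positive.
    neg_targets = [set(range(len(neg))) - {k} for k in range(len(neg))]
    loss_neg = contrastive_loss([embeddings[i] for i in neg],
                                neg_targets, temperature)

    # Task 2 (positive bags): patch labels are unknown, so only another
    # view of the same patch is a positive (self-supervised contrast).
    pos_targets = [
        {m for m, j in enumerate(pos) if m != k and patch_ids[j] == patch_ids[i]}
        for k, i in enumerate(pos)
    ]
    loss_pos = contrastive_loss([embeddings[i] for i in pos],
                                pos_targets, temperature)

    return loss_neg + pos_weight * loss_pos
```

In an actual pre-training loop, `embeddings` would come from the encoder being trained (typically followed by a projection head) applied to two augmentations of each patch, and the returned scalar would be minimized with a tensor library that supports automatic differentiation.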