Leveraging Transfer Learning and Multiple Instance Learning for HER2 Automatic Scoring of H\&E Whole Slide Images
Journal: arXiv
Published Date: Nov 5, 2024
Abstract
Expression of human epidermal growth factor receptor 2 (HER2) is an important
biomarker in breast cancer, and patients can benefit from cost-effective
automatic HER2 scoring on Hematoxylin and Eosin (H\&E) images. However,
developing such scoring models requires large datasets with pixel-level
annotations. Transfer learning allows prior knowledge from other datasets to be
reused, while multiple-instance learning (MIL) mitigates the lack of detailed
annotations. The aim of this work is to examine the effect of transfer
learning on the performance of deep learning models pre-trained on (i)
Immunohistochemistry (IHC) images, (ii) H\&E images and (iii) non-medical
images. A MIL framework with an attention mechanism is developed using
pre-trained models as patch-embedding models. It was found that embedding
models pre-trained on H\&E images consistently outperformed the others,
resulting in an average AUC-ROC value of $0.622$ across the four HER2 scores
($0.59$--$0.80$ per HER2 score). Furthermore, multiple-instance learning with
an attention layer not only achieves good classification results but also
helps to produce a visual indication of HER2-positive areas in the H\&E slide
image by utilising the patch-wise attention weights.
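
The attention-based aggregation described above can be illustrated with a minimal sketch. It assumes a common tanh-based attention pooling formulation, which the abstract does not confirm as the exact architecture used; the function name `attention_mil_pool` and the weights `V` and `w` are hypothetical and would be learned in practice. Each patch embedding receives a scalar score, the scores are normalised with a softmax, and the slide-level embedding is the attention-weighted sum of the patch embeddings. The same weights can be mapped back to patch locations to give a heatmap.

```python
import math


def attention_mil_pool(embeddings, V, w):
    """Attention pooling over one bag (slide) of patch embeddings.

    embeddings: list of N patch vectors, each of length D, e.g. produced by
                a pre-trained embedding model (assumed to be given).
    V:          hypothetical learned projection, H rows of length D.
    w:          hypothetical learned attention vector of length H.
    Returns (slide_embedding, attention_weights).
    """
    scores = []
    for h in embeddings:
        # Hidden representation: tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        # Unnormalised attention score: w^T tanh(V h)
        scores.append(sum(wk * zk for wk, zk in zip(w, hidden)))

    # Numerically stable softmax over the patches in the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]

    # Slide-level embedding: attention-weighted sum of patch embeddings
    dim = len(embeddings[0])
    slide = [sum(a * h[d] for a, h in zip(attention, embeddings))
             for d in range(dim)]
    return slide, attention


# Toy example: three 2-dimensional patch embeddings (illustrative values only)
patches = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.5]]
V = [[1.0, 1.0]]   # hypothetical weights, H = 1
w = [2.0]          # hypothetical weights
slide_embedding, attention_weights = attention_mil_pool(patches, V, w)
```

In a full pipeline the slide embedding would feed a classifier over the four HER2 scores, and `attention_weights` would be reshaped onto the slide grid for visualisation.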