DARC: Deep adaptive regularized clustering for histopathological image classification.

Journal: Medical image analysis
Published Date:

Abstract

In recent years, deep learning has achieved great success in histopathological image classification. However, most deep learning approaches rely heavily on substantial task-specific annotations, which require manual labelling by experienced pathologists. Annotation is therefore laborious and time-consuming, and many unlabeled pathological images are difficult to use without expert annotation. To reduce the need for annotated data, we propose a self-supervised Deep Adaptive Regularized Clustering (DARC) framework to pre-train a neural network. DARC iteratively clusters the learned representations and uses the cluster assignments as pseudo-labels to learn the parameters of the network. To learn feasible representations and encourage them to become more discriminative, we design an objective function that combines a network loss with a clustering loss through an adaptive regularization function, which is updated throughout the training process. The proposed DARC is evaluated on three public datasets: NCT-CRC-HE-100K, PCam and LC25000. Compared with training from scratch, fine-tuning from DARC pre-trained weights clearly improves the accuracy of neural networks on histopathological classification. A network initialized with DARC pre-trained weights and trained with only 10% of the labeled data achieves accuracy comparable to a network trained from scratch with 100% of the training data. The network using DARC pre-trained weights also achieves the fastest convergence speed on the downstream classification task. Moreover, visualization through t-distributed stochastic neighbor embedding (t-SNE) shows that the learned representations are generalizable and discriminative.
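
The clustering-with-pseudo-labels idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the form of the network loss, the clustering loss, or the adaptive regularization function, so the sketch assumes a softmax cross-entropy on pseudo-labels, a k-means style squared-distance clustering loss, and a hypothetical linear ramp for the regularization weight. All function names (kmeans_assign, update_centroids, regularization_weight, combined_loss) are illustrative only.

```python
import numpy as np


def kmeans_assign(features, centroids):
    """Assign each feature row to its nearest centroid (the pseudo-label)."""
    # Squared Euclidean distances, shape (n_samples, n_clusters).
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2


def update_centroids(features, labels, centroids):
    """Move each centroid to the mean of its members; keep empty clusters as-is."""
    new_centroids = centroids.copy()
    for k in range(len(centroids)):
        members = features[labels == k]
        if len(members) > 0:
            new_centroids[k] = members.mean(axis=0)
    return new_centroids


def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy of the logits against integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def regularization_weight(epoch, total_epochs, max_weight=1.0):
    """ASSUMPTION: a linear ramp from 0 to max_weight; the paper's
    adaptive regularization function is not specified in the abstract."""
    return max_weight * epoch / max(total_epochs - 1, 1)


def combined_loss(logits, features, centroids, epoch, total_epochs):
    """Network loss on pseudo-labels plus a weighted clustering loss."""
    pseudo_labels, d2 = kmeans_assign(features, centroids)
    network_loss = softmax_cross_entropy(logits, pseudo_labels)
    # Clustering loss: mean squared distance to the assigned centroid.
    clustering_loss = d2[np.arange(len(pseudo_labels)), pseudo_labels].mean()
    weight = regularization_weight(epoch, total_epochs)
    return network_loss + weight * clustering_loss, pseudo_labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(8, 4))      # stand-in for learned representations
    centroids = features[:2].copy()         # two clusters, initialized from data
    logits = rng.normal(size=(8, 2))        # stand-in for classifier outputs
    for epoch in range(3):
        loss, pseudo_labels = combined_loss(logits, features, centroids, epoch, 3)
        centroids = update_centroids(features, pseudo_labels, centroids)
        print(epoch, float(loss))
```

In a real pre-training loop, the features and logits would come from the network being trained, and the combined loss would be minimized with a gradient-based optimizer between clustering steps.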

Authors

  • Junjian Li
    Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China.
  • Jin Liu
    School of Computer Science and Engineering, Central South University, Changsha, China.
  • Hailin Yue
    Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, 932 Lushan S Rd, Yuelu District, Changsha, Hunan, China.
  • Jianhong Cheng
    Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, 932 Lushan S Rd, Yuelu District, Changsha, Hunan, China.
  • Hulin Kuang
    Calgary Stroke Program, Department of Clinical Neurosciences, University of Calgary, 239 Strathridge Pl SW, Calgary, AB, Canada T3H 4J2.
  • Harrison Bai
    Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD, United States.
  • Yuping Wang
    Xuanwu Hospital Capital Medical University, Beijing 100053, China.
  • Jianxin Wang