Generalizable self-supervised learning for brain CTA in acute stroke.

Journal: Computers in biology and medicine
Published Date:

Abstract

Acute stroke management involves rapid and accurate interpretation of CTA imaging data. However, generalizable models that can learn from unlabeled data and serve multiple acute stroke tasks do not exist. We propose a linear-probed, self-supervised contrastive learning approach that uses 3D CTA images and the findings section of radiologists' reports for pretraining. Subsequently, the pretrained model was applied to four disparate tasks: large vessel occlusion (LVO) detection, acute ischemic stroke detection, acute ischemic stroke versus intracerebral hemorrhage classification, and ischemic core volume prediction. The tasks chosen are particularly challenging because their labels cannot be extracted directly from the report findings with keywords. The difficulty is compounded by the 3D feature representation required by tasks such as LVO detection. All imaging models were trained from scratch. In the pretraining phase, our dataset comprised 1,542 pairs of 3D brain CTA volumes and corresponding radiologists' reports from 3 sites, without any additional labels. To test generalizability, we performed the fine-tuning and testing phases with labeled data from another site, comprising brain CTAs from 592 subjects. In our experiments, we evaluated the influence of linear probing during the pretraining phase and found that, on average, it enhanced our model's generalizability, as shown by improved classification performance when paired with the appropriate text encoder. Our findings indicate that the best-performing models generalize robustly to out-of-distribution data across multiple tasks. In all scenarios, linear probing during pretraining yielded superior predictive performance compared to a standard strategy. Furthermore, pretraining with report findings conferred significant performance advantages compared to training the imaging encoder solely on labeled data.
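
The abstract does not state the exact pretraining objective, so the following is only a minimal sketch of one common form of image-report contrastive learning: a symmetric InfoNCE loss over a batch of paired embeddings. The function names, the temperature value, and the toy 2-D embeddings are illustrative assumptions and are not taken from the paper; the linear probing step is not covered here.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over paired image and report embeddings.

    Assumption: image_emb[i] and text_emb[i] come from the same study,
    and every other pairing in the batch is treated as a negative.
    """
    img = [l2_normalize(v) for v in image_emb]
    txt = [l2_normalize(v) for v in text_emb]
    # Similarity matrix: rows index images, columns index reports.
    logits = [
        [sum(a * b for a, b in zip(i, t)) / temperature for t in txt]
        for i in img
    ]
    logits_t = [list(col) for col in zip(*logits)]  # report-to-image direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy batch of two "studies" with hypothetical 2-D embeddings.
matched = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = symmetric_contrastive_loss(matched, matched)
loss_swapped = symmetric_contrastive_loss(matched, matched[::-1])
```

In this toy batch, the loss is lower when each image embedding is paired with its own report embedding than when the pairs are swapped, which is the behavior a contrastive objective is meant to reward.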

Authors

  • Yingjun Dong
    McWilliams School of Biomedical Informatics at the University of Texas Health Science Center at Houston, Houston, TX 77030, USA.
  • Samiksha Pachade
    Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, India; School of Biomedical Informatics, University of Texas Health Science Center at Houston, USA.
  • Kirk Roberts
    The University of Texas Health Science Center at Houston, USA.
  • Xiaoqian Jiang
    School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Sunil A Sheth
  • Luca Giancardo
    Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, United States.