A real-world demonstration of machine learning generalizability in the detection of intracranial hemorrhage on head computerized tomography.

Journal: Scientific reports
Published Date:

Abstract

Machine learning (ML) holds great promise in transforming healthcare. While published studies have shown the utility of ML models in interpreting medical imaging examinations, these models are often evaluated in laboratory settings. The importance of real-world evaluation is best illustrated by case studies that have documented successes and failures in the translation of these models into clinical environments. A key prerequisite for the clinical adoption of these technologies is demonstrating generalizable ML model performance under real-world circumstances. The purpose of this study was to demonstrate that ML model generalizability is achievable in medical imaging, with the detection of intracranial hemorrhage (ICH) on non-contrast computed tomography (CT) scans serving as the use case. An ML model was trained using 21,784 scans from the RSNA Intracranial Hemorrhage CT dataset, and generalizability was evaluated using an external validation dataset obtained from our busy trauma and neurosurgical center. This real-world external validation dataset consisted of every unenhanced head CT scan (n = 5965) performed in our emergency department in 2019, without exclusion. The model demonstrated an AUC of 98.4%, sensitivity of 98.8%, and specificity of 98.0% on the test dataset. On external validation, the model demonstrated an AUC of 95.4%, sensitivity of 91.3%, and specificity of 94.1%. Evaluating the ML model using a real-world external validation dataset that is temporally and geographically distinct from the training dataset indicates that ML generalizability is achievable in medical imaging applications.
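
For readers unfamiliar with the three metrics reported above, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed for a binary classifier. It is not the authors' evaluation code. The function names, the 0.5 decision threshold, and the eight-scan toy data are assumptions made purely for illustration and do not come from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = ICH present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: fraction of positive scans that were flagged.
    # Specificity: fraction of negative scans that were correctly cleared.
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    Equals the probability that a randomly chosen positive scan receives a
    higher score than a randomly chosen negative scan (ties count as 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 8 scans, 4 with hemorrhage.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

# Assumed decision threshold of 0.5 for turning scores into predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} "
      f"AUC={auc(y_true, scores):.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. This is why the three figures can move by different amounts between the test and external validation datasets.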

Authors

  • Hojjat Salehinejad
  • Jumpei Kitamura
    Fujisawa, Kanagawa, Japan.
  • Noah Ditkofsky
    Department of Medical Imaging, St. Michael's Hospital, Unity Health Toronto, 30 Bond Street, Toronto, ON, M5B 1W8, Canada.
  • Amy Lin
    University of Illinois Hospital and Health Sciences System, Chicago, IL, USA.
  • Aditya Bharatha
    Li Ka Shing Centre for Healthcare Analytics Research and Training, St. Michael's Hospital, Toronto, Canada.
  • Suradech Suthiphosuwan
    Department of Medical Imaging, St. Michael's Hospital, Unity Health Toronto, 30 Bond Street, Toronto, ON, M5B 1W8, Canada.
  • Hui-Ming Lin
    Li Ka Shing Centre for Healthcare Analytics Research and Training, St. Michael's Hospital, Toronto, Canada.
  • Jefferson R Wilson
    Division of Neurosurgery, University of Toronto, Toronto, Ontario, Canada.
  • Muhammad Mamdani
    Unity Health Toronto (Verma, Murray, Straus, Pou-Prom, Mamdani); Li Ka Shing Knowledge Institute of St. Michael's Hospital (Verma, Straus, Pou-Prom, Mamdani); Department of Medicine (Verma, Shojania, Straus, Mamdani) and Institute of Health Policy, Management, and Evaluation (Verma, Mamdani) and Department of Statistics (Murray), University of Toronto, Toronto, Ont.; University of Alberta (Greiner); Alberta Machine Intelligence Institute (Greiner), Edmonton, Alta.; Montreal Institute for Learning Algorithms (Cohen), Montréal, Que.; Centre for Quality Improvement and Patient Safety (Shojania), University of Toronto; Sunnybrook Health Sciences Centre (Shojania); Vector Institute (Ghassemi, Mamdani) and Department of Computer Science (Ghassemi); Leslie Dan Faculty of Pharmacy (Mamdani), University of Toronto, Toronto, Ont.; Department of Radiology, Stanford University (Cohen), Stanford, Calif. muhammad.mamdani@unityhealth.to amol.verma@mail.utoronto.ca.
  • Errol Colak
    Li Ka Shing Knowledge Institute, St. Michael's Hospital, Unity Health Toronto, Toronto, ON, Canada.