AI-clinician collaboration via disagreement prediction: A decision pipeline and retrospective analysis of real-world radiologist-AI interactions.

Journal: Cell Reports Medicine
PMID:

Abstract

Clinical decision support tools can improve diagnostic performance or reduce variability, but they are also subject to post-deployment underperformance. Although using AI in an assistive setting offsets many of the concerns associated with autonomous AI in medicine, systems that present all predictions equivalently fail to protect against key AI safety concerns. We design a decision pipeline that supports the diagnostic model with an ecosystem of models, integrating disagreement prediction, clinical significance categorization, and prediction quality modeling to guide prediction presentation. We characterize disagreement using data from a deployed chest X-ray interpretation aid and compare clinician burden under the proposed pipeline with that under the diagnostic model in isolation. The average disagreement rate is 6.5%, and the expected burden reduction is 4.8%, even if 5% of disagreements on urgent findings receive a second read. We conclude that, in our production setting, we can adequately balance risk mitigation with clinician burden if disagreement false positives are reduced.
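The abstract does not specify how the three supporting models are combined, so the following is only a minimal sketch of one plausible routing rule consistent with the described pipeline; all class names, thresholds, and the escalation logic are hypothetical, not values or code from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Significance(Enum):
    ROUTINE = "routine"
    URGENT = "urgent"


class Presentation(Enum):
    SHOW = "show"                  # present the AI finding normally
    SHOW_WITH_FLAG = "show_flag"   # present it, flagging likely radiologist-AI disagreement
    SECOND_READ = "second_read"    # escalate to an additional reader


@dataclass
class SupportingScores:
    """Outputs of the (hypothetical) supporting models for one finding."""
    p_disagreement: float       # predicted probability the radiologist will disagree
    significance: Significance  # output of the clinical significance categorizer
    p_low_quality: float        # predicted probability the AI prediction is unreliable


def route_prediction(scores: SupportingScores,
                     disagree_thresh: float = 0.5,
                     quality_thresh: float = 0.5) -> Presentation:
    """Illustrative routing: escalate only urgent, likely-disagreed, low-quality
    findings; otherwise flag or show normally. Thresholds are placeholders."""
    likely_disagreement = scores.p_disagreement >= disagree_thresh
    low_quality = scores.p_low_quality >= quality_thresh

    if likely_disagreement and scores.significance is Significance.URGENT and low_quality:
        return Presentation.SECOND_READ
    if likely_disagreement:
        return Presentation.SHOW_WITH_FLAG
    return Presentation.SHOW


# Example: an urgent finding with high predicted disagreement and low prediction quality
# is escalated, while most findings would be shown without extra clinician burden.
print(route_prediction(SupportingScores(0.8, Significance.URGENT, 0.7)))
# Presentation.SECOND_READ
```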

Authors

  • Morgan Sanchez
    Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA. Electronic address: morgansanchez@g.harvard.edu.
  • Kyle Alford
    Department of Computer Science, Columbia University, New York, NY 10027, USA.
  • Viswesh Krishna
    National Public School Indiranagar, Bengaluru, Karnataka, India.
  • Thanh M Huynh
    VinBrain JSC, Hanoi 11622, Vietnam.
  • Chanh D T Nguyen
    VinBrain JSC, Hanoi 11622, Vietnam; VinUniversity, Hanoi 12450, Vietnam.
  • Matthew P Lungren
  • Steven Q H Truong
VinBrain JSC, Hanoi 11622, Vietnam.
  • Pranav Rajpurkar
Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA.