Enhancing the reliability and accuracy of AI-enabled diagnosis via complementarity-driven deferral to clinicians.

Journal: Nature Medicine
Published Date:

Abstract

Predictive artificial intelligence (AI) systems based on deep learning have been shown to achieve expert-level identification of diseases in multiple medical imaging settings, but can make errors in cases accurately diagnosed by clinicians and vice versa. We developed Complementarity-Driven Deferral to Clinical Workflow (CoDoC), a system that can learn to decide between the opinion of a predictive AI model and a clinical workflow. CoDoC enhances accuracy relative to clinician-only or AI-only baselines in clinical workflows that screen for breast cancer or tuberculosis (TB). For breast cancer screening, compared to double reading with arbitration in a screening program in the UK, CoDoC reduced false positives by 25% at the same false-negative rate, while achieving a 66% reduction in clinician workload. For TB triaging, compared to standalone AI and clinical workflows, CoDoC achieved a 5–15% reduction in false positives at the same false-negative rate for three of five commercially available predictive AI systems. To facilitate the deployment of CoDoC in novel clinical settings, we present results showing that CoDoC's performance gains are sustained across several axes of variation (imaging modality, clinical setting and predictive AI system) and discuss the limitations of our evaluation and where further validation would be needed. We provide an open-source implementation to encourage further research and application.

Authors

  • Krishnamurthy Dj Dvijotham
    Google DeepMind, Mountain View, CA, USA. dvij@cs.washington.edu.
  • Jim Winkens
    Google Health, London, UK.
  • Melih Barsbey
    Bogazici University, Istanbul, Turkey.
  • Sumedh Ghaisas
    Google DeepMind, London, UK.
  • Robert Stanforth
    Google DeepMind, London, UK.
  • Nick Pawlowski
  • Patricia Strachan
    Google Research, Mountain View, CA, USA.
  • Zahra Ahmed
    Google DeepMind, London, UK.
  • Shekoofeh Azizi
  • Yoram Bachrach
  • Laura Culp
    Google Research, Mountain View, CA, USA.
  • Mayank Daswani
    Google Research, London, UK.
  • Jan Freyberg
    Google Research, Mountain View, CA, USA.
  • Christopher Kelly
    Google Health, London, UK.
  • Atilla Kiraly
    Google Research, Palo Alto, CA, USA.
  • Timo Kohlberger
    Google AI Healthcare, Google Research, Mountain View, CA, USA.
  • Scott McKinney
    OpenAI, San Francisco, CA, USA.
  • Basil Mustafa
    Google Research, Mountain View, CA, USA.
  • Vivek Natarajan
    Google, Mountain View, CA, USA.
  • Krzysztof Geras
    Department of Radiology, NYU Langone Health / NYU Grossman School of Medicine, New York, NY, USA.
  • Jan Witowski
    Department of Radiology, NYU Grossman School of Medicine, New York, NY, USA.
  • Zhi Zhen Qin
    Stop TB Partnership, Geneva, Switzerland.
  • Jacob Creswell
    Stop TB Partnership, Geneva, Switzerland. jacobc@stoptb.org.
  • Shravya Shetty
    Google AI, Mountain View, CA, USA.
  • Marcin Sieniek
    Google Health, Palo Alto, CA, USA.
  • Terry Spitz
    Google Health, London, UK.
  • Greg Corrado
    Google, Mountain View, CA, USA.
  • Pushmeet Kohli
    DeepMind, London, UK.
  • Taylan Cemgil
    Google DeepMind, London, UK.
  • Alan Karthikesalingam
    Department of Outcomes Research, St George's Vascular Institute, London, SW17 0QT, United Kingdom.