A nonparametric Bayesian method of translating machine learning scores to probabilities in clinical decision support.

Journal: BMC Bioinformatics
PMID:

Abstract

BACKGROUND: Probabilistic assessments of clinical care are essential for quality care. Yet machine learning (ML), which supports this care process, has been limited to categorical results. To maximize its usefulness, it is important to find novel approaches that calibrate the ML output with a likelihood scale. Current state-of-the-art calibration methods are generally accurate and applicable to many ML models, but improved granularity and accuracy of such methods would increase the information available for clinical decision making. The novel nonparametric Bayesian approach presented here is demonstrated on a variety of data sets, including simulated classifier outputs, biomedical data sets from the University of California, Irvine (UCI) Machine Learning Repository, and a clinical data set built to determine suicide risk from the language of emergency department patients.

Authors

  • Brian Connolly
    Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA.
  • K Bretonnel Cohen
    Computational Bioscience, University of Colorado School of Medicine, Aurora, CO 80045, USA.
  • Daniel Santel
    Department of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave., MLC 7024, Cincinnati, OH, 45229-3039, USA.
  • Ulya Bayram
    Department of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave., MLC 7024, Cincinnati, OH, 45229-3039, USA.
  • John Pestian
    Department of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave., MLC 7024, Cincinnati, OH, 45229-3039, USA. john.pestian@cchmc.org.