Expert-augmented machine learning.

Journal: Proceedings of the National Academy of Sciences of the United States of America

PMID: 32071251

Abstract

Machine learning is proving invaluable across disciplines. However, its success is often limited by the quality and quantity of available data, while its adoption is limited by the level of trust afforded by given models. Human vs. machine performance is commonly compared empirically to decide whether a certain task should be performed by a computer or an expert. In reality, the optimal learning strategy may involve combining the complementary strengths of humans and machines. Here, we present expert-augmented machine learning (EAML), an automated method that guides the extraction of expert knowledge and its integration into machine-learned models. We used a large dataset of intensive-care patient data to derive 126 decision rules that predict hospital mortality. Using an online platform, we asked 15 clinicians to assess the relative risk of the subpopulation defined by each rule compared to the total sample. We compared the clinician-assessed risk to the empirical risk and found that, while clinicians agreed with the data in most cases, there were notable exceptions where they overestimated or underestimated the true risk. Studying the rules with greatest disagreement, we identified problems with the training data, including one miscoded variable and one hidden confounder. Filtering the rules based on the extent of disagreement between clinician-assessed risk and empirical risk, we improved performance on out-of-sample data and were able to train with less data. EAML provides a platform for automated creation of problem-specific priors, which help build robust and dependable machine-learning models in critical applications.
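The comparison and filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy outcomes, the rule masks, the expert scores, the rank-gap disagreement measure, and the function names are all hypothetical assumptions chosen for illustration. It computes each rule's empirical relative risk against the total sample, ranks the rules by empirical and by expert-assessed risk, and keeps only the rules whose two ranks differ by no more than a chosen threshold.

```python
from statistics import mean


def empirical_relative_risk(outcomes, covered):
    """Mortality rate in the rule's subpopulation divided by the overall rate."""
    overall = mean(outcomes)
    subpopulation = [y for y, c in zip(outcomes, covered) if c]
    return mean(subpopulation) / overall


def rank(values):
    """Return 0-based ranks, lowest value first (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks


def filter_rules(empirical, expert, max_gap):
    """Keep indices of rules whose empirical and expert ranks differ by <= max_gap."""
    r_emp, r_exp = rank(empirical), rank(expert)
    return [i for i in range(len(empirical)) if abs(r_emp[i] - r_exp[i]) <= max_gap]


# Hypothetical toy data: 1 = died in hospital, 0 = survived.
outcomes = [1, 0, 0, 1, 0, 0, 0, 1]

# Hypothetical rules, each given as a mask over the eight patients.
rule_masks = [
    [1, 0, 0, 1, 0, 0, 0, 1],  # rule 0
    [0, 1, 1, 0, 1, 0, 0, 0],  # rule 1
    [1, 1, 1, 1, 0, 0, 0, 0],  # rule 2
]
empirical = [empirical_relative_risk(outcomes, m) for m in rule_masks]

# Hypothetical mean expert ratings of relative risk (higher = riskier).
expert = [4.5, 3.0, 1.5]

# Assumption: disagreement is measured as the absolute difference in ranks.
kept = filter_rules(empirical, expert, max_gap=0)
print(kept)
```

Ranks are used here instead of raw values because the expert ratings and the empirical ratios sit on different scales; any other disagreement measure could be substituted in `filter_rules`.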

Authors

Efstathios D Gennatas

Department of Radiation Oncology, University of California, San Francisco, CA 94115.
Jerome H Friedman

Department of Statistics, Stanford University, Stanford, CA 94305. Correspondence: gilmer.valdes@ucsf.edu, jose.luna@pennmedicine.upenn.edu, jhf@stanford.edu.
Lyle H Ungar

Department of Computer & Information Science, University of Pennsylvania.
Romain Pirracchio
Eric Eaton

Department of Computing and Information Science, University of Pennsylvania, Philadelphia, PA 19104.
Lara G Reichmann

Data Institute, University of San Francisco, CA 94105.
Yannet Interian

Data Analytic Program, University of San Francisco, San Francisco, California.
José Marcio Luna

Mallinckrodt Institute of Radiology, Washington University School of Medicine in St. Louis, St. Louis, MO 63110, United States.
Charles B Simone

Department of Radiation Oncology, University of Maryland Medical Center.
Andrew Auerbach

Division of Hospital Medicine, University of California, San Francisco, CA 94143.
Elier Delgado

Innova Montreal, Inc., Montreal, QC J4W 2P2, Canada.
Mark J van der Laan

Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA.
Timothy D Solberg

U.S. Food and Drug Administration, Silver Spring, Maryland.
Gilmer Valdes

Department of Radiation Oncology, University of California, San Francisco, California.

Keywords

Data Management; Database Management Systems; Expert Systems; Machine Learning; Medical Informatics

