Mechanistic machine learning: how data assimilation leverages physiologic knowledge using Bayesian inference to forecast the future, infer the present, and phenotype.

Journal: Journal of the American Medical Informatics Association : JAMIA
Published Date:

Abstract

We introduce data assimilation as a computational method that uses machine learning to combine data with human knowledge in the form of mechanistic models in order to forecast future states, to impute missing data from the past by smoothing, and to infer measurable and unmeasurable quantities that represent clinically and scientifically important phenotypes. We demonstrate the advantages it affords in the context of type 2 diabetes by showing how data assimilation can be used to forecast future glucose values, to impute previously missing glucose values, and to infer type 2 diabetes phenotypes. At the heart of data assimilation is the mechanistic model, here an endocrine model. Such models can vary in complexity, contain testable hypotheses about important mechanics that govern the system (eg, nutrition's effect on glucose), and, as such, constrain the model space, allowing for accurate estimation using very little data.
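
As an illustration of the forecast-and-update cycle described above, the following is a minimal sketch of a scalar Kalman filter, one simple form of data assimilation. It is not the authors' endocrine model or their inference method. The toy glucose dynamics (linear relaxation toward a basal level plus a nutrition input), the function names, and every parameter value (`a`, `basal`, `q`, `r`, the initial mean and variance) are assumptions chosen only for illustration. A missing measurement is passed as `None`, in which case the model forecast stands in for the missing value; this is filtering only, and the backward smoothing pass mentioned in the abstract is not shown.

```python
def forecast(mean, var, meal, a=0.05, basal=100.0, dt=1.0, q=4.0):
    """Propagate the glucose estimate one step through a toy mechanistic model.

    Assumed model (hypothetical): g_next = g + dt * (-a * (g - basal) + meal) + noise,
    where q is the variance of the process noise.
    """
    f = 1.0 - a * dt                      # linear state-transition factor
    mean_pred = f * mean + a * dt * basal + dt * meal
    var_pred = f * f * var + q            # uncertainty grows with process noise
    return mean_pred, var_pred


def update(mean_pred, var_pred, y, r=25.0):
    """Bayesian update of the forecast with a measurement y (noise variance r)."""
    gain = var_pred / (var_pred + r)      # Kalman gain: trust in data vs. model
    mean = mean_pred + gain * (y - mean_pred)
    var = (1.0 - gain) * var_pred         # uncertainty shrinks after observing y
    return mean, var


def assimilate(measurements, meals, mean0=100.0, var0=400.0):
    """Run the forecast/update cycle; None marks a missing measurement."""
    mean, var = mean0, var0
    estimates = []
    for y, meal in zip(measurements, meals):
        mean, var = forecast(mean, var, meal)
        if y is not None:
            mean, var = update(mean, var, y)
        # When y is None, the model forecast serves as the imputed value.
        estimates.append((mean, var))
    return estimates


if __name__ == "__main__":
    # Three time steps with the middle measurement missing (made-up values).
    for mean, var in assimilate([120.0, None, 110.0], [0.0, 0.0, 0.0]):
        print(f"glucose estimate {mean:.1f}, variance {var:.1f}")
```

In this sketch the variance falls whenever a measurement is assimilated and rises when the model must forecast through a gap, which is how the mechanistic model constrains the estimate when data are sparse. Inferring phenotype-like parameters (for example, the decay rate `a`) would require extending the state to include them, which is beyond this example.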

Authors

  • David J Albers
    University of Colorado, Anschutz Medical Campus, Section of Informatics and Data Science, Departments of Pediatrics, Biomedical Engineering, and Biostatistics and Informatics, and Department of Biomedical Informatics, Columbia University.
  • Matthew E Levine
    Department of Computing and Mathematical Sciences, California Institute of Technology.
  • Andrew Stuart
    Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California, USA.
  • Lena Mamykina
    Department of Biomedical Informatics, Columbia University.
  • Bruce Gluckman
    Department of Engineering Science and Mechanics, Pennsylvania State University, University Park, Pennsylvania, USA.
  • George Hripcsak
    Department of Biomedical Informatics, Columbia University, 622 W 168th Street, PH20, New York, NY 10032, USA; Medical Informatics Services, NewYork-Presbyterian Hospital, 622 W 168th Street, PH20, New York, NY 10032, USA. Electronic address: hripcsak@columbia.edu.