Democratizing EHR analyses with FIDDLE: a flexible data-driven preprocessing pipeline for structured clinical data.

Journal: Journal of the American Medical Informatics Association : JAMIA
PMID:

Abstract

OBJECTIVE: In applying machine learning (ML) to electronic health record (EHR) data, many decisions must be made before any ML is applied; such preprocessing is labor-intensive. As the role of ML in health care grows, there is an increasing need for systematic and reproducible preprocessing techniques for EHR data. Thus, we developed FIDDLE (Flexible Data-Driven Pipeline), an open-source framework that streamlines the preprocessing of data extracted from the EHR.

Authors

  • Shengpu Tang
    Division of Computer Science and Engineering, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI.
  • Parmida Davarmanesh
    Department of Mathematics, University of Michigan, Ann Arbor, USA.
  • Yanmeng Song
    Department of Statistics, University of Michigan, Ann Arbor, USA.
  • Danai Koutra
    Department of Electrical Engineering and Computer Science, Division of Computer Science and Engineering, University of Michigan, Ann Arbor, USA.
  • Michael W Sjoding
    Department of Internal Medicine.
  • Jenna Wiens
    Computer Science and Engineering, University of Michigan, Ann Arbor.