Democratizing EHR analyses with FIDDLE: a flexible data-driven preprocessing pipeline for structured clinical data.
Journal:
Journal of the American Medical Informatics Association : JAMIA
PMID:
33040151
Abstract
OBJECTIVE: In applying machine learning (ML) to electronic health record (EHR) data, many decisions must be made before any ML is applied; such preprocessing requires substantial effort and can be labor-intensive. As the role of ML in health care grows, there is an increasing need for systematic and reproducible preprocessing techniques for EHR data. Thus, we developed FIDDLE (Flexible Data-Driven Pipeline), an open-source framework that streamlines the preprocessing of data extracted from the EHR.