Embedding covariate adjustments in tree-based automated machine learning for biomedical big data analyses.

Journal: BMC bioinformatics

Published Date: Oct 1, 2020

Abstract

BACKGROUND: A typical task in bioinformatics consists of identifying which features are associated with a target outcome of interest and building a predictive model. Automated machine learning (AutoML) systems such as the Tree-based Pipeline Optimization Tool (TPOT) constitute an appealing approach to this end. However, in biomedical data, there are often baseline characteristics of the subjects in a study or batch effects that need to be adjusted for in order to better isolate the effects of the features of interest on the target. Thus, the ability to perform covariate adjustments becomes particularly important for applications of AutoML to biomedical big data analysis.
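To make the idea of covariate adjustment concrete, the following is a minimal sketch of one common approach: regress each feature on the covariates and keep the residuals as adjusted features. It assumes a purely linear adjustment and uses only NumPy. The function name `residualize` and the simulated data are hypothetical and chosen for illustration; this is not necessarily how adjustment is implemented in TPOT.

```python
import numpy as np


def residualize(features, covariates):
    """Remove the linear effect of covariates from each feature column.

    Hypothetical helper for illustration; assumes a linear adjustment.
    """
    # Add an intercept column so covariate-independent offsets are absorbed too.
    design = np.column_stack([np.ones(len(covariates)), covariates])
    # Least-squares fit of all feature columns on the covariates at once.
    coef, *_ = np.linalg.lstsq(design, features, rcond=None)
    # The residuals are the part of each feature not explained by the covariates.
    return features - design @ coef


# Simulated data (assumption): 200 subjects, 2 covariates, 3 features.
rng = np.random.default_rng(0)
n = 200
covariates = rng.normal(size=(n, 2))   # e.g. a baseline characteristic and a batch score
signal = rng.normal(size=(n, 3))       # feature variation unrelated to the covariates
features = signal + covariates @ rng.normal(size=(2, 3))

adjusted = residualize(features, covariates)

# Largest absolute sample covariance between adjusted features and covariates.
centered = covariates - covariates.mean(axis=0)
max_cov = np.abs(adjusted.T @ centered).max() / n
```

After adjustment, `max_cov` is numerically zero: the adjusted features carry no remaining linear association with the covariates. In a predictive setting the adjustment would normally be fitted on training data only and then applied to held-out data, to avoid leakage.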

Authors

Elisabetta Manduchi

University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.

Weixuan Fu

Department of Biostatistics, Epidemiology, and Informatics.

Joseph D Romano

Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA.

Stefano Ruberto

Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA.

Jason H Moore

University of Pennsylvania, Philadelphia, PA, USA.

Keywords

Algorithms, Automation, Big Data, Data Analysis, Humans, Machine Learning

External Resources

PubMed ID: 32998684
