Block Forests: random forests for blocks of clinical and omics covariate data.

Journal: BMC Bioinformatics

Published Date: Jun 27, 2019

Abstract

BACKGROUND: In recent years, multi-omics data have become increasingly available, that is, data comprising measurements of several omics data types for each patient. Using multi-omics data as covariates in outcome prediction is both promising and challenging because of the complex structure of such data. Random forest is a prediction method known for its ability to capture complex dependency patterns between the outcome and the covariates. Against this background, we developed five candidate random forest variants tailored to multi-omics covariate data. These variants modify the split point selection of random forest to incorporate the block structure of multi-omics data and can be applied to any outcome type for which a random forest variant exists, such as categorical, continuous, and survival outcomes. Using 20 publicly available multi-omics data sets with survival outcomes, we compared the prediction performance of the block forest variants with that of alternatives. We also considered the common special case in which clinical covariates and measurements of a single omics data type are available.
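
To make the idea of block-aware split selection concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' algorithm: the abstract does not say how the five variants use the blocks, so the sketch simply assumes that candidate split covariates are drawn separately within each block, so that a small block (for example, clinical covariates) is not crowded out by a much larger omics block. The function name `sample_candidates_per_block`, the block names, the block sizes, and the square-root rule for the number of candidates per block are all hypothetical.

```python
import math
import random


def sample_candidates_per_block(blocks, rng):
    """Draw candidate split covariates separately from each block.

    blocks: dict mapping a block name to a list of covariate indices.
    Assumption (not taken from the paper): each block contributes
    ceil(sqrt(block size)) candidates at every split.
    """
    candidates = {}
    for name, indices in blocks.items():
        n_try = math.ceil(math.sqrt(len(indices)))
        # Sample without replacement within the block only.
        candidates[name] = sorted(rng.sample(indices, n_try))
    return candidates


# Hypothetical layout: 5 clinical covariates followed by 1000 omics features.
blocks = {
    "clinical": list(range(0, 5)),
    "omics": list(range(5, 1005)),
}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
candidates = sample_candidates_per_block(blocks, rng)
print({name: len(idx) for name, idx in candidates.items()})
```

Under these assumptions every split considers 3 clinical and 32 omics candidates. By contrast, if about 32 candidates were drawn from all 1005 pooled covariates, the expected number of clinical candidates per split would be roughly 32 × 5 / 1005 ≈ 0.16, which shows why ignoring the block structure can sideline small blocks.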

Authors

Roman Hornung

Institute for Medical Information Processing, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, Munich, 81377, Germany. hornung@ibe.med.uni-muenchen.de.

Marvin N Wright

Institut für Medizinische Biometrie und Statistik, Lübeck, Germany.

Keywords

Genomics; Humans; Machine Learning; Survival Analysis

External Resources

PubMed ID: 31248362
