Exploitation of surrogate variables in random forests for unbiased analysis of mutual impact and importance of features.
Journal:
Bioinformatics (Oxford, England)
Published Date:
Aug 1, 2023
Abstract
MOTIVATION: Random forest is a popular machine learning approach for analysing high-dimensional data because it is flexible and provides variable importance measures for selecting relevant features. However, the complex relationships between features are usually not considered during selection and are therefore also neglected when characterizing the analysed samples.