Random Bits Forest: a Strong Classifier/Regressor for Big Data.

Journal: Scientific reports
Published Date:

Abstract

Poor efficiency, high memory consumption, and limited robustness are common problems with many popular methods for data analysis. As a solution, we present Random Bits Forest (RBF), a classification and regression algorithm that integrates neural networks (for depth), boosting (for width), and random forests (for prediction accuracy). Through a gradient boosting scheme, it first generates and selects ~10,000 small, 3-layer random neural networks. The outputs of these networks are then fed into a modified random forest algorithm to obtain predictions. Testing with datasets from the UCI (University of California, Irvine) Machine Learning Repository shows that RBF outperforms other popular methods in both accuracy and robustness, especially with large datasets (N > 1000). The algorithm also performed well on an independent dataset from a real psoriasis genome-wide association study (GWAS).
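
The two-stage pipeline described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: all function names, the network shape (two thresholded hidden units with two inputs each), the selection criterion (squared-error reduction on the boosting residuals), and the tiny sizes (30 bits, 25 trees instead of ~10,000 networks) are assumptions made for illustration, and a plain bagged regression-tree ensemble stands in for the paper's modified random forest.

```python
import random


def make_random_bit(n_features, rng, fan_in=2, hidden=2):
    """One small 3-layer random network: inputs -> thresholded hidden units -> 0/1 bit."""
    units = []
    for _ in range(hidden):
        idx = rng.sample(range(n_features), min(fan_in, n_features))
        units.append((idx, [rng.gauss(0, 1) for _ in idx], rng.gauss(0, 1)))
    out_w = [rng.gauss(0, 1) for _ in range(hidden)]
    out_b = rng.gauss(0, 1)

    def bit(x):
        h = [1.0 if sum(w * x[i] for i, w in zip(idx, ws)) + b > 0 else 0.0
             for idx, ws, b in units]
        return 1 if sum(w * hi for w, hi in zip(out_w, h)) + out_b > 0 else 0
    return bit


def select_bits(X, y, n_bits, n_candidates, rng, shrinkage=0.5):
    """Boosting-style selection: each round keeps the candidate that best fits the residuals."""
    mean_y = sum(y) / len(y)
    residual = [yi - mean_y for yi in y]
    chosen = []
    for _ in range(n_bits):
        best = None  # (gain, bit function, bit values, residual mean at 0, residual mean at 1)
        for _ in range(n_candidates):
            f = make_random_bit(len(X[0]), rng)
            vals = [f(x) for x in X]
            n1 = sum(vals)
            n0 = len(vals) - n1
            if n0 == 0 or n1 == 0:
                continue  # a constant bit carries no information
            m1 = sum(r for r, v in zip(residual, vals) if v) / n1
            m0 = sum(r for r, v in zip(residual, vals) if not v) / n0
            gain = n0 * m0 * m0 + n1 * m1 * m1  # squared-error reduction of a two-leaf fit
            if best is None or gain > best[0]:
                best = (gain, f, vals, m0, m1)
        if best is None:
            continue
        _, f, vals, m0, m1 = best
        chosen.append(f)
        residual = [r - shrinkage * (m1 if v else m0) for r, v in zip(residual, vals)]
    return chosen


def grow_tree(B, y, rows, depth, rng, n_try):
    """Regression tree on binary features; each node tries a random subset of bits."""
    mean = sum(y[i] for i in rows) / len(rows)
    if depth == 0 or len(rows) < 4:
        return mean
    best = None
    for j in rng.sample(range(len(B[0])), min(n_try, len(B[0]))):
        left = [i for i in rows if B[i][j] == 0]
        right = [i for i in rows if B[i][j] == 1]
        if not left or not right:
            continue
        ml = sum(y[i] for i in left) / len(left)
        mr = sum(y[i] for i in right) / len(right)
        gain = len(left) * (ml - mean) ** 2 + len(right) * (mr - mean) ** 2
        if best is None or gain > best[0]:
            best = (gain, j, left, right)
    if best is None:
        return mean
    _, j, left, right = best
    return (j, grow_tree(B, y, left, depth - 1, rng, n_try),
            grow_tree(B, y, right, depth - 1, rng, n_try))


def predict_tree(tree, b):
    """Walk from the root to a leaf; internal nodes are (bit index, subtree for 0, subtree for 1)."""
    while isinstance(tree, tuple):
        tree = tree[2] if b[tree[0]] else tree[1]
    return tree


def fit_rbf_sketch(X, y, n_bits=30, n_candidates=20, n_trees=25, depth=4, seed=0):
    """Stage 1: select random bits by boosting. Stage 2: bagged trees on those bits."""
    rng = random.Random(seed)
    bits = select_bits(X, y, n_bits, n_candidates, rng)
    B = [[f(x) for f in bits] for x in X]
    n_try = max(1, int(len(bits) ** 0.5))
    trees = []
    for _ in range(n_trees):
        rows = [rng.randrange(len(X)) for _ in X]  # bootstrap sample
        trees.append(grow_tree(B, y, rows, depth, rng, n_try))

    def predict(x):
        b = [f(x) for f in bits]
        return sum(predict_tree(t, b) for t in trees) / len(trees)
    return predict
```

With 0/1 labels, `fit_rbf_sketch(X, y)` returns a function whose output lies between 0 and 1 and can be read as a class score; with numeric targets the same function acts as a regressor. The sketch only shows how random-network bits, boosting-based selection, and a tree ensemble fit together, and makes no claim about reproducing the reported accuracy.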

Authors

  • Yi Wang
    Department of Neurology, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China.
  • Yi Li
    Wuhan Zoncare Bio-Medical Electronics Co., Ltd, Wuhan, China.
  • Weilin Pu
    State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, School of Life Sciences and Institutes of Biomedical Sciences, Fudan University, Shanghai, China.
  • Kathryn Wen
    Unit on Statistical Genomics, Division of Intramural Research Programs, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA.
  • Yin Yao Shugart
    Unit on Statistical Genomics, Division of Intramural Research Programs, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA.
  • Momiao Xiong
    School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, United States.
  • Li Jin
    State Key Laboratory of Genetic Engineering and Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai, China.