AutoDC: an automatic machine learning framework for disease classification.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: The emergence of next-generation sequencing techniques opens up tremendous opportunities for researchers to uncover the basic mechanisms of disease at the molecular level. Recently, automatic machine learning (AutoML) frameworks have been employed for genomic and epigenomic data analysis. However, when analyzing such high-dimensional data, existing AutoML frameworks suffer from two issues: (i) they cannot effectively filter out redundant features from the original data, and (ii) they usually build the machine learning pipeline by performing feature engineering first and algorithm hyper-parameter tuning later, which can lead to sub-optimal outcomes. Thus, there is an urgent need to design a new AutoML framework for high-dimensional omics data analysis.

Authors

  • Yang Bai
    Key Laboratory of Digital Medical Engineering of Hebei Province, College of Electronic and Information Engineering, Hebei University, Baoding 071000, Hebei, China.
  • Yang Li
    Occupation of Chinese Center for Disease Control and Prevention, Beijing, China.
  • Yu Shen
    Key Laboratory of Flexible Electronics (KLOFE) & Institute of Advanced Materials (IAM), Nanjing Tech University (NanjingTech), 30 South Puzhu Road, Nanjing 211816, P. R. China.
  • Mingyu Yang
    Department of Orthopedics, Orthopedic Center of Chinese PLA, Southwest Hospital, Third Military Medical University, Chongqing 400038, P. R. China.
  • Wentao Zhang
    Department of Sports Medicine and Rehabilitation, Peking University Shenzhen Hospital, Shenzhen 518036, China.
  • Bin Cui
    Department of Radiology, Aerospace Center Hospital, Beijing 100049, China.