A graph-embedded deep feedforward network for disease outcome classification and feature selection using gene expression data.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Gene expression data present a unique challenge in predictive model building because the number of samples (n) is small compared with the very large number of features (p). This 'n≪p' property has hampered the application of deep learning techniques to disease outcome classification. Sparse learning that incorporates external gene network information is a potential solution. Still, the problem remains very challenging because (i) there are tens of thousands of features but only hundreds of training samples, and (ii) the scale-free structure of the gene network is poorly suited to the setup of convolutional neural networks.

Authors

  • Yunchuan Kong
    Department of Biostatistics and Bioinformatics, Emory University, 1518 Clifton Rd, Atlanta, GA, 30322, USA.
  • Tianwei Yu
    Department of Biostatistics and Bioinformatics, Emory University, 1518 Clifton Rd, Atlanta, GA, 30322, USA. tianwei.yu@emory.edu.