Deep graph representations embed network information for robust disease marker identification.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Accurate disease diagnosis and prognosis based on omics data rely on the effective identification of robust prognostic and diagnostic markers that reflect the states of the biological processes underlying the disease pathogenesis and progression. In this article, we present GCNCC, a Graph Convolutional Network-based approach for Clustering and Classification, that can identify highly effective and robust network-based disease markers. Based on a geometric deep learning framework, GCNCC learns deep network representations by integrating gene expression data with protein interaction data to identify highly reproducible markers whose prediction performance remains consistently accurate across independent datasets, including datasets from different platforms. GCNCC identifies these markers by clustering the nodes in the protein interaction network based on latent similarity measures learned by the deep architecture of a graph convolutional network, followed by a supervised feature selection procedure that extracts clusters that are highly predictive of the disease state.
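
The three stages named in the abstract (graph-convolutional embedding of the interaction network, clustering of nodes by latent similarity, and supervised selection of predictive clusters) can be illustrated with a minimal sketch. This is not the GCNCC implementation. Everything below is an assumption made for illustration: the data are synthetic, the layer weights are random and untrained, the propagation rule is the widely used symmetric-normalized graph convolution, plain k-means stands in for the clustering step, and an absolute Welch t-statistic on mean cluster expression stands in for the supervised feature selection. All function names are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_embed(A, X, weights):
    """Propagate node features through graph-convolution layers.
    Assumption: weights are given (here random), not learned."""
    A_norm = normalized_adjacency(A)
    H = X
    for i, W in enumerate(weights):
        H = A_norm @ H @ W
        if i < len(weights) - 1:
            H = np.maximum(H, 0.0)  # ReLU between layers, none after the last
    return H

def kmeans(Z, k, n_iter=50, seed=0):
    """Plain k-means on the embeddings (stand-in for the clustering step)."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):  # leave an empty cluster's center unchanged
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

def cluster_scores(E, y, labels):
    """Score each cluster by the absolute Welch t-statistic of its mean
    expression between the two classes (stand-in for feature selection)."""
    scores = {}
    for c in np.unique(labels):
        activity = E[:, labels == c].mean(axis=1)  # one value per sample
        a, b = activity[y == 1], activity[y == 0]
        se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
        scores[int(c)] = abs(a.mean() - b.mean()) / (se + 1e-12)
    return scores

# Toy inputs: all synthetic, for illustration only.
rng = np.random.default_rng(0)
n_genes, n_samples = 12, 20
upper = np.triu(rng.random((n_genes, n_genes)) < 0.2, k=1).astype(float)
A = upper + upper.T                          # undirected toy interaction network
E = rng.normal(size=(n_samples, n_genes))    # expression matrix, samples x genes
y = np.array([0] * 10 + [1] * 10)            # binary disease labels

# Each gene's node feature vector is its expression profile across samples.
weights = [rng.normal(size=(n_samples, 8)), rng.normal(size=(8, 4))]
Z = gcn_embed(A, E.T, weights)               # genes x 4 latent representation
labels = kmeans(Z, k=3)                      # cluster genes by latent similarity
scores = cluster_scores(E, y, labels)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)                                # cluster ids, most discriminative first
```

In a real setting the weights would be learned rather than drawn at random, and the top-ranked clusters would be evaluated as classifier features on an independent dataset, which is the cross-dataset reproducibility the abstract emphasizes.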

Authors

  • Omar Maddouri
    Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Kingdom of Saudi Arabia.
  • Xiaoning Qian
    Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA.
  • Byung-Jun Yoon
    Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA. bjyoon@ece.tamu.edu.