iHerd: an integrative hierarchical graph representation learning framework to quantify network changes and prioritize risk genes in disease.

Journal: PLoS computational biology
PMID:

Abstract

Genes form complex networks within cells to carry out critical cellular functions, and alterations to these networks can introduce downstream transcriptome perturbations and phenotypic variations. Efficient and interpretable methods to quantify network changes and pinpoint driver genes across conditions are therefore crucial. We propose iHerd, a hierarchical graph representation learning method. Given a set of networks, iHerd first hierarchically generates a series of coarsened sub-graphs in a data-driven manner, representing network modules at different resolutions (e.g., the level of signaling pathways). It then sequentially learns low-dimensional node representations at all hierarchical levels via efficient graph embedding. Lastly, in its graph alignment module, iHerd projects the separate gene embeddings onto the same latent space to calculate a rewiring index for driver gene prioritization. To demonstrate its effectiveness, we applied iHerd to a tumor-to-normal gene regulatory network (GRN) rewiring analysis and a cell-type-specific gene co-expression network (GCN) analysis using single-cell multiome data of the brain. We showed that iHerd can effectively pinpoint novel and well-known risk genes in different diseases. Unlike existing models, iHerd's graph coarsening for hierarchical learning allows network driver genes to be classified into early and late divergent genes (EDGs and LDGs), emphasizing genes with extensive network changes across and within signaling pathway levels. This approach to driver gene classification can provide deeper molecular insights. The code is freely available at https://github.com/aicb-ZhangLabs/iHerd. All other relevant data are within the manuscript and supporting information files.
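
The embed, align, and score steps described in the abstract can be illustrated with a minimal sketch. It is not the iHerd implementation. The abstract does not specify the embedding or alignment algorithms, so this sketch assumes a plain spectral embedding and an orthogonal Procrustes rotation as stand-ins, and it omits the hierarchical coarsening step entirely. The function names (`spectral_embedding`, `procrustes_align`, `rewiring_index`) are hypothetical and chosen only for this illustration.

```python
import numpy as np


def spectral_embedding(adj, dim):
    """Embed nodes with the `dim` largest-magnitude eigenpairs of a symmetric
    adjacency matrix (an assumed stand-in for the paper's graph embedding)."""
    vals, vecs = np.linalg.eigh(adj)
    order = np.argsort(-np.abs(vals))[:dim]
    return vecs[:, order] * np.sqrt(np.abs(vals[order]))


def procrustes_align(source, target):
    """Rotate `source` onto `target` using the orthogonal Procrustes solution
    (an assumed stand-in for the paper's graph alignment module)."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)


def rewiring_index(adj_a, adj_b, dim=2):
    """Per-node distance between the two embeddings after alignment.
    Rows of both matrices must refer to the same genes in the same order."""
    emb_a = spectral_embedding(adj_a, dim)
    emb_b = spectral_embedding(adj_b, dim)
    aligned_a = procrustes_align(emb_a, emb_b)
    return np.linalg.norm(aligned_a - emb_b, axis=1)


# Toy example: four genes wired as a path in one condition, a star in the other.
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
star = np.array([[0, 1, 1, 1],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0]], dtype=float)

scores = rewiring_index(path, star)
ranking = np.argsort(-scores)  # genes ordered from most to least rewired
print(scores, ranking)
```

Under these assumptions, a gene whose neighborhood is unchanged between conditions tends to score near zero, while genes with heavily altered connections score higher. Applying the same scoring at each level of a coarsened hierarchy would be one way to separate changes across modules from changes within them, though how iHerd does this is described in the manuscript, not here.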

Authors

  • Ziheng Duan
    School of Big Data and Software Engineering, Chongqing University, Chongqing, 401331, China; College of Energy Engineering, Zhejiang University, Zhejiang, 310027, China.
  • Yi Dai
    Department of Computer Science, University of California, Irvine, CA 92617, USA.
  • Ahyeon Hwang
    Mathematical, Computational & Systems Biology, University of California, Irvine, CA 92697, USA.
  • Cheyu Lee
    Department of Computer Science, University of California, Irvine, California, United States of America.
  • Kaichi Xie
    Department of Computer Science, University of California, Davis, California, United States of America.
  • Chutong Xiao
    Department of Computer Science, University of California, Irvine, California, United States of America.
  • Min Xu
    Department of Gastroenterology, Shanghai First People's Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, People's Republic of China.
  • Matthew J Girgenti
    Department of Psychiatry, School of Medicine, Yale University, New Haven, CT 06520, USA.
  • Jing Zhang
    MOEMIL Laboratory, School of Optoelectronic Information, University of Electronic Science and Technology of China, Chengdu, China.