Reinventing gene expression connectivity through regulatory and spatial structural empowerment via principal node aggregation graph neural network.

Journal: Nucleic acids research
PMID:

Abstract

The human genome operates as a complex network of genes, and conventional representations such as text or numerical matrices capture that network poorly. Gene-to-gene relationships are more naturally depicted as graph structures. To predict gene expression, a goal shared by earlier methods such as L1000 and Enformer, we introduce a novel spatial graph neural network (GNN) approach. The approach incorporates graph features covering both regulatory and spatial structural elements. The regulatory elements include pair-wise gene correlation, biological pathways, protein-protein interaction networks, and transcription factor regulation. The spatial structural elements include chromosomal distance, histone modification, and Hi-C inferred 3D genomic features. Principal Node Aggregation models, validated independently, outperformed traditional regression and other deep learning models. By adopting the spatial GNN paradigm, our method improves the description of the network of gene interactions and surpasses previous methods in performance, scope of prediction, and initial requirements.
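
The idea of aggregating information over a gene graph built from several edge sources can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify how Principal Node Aggregation works, so the choice of four aggregators (mean, max, min, standard deviation), the function name aggregate_neighbors, and the toy genes and edges below are all assumptions made for illustration.

```python
import numpy as np


def aggregate_neighbors(features, edges):
    """Concatenate mean, max, min and std of each node's neighbour features.

    features: (n_nodes, n_feats) array of per-gene features.
    edges: iterable of (i, j) undirected gene-gene links.
    Returns an (n_nodes, 4 * n_feats) array; isolated nodes stay all-zero.
    """
    n_nodes, n_feats = features.shape
    neighbors = [[] for _ in range(n_nodes)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    out = np.zeros((n_nodes, 4 * n_feats))
    for node, nbrs in enumerate(neighbors):
        if not nbrs:
            continue  # no neighbours: leave the zero vector in place
        h = features[nbrs]  # (n_neighbours, n_feats)
        out[node] = np.concatenate(
            [h.mean(axis=0), h.max(axis=0), h.min(axis=0), h.std(axis=0)]
        )
    return out


# Hypothetical toy data: 4 genes with one expression feature each.
expr = np.array([[1.0], [2.0], [4.0], [8.0]])
# Hypothetical edges from two sources, merged into a single graph.
regulatory_edges = [(0, 1), (1, 2)]  # e.g. transcription factor regulation
spatial_edges = [(0, 2)]             # e.g. a Hi-C inferred contact
agg = aggregate_neighbors(expr, regulatory_edges + spatial_edges)
```

In this sketch each gene ends up with a summary of its neighbours across both edge types; a trained GNN layer would typically pass such summaries through learned weights, which is omitted here.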

Authors

  • Fengyao Yan
    Department of Public Health and Sciences, University of Miami, Miami, FL 33126, USA.
  • Limin Jiang
    School of Computer Science and Technology, Tianjin University, Tianjin 300350, China; School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, China.
  • Danqian Chen
    Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Ministry of Education, Northwest University, No. 229 Taibai North Road, Xi'an, 710069, Shaanxi, China.
  • Michele Ceccarelli
    Computational Biology-Genomic Research Center, ABBVIE, Redwood City, CA, USA. michele.ceccarelli@unina.it.
  • Yan Guo
    State Key Laboratory of Pathogen and Biosecurity, Beijing 100071, China.