GraphPhos: Predict Protein-Phosphorylation Sites Based on Graph Neural Networks.

Journal: International journal of molecular sciences
PMID:

Abstract

Phosphorylation is one of the most common protein post-translational modifications, and identifying phosphorylation sites is the cornerstone of phosphorylation-related research. This paper proposes GraphPhos, a protein-phosphorylation site-prediction model based on graph neural networks that combines sequence features with structure features. The sequence features consist of manually extracted features and features computed by pre-trained protein language models; the structure feature is the secondary-structure contact map calculated from the protein's tertiary structure. These features are then fed to a graph neural network. Taking the features of the entire protein sequence and its contact graph as input, GraphPhos predicts phosphorylation sites along the entire protein. Experimental results indicate that GraphPhos improves the accuracy of serine, threonine, and tyrosine site prediction by at least 8%, 15%, and 12%, respectively, and shows an average 7% improvement in accuracy over prediction models built for individual amino acid categories.
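
The idea of passing per-residue features over a contact graph can be illustrated with a minimal sketch. This is not the GraphPhos implementation: the function names, the 8 Å distance cutoff, the use of one coordinate per residue, and the plain neighbour averaging are all assumptions chosen for illustration, and the abstract does not specify the actual architecture.

```python
import math

def contact_map(coords, cutoff=8.0):
    """Binary residue-residue contact map from one 3D point per residue.

    The 8.0 cutoff (in angstroms) is an assumed, commonly used value,
    not one taken from the paper.
    """
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = adj[j][i] = 1
    return adj

def mean_aggregate(adj, features):
    """Average each residue's feature vector with those of its contacts.

    This is a stand-in for one graph-layer update; a trained model would
    also apply learned weights and a non-linearity.
    """
    n = len(adj)
    dim = len(features[0])
    out = []
    for i in range(n):
        neighbours = [i] + [j for j in range(n) if adj[i][j]]
        out.append([
            sum(features[j][d] for j in neighbours) / len(neighbours)
            for d in range(dim)
        ])
    return out

def candidate_sites(sequence, residues="STY"):
    """Indices of serine, threonine, and tyrosine residues in the sequence."""
    return [i for i, aa in enumerate(sequence) if aa in residues]

# Toy protein: four residues on a line, the last one far from the rest.
sequence = "ASTY"
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
features = [[1.0], [2.0], [3.0], [10.0]]  # hypothetical per-residue features

adj = contact_map(coords)
updated = mean_aggregate(adj, features)
sites = candidate_sites(sequence)
```

In this toy case the first three residues are in mutual contact and end up sharing an averaged feature, while the isolated fourth residue keeps its own. A classifier would then score only the S, T, and Y positions returned by `candidate_sites`.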

Authors

  • Zeyu Wang
    Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China.
  • Xiaoli Yang
    SignalChem Lifesciences Corp., 110-13120 Vanier Place, Richmond, BC, V6V 2J2, Canada.
  • Songye Gao
    College of Computer Science and Technology, Jilin University, Changchun 130012, China.
  • Yanchun Liang
    College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, P. R. China.
  • Xiaohu Shi
    College of Computer Science and Technology, Jilin University, Qianjing Street 2699, Changchun, Jilin 130012, China.