MIND-S is a deep-learning prediction model for elucidating protein post-translational modifications in human diseases.

Journal: Cell Reports Methods
PMID:

Abstract

We present a deep-learning-based platform, MIND-S, for protein post-translational modification (PTM) predictions. MIND-S employs multi-head attention and a graph neural network, and assembles a 15-fold ensemble model in a multi-label strategy to enable simultaneous prediction of multiple PTMs with high performance and computational efficiency. MIND-S also features an interpretation module, which reports the relevance of each amino acid to a prediction and is validated against known motifs. The interpretation module also captures PTM patterns without any supervision. Furthermore, MIND-S enables examination of mutation effects on PTMs. We document a workflow, its application to 26 types of PTMs across two datasets consisting of ∼50,000 proteins, and an example of MIND-S identifying a PTM-interrupting SNP with validation from biological data. We also include use case analyses of targeted proteins. Taken together, these results demonstrate that MIND-S is accurate, interpretable, and efficient for elucidating PTM-relevant biological processes in health and disease.
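
The ensemble, multi-label, and mutation-effect ideas in the abstract can be illustrated with a minimal sketch. This is not the MIND-S implementation: the toy predictors, the two-label subset, the function names, and the choice to combine ensemble members by simple averaging are all assumptions made for illustration, since the abstract does not state how member outputs are combined.

```python
from statistics import fmean
from typing import Callable, Dict, List

# Hypothetical model type: maps (sequence, site index) to per-PTM probabilities.
Model = Callable[[str, int], Dict[str, float]]

# Illustrative subset of labels, not the 26 PTM types covered by MIND-S.
PTM_LABELS = ["phosphorylation", "ubiquitination"]


def make_toy_model(bias: float) -> Model:
    """Build a stand-in predictor; a real member would be a trained network."""
    def predict(sequence: str, site: int) -> Dict[str, float]:
        residue = sequence[site]
        return {
            "phosphorylation": min(1.0, (0.8 if residue in "STY" else 0.1) + bias),
            "ubiquitination": min(1.0, (0.7 if residue == "K" else 0.05) + bias),
        }
    return predict


def ensemble_predict(models: List[Model], sequence: str, site: int) -> Dict[str, float]:
    """Average each label's probability over all ensemble members (assumed rule)."""
    outputs = [model(sequence, site) for model in models]
    return {label: fmean(o[label] for o in outputs) for label in PTM_LABELS}


def mutation_effect(models: List[Model], sequence: str, site: int,
                    new_residue: str) -> Dict[str, float]:
    """Per-label change in ensemble probability after substituting one residue."""
    mutant = sequence[:site] + new_residue + sequence[site + 1:]
    wild = ensemble_predict(models, sequence, site)
    mut = ensemble_predict(models, mutant, site)
    return {label: mut[label] - wild[label] for label in PTM_LABELS}


# Fifteen toy members stand in for the 15-fold ensemble.
models = [make_toy_model(bias=0.01 * i) for i in range(15)]
wild_type = "MKTSAYL"                                # made-up sequence
scores = ensemble_predict(models, wild_type, 3)      # site 3 is a serine
delta = mutation_effect(models, wild_type, 3, "A")   # S -> A substitution
```

In this toy setting, a large negative `delta["phosphorylation"]` flags the substitution as one that may disrupt the modification, which is the kind of signal the abstract describes for PTM-interrupting SNPs.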

Authors

  • Yu Yan
    School of Preclinical Medicine, Guangxi Medical University, No. 22, Shuangyong Road, Nanning, Guangxi 530021, China.
  • Jyun-Yu Jiang
    Department of Computer Science, University of California, Los Angeles, CA, United States.
  • Mingzhou Fu
    Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
  • Ding Wang
  • Alexander R Pelletier
    Department of Biochemistry, Microbiology and Immunology and Ottawa Institute of Systems Biology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa, Ontario K1H 8M5, Canada.
  • Dibakar Sigdel
    NIH BD2K Program Centers of Excellence for Big Data Computing-Heart BD2K Center, Departments of Physiology, Medicine/Cardiology, and Bioinformatics, David Geffen School of Medicine, University of California, Los Angeles, California.
  • Dominic C M Ng
    NIH BRIDGE2AI Center at UCLA & NHLBI Integrated Cardiovascular Data Science Training Program at UCLA, Suite 1-609, MRL Building, 675 Charles E. Young Dr. South, Los Angeles, CA 90095-1760, USA.
  • Wei Wang
    State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macau 999078, China.
  • Peipei Ping
    From the NIH BD2K Center of Excellence for Biomedical Computing at UCLA, Los Angeles, CA (P.P., K.W., A.B.); and NIH BD2K KnowEng Center of Excellence for Biomedical Computing at UIUC, Urbana, IL (J.H.). pping38@g.ucla.edu.