Predicting glycan structure from tandem mass spectrometry via deep learning.

Journal: Nature Methods
PMID:

Abstract

Glycans constitute the most complicated post-translational modification, modulating protein activity in health and disease. However, structural annotation from tandem mass spectrometry (MS/MS) data is a bottleneck in glycomics, preventing high-throughput endeavors and relegating glycomics to a few experts. Here we present CandyCrunch, a dilated residual neural network trained on a newly curated set of 500,000 annotated MS/MS spectra, which predicts glycan structure from raw liquid chromatography-MS/MS data in seconds (top-1 accuracy: 90.3%). We developed an open-access Python-based workflow of raw data conversion and prediction, followed by automated curation and fragment annotation, with predictions recapitulating and extending expert annotation. We demonstrate that this workflow can be used for de novo annotation, diagnostic fragment identification and high-throughput glycomics. For maximum impact, the entire pipeline is tightly integrated with our glycowork platform and can be easily tested online (https://colab.research.google.com/github/BojarLab/CandyCrunch/blob/main/CandyCrunch.ipynb). We envision that CandyCrunch will democratize structural glycomics and the elucidation of the biological roles of glycans.

Authors

  • James Urban
    Department of Chemistry and Molecular Biology, University of Gothenburg, Gothenburg, Sweden.
  • Chunsheng Jin
    Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.
  • Kristina A Thomsson
    Proteomics Core Facility at Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.
  • Niclas G Karlsson
    Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.
  • Callum M Ives
    Department of Chemistry and Hamilton Institute, Maynooth University, Maynooth, Ireland.
  • Elisa Fadda
    Department of Chemistry and Hamilton Institute, Maynooth University, Maynooth W23 F2H6, Ireland.
  • Daniel Bojar
    Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Department of Biological Engineering and Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.