Prosit: proteome-wide prediction of peptide tandem mass spectra by deep learning.

Journal: Nature Methods
Published Date:

Abstract

In mass-spectrometry-based proteomics, the identification and quantification of peptides and proteins heavily rely on sequence database searching or spectral library matching. The lack of accurate predictive models for fragment ion intensities impairs the realization of the full potential of these approaches. Here, we extended the ProteomeTools synthetic peptide library to 550,000 tryptic peptides and 21 million high-quality tandem mass spectra. We trained a deep neural network, termed Prosit, resulting in chromatographic retention time and fragment ion intensity predictions that exceed the quality of the experimental data. Integrating Prosit into database search pipelines led to more identifications at >10× lower false discovery rates. We show the general applicability of Prosit by predicting spectra for proteases other than trypsin, generating spectral libraries for data-independent acquisition and improving the analysis of metaproteomes. Prosit is integrated into ProteomicsDB, allowing search result re-scoring and custom spectral library generation for any organism on the basis of peptide sequence alone.

Authors

  • Siegfried Gessulat
    Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany.
  • Tobias Schmidt
    Jena University Hospital, Jena, Germany.
  • Daniel Paul Zolg
    Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany.
  • Patroklos Samaras
    Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany.
  • Karsten Schnatbaum
    JPT Peptide Technologies GmbH, Berlin, Germany.
  • Johannes Zerweck
    JPT Peptide Technologies GmbH, Berlin, Germany.
  • Tobias Knaute
    JPT Peptide Technologies GmbH, Berlin, Germany.
  • Julia Rechenberger
    Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany.
  • Bernard Delanghe
    Thermo Fisher Scientific, Bremen, Germany.
  • Andreas Huhmer
    Thermo Fisher Scientific, San Jose, CA, USA.
  • Ulf Reimer
    JPT Peptide Technologies GmbH, Berlin, Germany.
  • Hans-Christian Ehrlich
    SAP SE, Potsdam, Germany.
  • Stephan Aiche
    SAP SE, Potsdam, Germany.
  • Bernhard Kuster
    Chair for Proteomics and Bioanalytics, TU Muenchen, Freising 85354, Germany; German Cancer Consortium (DKTK), Munich, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany; Center for Integrated Protein Science Munich, Munich, Germany; Bavarian Biomolecular Mass Spectrometry Center, Technische Universität München, Freising, Germany.
  • Mathias Wilhelm
    Chair for Proteomics and Bioanalytics, TU Muenchen, Freising 85354, Germany.