Artificial intelligence for proteomics and biomarker discovery.

Journal: Cell Systems
Published Date:

Abstract

Biomedical data are being generated at an avalanche-like pace, matched by a parallel expansion in the computational capabilities needed to analyze and make sense of them. Starting with genome sequencing and widely employed deep sequencing technologies, these trends have now taken hold in all omics disciplines and increasingly call for multi-omics integration as well as data interpretation by artificial intelligence technologies. Here, we focus on mass spectrometry (MS)-based proteomics and describe how machine learning and, in particular, deep learning can now predict experimental peptide measurements from amino acid sequences alone. This will dramatically improve the quality and reliability of analytical workflows because experimental results should agree with predictions in a multi-dimensional data landscape. Machine learning has also become central to biomarker discovery from proteomics data, an approach that is now starting to outperform existing best-in-class assays. Finally, we discuss the model transparency, explainability, and data privacy that are required to deploy MS-based biomarkers in clinical settings.

Authors

  • Matthias Mann
    Proteomics and Signal Transduction Group. Electronic address: mmann@biochem.mpg.de.
  • Chanchal Kumar
    Translational Science & Experimental Medicine, Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden. Electronic address: chanchal.kumar@astrazeneca.com.
  • Wen-Feng Zeng
    University of Chinese Academy of Sciences, Beijing, China.
  • Maximilian T Strauss
    Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.