Toward an Integrated Machine Learning Model of a Proteomics Experiment.

Journal: Journal of proteome research
Published Date:

Abstract

In recent years, machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goal of evaluating and exploring machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following the sample-to-data roadmap of a proteomics experiment helped identify knowledge gaps and define needs. The ability to generate bespoke and realistic synthetic data has legitimate and important uses in system suitability, method development, and algorithm benchmarking, while also posing critical ethical questions. The interdisciplinary nature of the workshop informed discussions of what is currently possible and of future opportunities and challenges. In this perspective, we summarize these discussions in the hope of conveying our excitement about the potential of machine learning in proteomics and inspiring future research.

Authors

  • Benjamin A Neely
    National Institute of Standards and Technology, Charleston, South Carolina 29412, United States.
  • Viktoria Dorfer
    University of Applied Sciences Upper Austria, School of Informatics, Communications and Media, Softwarepark 11, 4232 Hagenberg, Austria.
  • Lennart Martens
    VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium.
  • Isabell Bludau
    Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland.
  • Robbin Bouwmeester
    VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium.
  • Sven Degroeve
    VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium.
  • Eric W Deutsch
    Institute for Systems Biology, Seattle, Washington 98109, United States.
  • Siegfried Gessulat
    Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany.
  • Lukas Käll
    Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, Royal Institute of Technology-KTH, Box 1031, SE-17121 Solna, Sweden.
  • Pawel Palczynski
    Department of Materials, Imperial College London, London SW7 2AZ, U.K.
  • Samuel H Payne
    Department of Biology, Brigham Young University, Provo, UT 84604, USA.
  • Tobias Greisager Rehfeldt
    Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark.
  • Tobias Schmidt
    Jena University Hospital, Jena, Germany.
  • Veit Schwämmle
    Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark.
  • Julian Uszkoreit
    Ruhr University Bochum, Faculty of Medicine, Medizinisches Proteom-Center, Bochum, Germany.
  • Juan Antonio Vizcaíno
    European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
  • Mathias Wilhelm
    Chair for Proteomics and Bioanalytics, TU Muenchen, Freising 85354, Germany.
  • Magnus Palmblad
    Center for Proteomics and Metabolomics, Leiden University Medical Center, 2300 RC Leiden, The Netherlands.