Variability analysis of LC-MS experimental factors and their impact on machine learning.

Journal: GigaScience

Abstract

BACKGROUND: Machine learning (ML) technologies, especially deep learning (DL), have gained increasing attention in predictive mass spectrometry (MS) for enhancing the data-processing pipeline, from raw data analysis to end-user predictions and rescoring. ML models need large-scale datasets for training and repurposing, which can be obtained from a range of public data repositories. However, applying ML to public MS datasets at scale is challenging because these datasets vary widely in data acquisition methods, biological systems, and experimental designs.

Authors

  • Tobias Greisager Rehfeldt
    Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark.
  • Konrad Krawczyk
    NaturalAntibody.
  • Simon Gregersen Echers
    Department of Chemistry and Bioscience, Aalborg University, 9220 Aalborg, Denmark.
  • Paolo Marcatili
    Department of Bio and Health Informatics, Technical University of Denmark, Kongens Lyngby, Denmark.
  • Pawel Palczynski
    Department of Materials, Imperial College London, London SW7 2AZ, U.K.
  • Richard Röttger
    Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark.
  • Veit Schwämmle
    Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark.