ResNeXt-Based Rescoring Model for Proteoform Characterization in Top-Down Mass Spectra.

Journal: Interdisciplinary sciences, computational life sciences
Published Date:

Abstract

In top-down proteomics, accurately identifying and characterizing proteoforms by mass spectrometry is a critical objective. Multiple primary structure alterations in proteins generate a diverse range of proteoforms, so the number of potential proteoforms grows exponentially. Moreover, the absence of a definitive reference set complicates the standardization of results. Enhancing the accuracy of proteoform characterization therefore remains a significant challenge. We introduce PrSMBooster, a ResNeXt-based deep learning model for rescoring proteoform-spectrum matches (PrSMs) during proteoform characterization. As an ensemble method, PrSMBooster integrates four machine learning models (logistic regression, XGBoost, decision tree, and support vector machine) as weak learners to obtain PrSM features. The basic and latent features of each PrSM are then fed into the ResNeXt model for final rescoring. To verify the effect and accuracy of PrSMBooster in rescoring, it was compared with the characterization algorithm TopPIC on 47 independent mass spectrometry datasets from various species. The experimental results indicate that, in most datasets, the number of PrSMs identified at a false discovery rate (FDR) of 1% increases after rescoring with PrSMBooster. Further analysis of the experimental results confirmed that PrSMBooster improves the accuracy of PrSM scoring, yields more mass spectrometry characterization results, and demonstrates strong generalization ability.
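
The evaluation criterion above, the number of PrSMs retained at an FDR of 1%, can be illustrated with a minimal sketch. It assumes a target-decoy estimate of FDR (decoy count divided by target count above a score cutoff), which the abstract does not specify. The function name `count_prsms_at_fdr`, the `(score, is_decoy)` tuple layout, and the toy scores are hypothetical and are not taken from PrSMBooster or TopPIC.

```python
from typing import List, Tuple


def count_prsms_at_fdr(prsms: List[Tuple[float, bool]],
                       fdr_threshold: float = 0.01) -> int:
    """Count target PrSMs accepted at the given FDR threshold.

    Each PrSM is a (score, is_decoy) pair; a higher score means a more
    confident match. Assumption: FDR at a cutoff is estimated as
    (#decoys above cutoff) / (#targets above cutoff).
    """
    # Walk down the list from the best-scoring PrSM to the worst.
    ranked = sorted(prsms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    accepted = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the largest target count whose estimated FDR is acceptable.
        if targets > 0 and decoys / targets <= fdr_threshold:
            accepted = targets
    return accepted


# Toy data only: the same four PrSMs under two hypothetical scorings.
original = [(0.90, False), (0.80, True), (0.70, False), (0.60, False)]
rescored = [(0.95, False), (0.20, True), (0.90, False), (0.85, False)]

print(count_prsms_at_fdr(original))  # 1
print(count_prsms_at_fdr(rescored))  # 3
```

In this toy case the rescoring pushes the decoy below all target matches, so more target PrSMs pass the same 1% threshold. This is the kind of gain the comparison against TopPIC is measuring, although the real pipeline may handle score ties and FDR estimation differently.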

Authors

  • Jiancheng Zhong
    School of Information Science and Engineering, Hunan Normal University, Changsha, China.
  • Yicheng Luo
    College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China.
  • Chen Yang
    Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, China.
  • Maoqi Yuan
    College of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China.
  • Shaokai Wang
    David R. Cheriton School of Computer Science, University of Waterloo, Canada.