PhosSight: a Unified Deep Learning Framework Boosting and Accelerating Phosphoproteome Identification to Enable Biological Discoveries

Journal: bioRxiv
Published Date:

Abstract

Protein phosphorylation is a key regulator of signaling, and mass spectrometry (MS)-based phosphoproteomics is the premier technology for its analysis. However, phosphorylation profiling is hindered by acquisition biases: Data-Dependent Acquisition (DDA) suffers from stochastic undersampling and missing values, while Data-Independent Acquisition (DIA) faces computational bottlenecks and inefficiencies from vast spectral libraries. We present PhosSight, a unified deep learning framework designed to increase identification depth and accelerate searches. PhosSight features PhosDetect, a model that explicitly encodes phosphorylation-specific physicochemical features to accurately predict peptide detectability. For DDA, PhosSight leverages predicted retention time, fragment intensity, and detectability to refine site localization and rescoring, recovering marginal, low-abundance spectra. For DIA, PhosSight uses detectability-guided library pruning to remove non-detectable noise, accelerating searches without compromising sensitivity. Benchmarking on synthetic and real-world datasets confirms PhosSight's superior performance in both modes. Applying PhosSight to a large-scale Uterine Corpus Endometrial Carcinoma (UCEC) cohort significantly reduced missing values and expanded the quantifiable phosphoproteome. This enhanced completeness enabled the discovery of novel prognosis-associated kinase targets, such as MARK2, underscoring PhosSight as a powerful tool for biological discovery in precision oncology.

Authors

  • Wang, B.
  • Cheng, Z.
  • She, C.
  • Zhang, J.
  • Lv, L.
  • Zhu, H.
  • Liu, L.
  • Fu, Y.
  • Yi, X.

Categories