DeepSecMS Advances DIA-Based Selenoproteome Profiling Through Cys-to-Sec Proxy Training.
Journal:
Advanced Science (Weinheim, Baden-Württemberg, Germany)
Published Date:
Jul 22, 2025
Abstract
Selenoproteins, proteins that contain the 21st amino acid selenocysteine (Sec, U), are functionally important but rare: only 25 have been characterized in the human proteome to date. Previously developed selenocysteine-specific mass spectrometry (SecMS) and the selenocysteine insertion sequence (SECIS)-independent selenoprotein database (SIS) provide effective tools for comprehensive analysis of the selenoproteome and, more importantly, hold the potential to uncover new selenoproteins. In this study, a deep learning approach is used to develop the DeepSecMS method. Because Sec is rare and chemically similar to cysteine (Cys, C), a proxy training strategy uses a large dataset of Cys-containing peptides to generate a large-scale theoretical library of Sec-containing peptides. DeepSecMS is shown to accurately predict critical features of Sec-containing peptides, including MS2 spectra, retention time (RT), and ion mobility (IM). Integrating DeepSecMS with data-independent acquisition (DIA) methods significantly enhances the identification of known selenoproteins across diverse cell types and tissues. More importantly, it facilitates the identification of numerous high-scoring potential novel selenoproteins. These findings highlight the potential of DeepSecMS to advance selenoprotein research. Moreover, the proxy training strategy may be extended to the analysis of other rare post-translational modifications.
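The Cys-to-Sec substitution behind the theoretical library can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cys_to_sec_variants` and `precursor_mz` are hypothetical, residues are treated as unmodified (no alkylation or other modifications, which a real workflow would need to handle), and the Sec residue mass is assumed to be the Cys residue mass shifted by the difference between the most abundant isotopes, 80Se and 32S.

```python
# Illustrative sketch only: enumerate Sec (U) proxies of a Cys-containing
# peptide and compute their precursor m/z. Assumes unmodified residues.

# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

# Assumption: Sec differs from Cys only by Se replacing S (80Se vs 32S).
SE_MINUS_S = 79.91652 - 31.97207
RESIDUE_MASS["U"] = RESIDUE_MASS["C"] + SE_MINUS_S

WATER = 18.01056   # mass of H2O added for the peptide termini
PROTON = 1.00728   # mass of a proton, one per charge


def cys_to_sec_variants(peptide: str) -> list[str]:
    """Return every peptide obtained by replacing exactly one C with U."""
    return [
        peptide[:i] + "U" + peptide[i + 1:]
        for i, residue in enumerate(peptide)
        if residue == "C"
    ]


def precursor_mz(peptide: str, charge: int) -> float:
    """Monoisotopic m/z of an unmodified peptide at the given charge."""
    neutral = sum(RESIDUE_MASS[residue] for residue in peptide) + WATER
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    for variant in cys_to_sec_variants("ACDCK"):
        print(variant, round(precursor_mz(variant, 2), 4))
```

In a library built this way, each Sec-containing entry would inherit predicted MS2, RT, and IM values from a model trained on Cys-containing peptides; the sketch covers only the sequence substitution and the resulting precursor mass shift.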