A deep learning model for prediction of lysine crotonylation sites by fusing multi-features based on multi-head self-attention mechanism.
Journal:
Scientific Reports
Published Date:
May 29, 2025
Abstract
Lysine crotonylation (Kcr) is an important post-translational modification that occurs in both histone and non-histone proteins and plays a key role in a variety of biological processes, such as metabolism and cell differentiation. Rapid and accurate identification of this modification is therefore a key task in studying its biological effects. Several computational methods have been developed in the past few years, but their prediction performance leaves room for improvement. In this paper, we propose an effective model named DeepMM-Kcr, which is based on multiple features and an innovative deep learning framework. The features are of two kinds: natural language processing features and hand-crafted features. The natural language processing features comprise token embedding and positional embedding and are encoded by a transformer. The hand-crafted features comprise one-hot encoding, amino acid index and position-weighted amino acid composition and are encoded by a bidirectional long short-term memory network. The natural language processing features and hand-crafted features are then fused by a multi-head self-attention mechanism. Finally, a deep learning framework is constructed from a convolutional neural network, a bidirectional gated recurrent unit and a multilayer perceptron for robust prediction of Kcr sites. On the independent test set, DeepMM-Kcr achieves the highest accuracy among the existing models. The experimental results show that the model performs well in predicting Kcr sites. The source datasets and code (in Python) are publicly available at https://github.com/yunyunliang88/DeepMM-Kcr.
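To illustrate the simplest of the hand-crafted features named in the abstract, the following is a minimal sketch of one-hot encoding for a peptide window centred on a candidate lysine. It is not taken from the DeepMM-Kcr repository: the function name `one_hot_encode`, the padding symbol `X`, the alphabet ordering and the toy window are all assumptions made for illustration, and the window length used by the authors is not stated in the abstract.

```python
# Illustrative sketch only; names and conventions here are assumptions,
# not the DeepMM-Kcr implementation.

# The 20 standard amino acids in alphabetical one-letter order (assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical padding symbol for windows that run past a protein terminus.
PAD = "X"


def one_hot_encode(window):
    """Encode a peptide window as a list of 20-dimensional binary rows.

    Each residue becomes a row with a single 1 at its alphabet index.
    Any symbol outside the alphabet (including PAD) becomes an all-zero row.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for residue in window:
        row = [0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1
        encoded.append(row)
    return encoded


# Toy 5-residue window with the candidate lysine (K) at the centre.
window = "XAKCR"
matrix = one_hot_encode(window)
```

The resulting matrix has one row per window position and 20 columns. In a pipeline like the one the abstract describes, such a matrix would be one of several inputs passed to a sequence encoder, alongside the other hand-crafted features.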