DeepKlapred: A deep learning framework for identifying protein lysine lactylation sites via multi-view feature fusion.

Journal: International Journal of Biological Macromolecules
PMID:

Abstract

Lysine lactylation (Kla) is a post-translational modification (PTM) that plays an important role in regulating various biological processes. Traditional experimental methods identify Kla sites with high accuracy, but they are time-consuming and labor-intensive. Recent advances in machine learning have made computational prediction of Kla sites feasible. In this study, we propose a novel framework that integrates sequence embeddings with sequence descriptors to enrich the representation of protein sequence features. The framework employs a BiGRU-Transformer architecture to capture both local and global dependencies within the sequence, and incorporates six sequence descriptors to extract biochemical properties and evolutionary patterns. A cross-attention fusion mechanism then combines the sequence embeddings with the descriptor-based features, enabling the model to capture complex interactions between the two feature representations. The model performed strongly in predicting Kla sites, achieving an accuracy of 0.998 on the training set and 0.969 on the independent set. Through attention analysis and motif discovery, it also provided insights into the sequence patterns and regions that are crucial for Kla modification. This work deepens the understanding of Kla's functional roles and has the potential to benefit future research in protein modification prediction and functional annotation.
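
To illustrate the idea behind cross-attention fusion, the sketch below lets each position of a sequence embedding attend over a set of descriptor-based feature vectors using plain scaled dot-product attention. It is not the authors' implementation: the function names, the toy vectors, and the omission of learned query/key/value projections and multiple heads are all assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where each query attends over all keys.

    queries: list of n_q vectors of size d (here: sequence embeddings)
    keys:    list of n_k vectors of size d (here: descriptor features)
    values:  list of n_k vectors (here: the same descriptor features)
    Returns (fused, weights), with one row per query in each.
    """
    d = len(queries[0])
    value_dim = len(values[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        fused.append(
            [sum(w[j] * values[j][c] for j in range(len(values)))
             for c in range(value_dim)]
        )
    return fused, weights


# Hypothetical toy inputs: 3 sequence positions, 2 descriptor vectors.
seq_embedding = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
descriptor_features = [[1.0, 0.0], [0.0, 1.0]]

fused, weights = cross_attention(
    seq_embedding, descriptor_features, descriptor_features
)
```

Each row of `weights` sums to 1 and shows how strongly one sequence position draws on each descriptor vector; a trained model would typically apply learned linear projections to the queries, keys, and values before this step.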

Authors

  • Jiahui Guan
    Nvidia, Boston, United States.
  • Peilin Xie
    Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China.
  • Danhong Dong
    School of Medicine, The Chinese University of Hong Kong, 2001 Longxiang Road, 518172 Shenzhen, China.
  • Qianchen Liu
    School of Medicine, The Chinese University of Hong Kong, 2001 Longxiang Road, 518172 Shenzhen, China.
  • Zhihao Zhao
    Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, 2001 Longxiang Road, 518172 Shenzhen, China.
  • Yilin Guo
    School of Medicine, The Chinese University of Hong Kong, 2001 Longxiang Road, 518172 Shenzhen, China.
  • Yilun Zhang
    School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China.
  • Tzong-Yi Lee
  • Lantian Yao
    Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, PR China; School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, PR China.
  • Ying-Chih Chiang
    Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, China.