iALP: Identification of Allergenic Proteins Based on Large Language Model and Gate Linear Unit.

Journal: Interdisciplinary sciences, computational life sciences
Published Date:

Abstract

The rising incidence of allergic disorders has become a pressing public health issue worldwide, underscoring the need for intensified research and effective interventions. Accurate identification of allergenic proteins (ALPs) is essential for preventing allergic reactions and mitigating health risks to individuals. Although machine learning and deep learning techniques have been widely applied to ALP identification, existing methods are often limited in their ability to capture the complex features of ALPs. In response, we introduce iALP, a novel method that combines the large language model ProtT5 with the gate linear unit (GLU) to identify ALPs. The features extracted by ProtT5 enable an in-depth analysis of the complex characteristics of ALPs, while the GLU captures the intricate nonlinear features hidden within these proteins. iALP achieves an accuracy and F1-score of 0.957 on the test set and outperforms the leading predictors on a separate dataset. We also discuss in detail how the model performs on protein sequences shorter than 100 amino acids. We hope that iALP will facilitate accurate ALP prediction, thereby supporting allergy symptom prevention and the implementation of allergen prevention and treatment strategies. The iALP source code and datasets for prediction tasks are available from the GitHub repository at https://github.com/xialab-ahu/iALP.git.
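
To illustrate the gating idea mentioned in the abstract, the following is a minimal sketch of a gated linear unit applied to a fixed-length feature vector, such as a pooled protein embedding. It is not the iALP implementation: the function names, the toy dimensions, and the random weights are all assumptions made for illustration, and the authors' actual architecture should be taken from their repository. The sketch uses the common GLU form, in which one linear projection of the input is multiplied element-wise by the sigmoid of a second linear projection.

```python
import math
import random


def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(x, weights, bias):
    """Affine map: one output per row of `weights` (rows are output units)."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]


def glu(x, w_value, b_value, w_gate, b_gate):
    """Gated linear unit: (W_v x + b_v) * sigmoid(W_g x + b_g), element-wise."""
    values = linear(x, w_value, b_value)
    gates = [sigmoid(g) for g in linear(x, w_gate, b_gate)]
    return [v * g for v, g in zip(values, gates)]


def random_matrix(rows, cols, rng):
    """Small random weights, standing in for trained parameters (hypothetical)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
embedding_dim = 8  # toy size; a real language-model embedding is far larger
hidden_dim = 4     # hypothetical hidden width

# Stand-in for a per-protein embedding vector (random values, not real data).
embedding = [rng.uniform(-1.0, 1.0) for _ in range(embedding_dim)]

gated = glu(
    embedding,
    random_matrix(hidden_dim, embedding_dim, rng), [0.0] * hidden_dim,
    random_matrix(hidden_dim, embedding_dim, rng), [0.0] * hidden_dim,
)
print(len(gated))
```

Because each gate lies strictly between 0 and 1, it scales how much of each projected feature passes through, which is what makes the unit nonlinear. In a classifier, the gated vector would typically feed further layers ending in a single allergen probability; that downstream design is left out here because the abstract does not specify it.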

Authors

  • Bing Zhang
    School of Information Science and Engineering, Yanshan University, Hebei Avenue, Qinhuangdao, 066004, China.
  • Jianping Zhao
    National Center for Natural Products Research, Research Institute of Pharmaceutical Sciences, University of Mississippi, Oxford, MS, United States.
  • Yannan Bin
    Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China.
  • Junfeng Xia
