DeepAlgPro: an interpretable deep neural network model for predicting allergenic proteins.

Journal: Briefings in bioinformatics
Published Date:

Abstract

Allergies have become an emerging public health problem worldwide. The most effective way to prevent allergies is to identify the causative allergen at the source and avoid re-exposure. However, most current computational methods for identifying allergens rely on homology or conventional machine learning, which are inefficient and leave room for improvement in detecting allergens with low homology. In addition, few deep learning-based methods have been reported, even though deep learning has been successfully applied to several tasks in protein sequence analysis. In the present work, we propose a deep neural network-based model, called DeepAlgPro, to identify allergens. By comparing it to other available tools, we show its high accuracy and its applicability to large-scale prediction. Additionally, ablation experiments demonstrate the critical importance of the convolutional module in our model. Further analyses show that epitope features contribute to the model's decision-making, which improves the model's interpretability. Finally, we find that DeepAlgPro is capable of detecting potential new allergens. Overall, DeepAlgPro can serve as powerful software for identifying allergens.

Authors

  • Chun He
    State Key Laboratory of Rice Biology and Breeding & Ministry of Agricultural and Rural Affairs Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China.
  • Xinhai Ye
    College of Computer Science and Technology, Zhejiang University, Hangzhou, China.
  • Yi Yang
    Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, Chengdu, Sichuan, China.
  • Liya Hu
    Department of Intelligent Systems Engineering, Indiana University, Bloomington, IN 47405.
  • Yuxuan Si
    College of Computer Science and Technology, Zhejiang University, Hangzhou, China.
  • Xianxin Zhao
    State Key Laboratory of Rice Biology and Breeding & Ministry of Agricultural and Rural Affairs Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China.
  • Longfei Chen
    College of Electromechanical Engineering, Qingdao University of Science and Technology, Qingdao, China.
  • Qi Fang
    State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China; School of Applied Chemistry and Engineering, University of Science and Technology of China, Hefei, Anhui 230026, China.
  • Ying Wei
    School of Information Science and Engineering, Northeastern University, Shenyang 110004, China; Key Laboratory of Medical Imaging Calculation of the Ministry of Education, Shenyang 110004, China.
  • Fei Wu
    Zhejiang University, 38 Zheda Road, Hangzhou 310058, Zhejiang, China.
  • Gongyin Ye
    State Key Laboratory of Rice Biology and Breeding & Ministry of Agricultural and Rural Affairs Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China.