[AcidBasePred: a protein acid-base tolerance prediction platform based on deep learning].

Journal: Sheng wu gong cheng xue bao = Chinese journal of biotechnology
PMID:

Abstract

The structure and activity of enzymes are influenced by the pH of their environment. Understanding and distinguishing how enzymes adapt to extreme pH values is important for elucidating their molecular mechanisms and promoting their industrial application. In this study, the ESM-2 protein language model was used to encode secreted microbial proteins with optimal performance above pH 9 and below pH 5, a dataset comprising 47 725 high-pH protein sequences and 66 079 low-pH protein sequences, respectively. A deep learning model was then constructed to identify protein acid-base tolerance from amino acid sequences. The model achieved significantly higher accuracy than other methods, with an overall accuracy of 94.8%, a precision of 91.8%, and a recall of 93.4% on the test set. Furthermore, we built a website (https://enzymepred.biodesign.ac.cn) that enables users to predict acid-base tolerance by submitting the protein sequences of enzymes. This work can accelerate the application of enzymes in fields including biotechnology, pharmaceuticals, and chemicals, and it provides a powerful tool for the rapid screening and optimization of industrial enzymes.
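
The three test-set figures reported in the abstract (accuracy, precision, recall) follow the standard definitions for a binary classifier. The sketch below shows how such figures are computed from true and predicted labels. It is illustrative only: the function name `binary_metrics`, the toy labels, and the choice of which class counts as "positive" are assumptions, since the abstract does not state which of the two classes (high-pH or low-pH) was treated as positive.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute accuracy, precision and recall for a two-class problem.

    `positive` marks the class treated as positive (an assumption here;
    the abstract does not say which class that was).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    return {
        # Fraction of all sequences classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Of the sequences predicted positive, the fraction truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of the truly positive sequences, the fraction recovered.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels for illustration only (1 = hypothetical positive class).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Because the two classes in the dataset are unequal in size (47 725 versus 66 079 sequences), precision and recall depend on which class is taken as positive, whereas accuracy does not.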

Authors

  • Rong Huang
    School of Nursing, Chuanbei Medical College, Nanchong, China.
  • Hejian Zhang
    School of Biological Engineering, Tianjin University of Science & Technology, Tianjin 300457, China.
  • Min Wu
    Guizhou University of Traditional Chinese Medicine, Guiyang, Guizhou Province, China.
  • Zhiyue Men
    School of Biological Engineering, Tianjin University of Science & Technology, Tianjin 300457, China.
  • Huanyu Chu
    Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China.
  • Jie Bai
    Department of Drug Metabolism, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, No.1 Xiannongtan Street, Beijing 100050, China; Beijing Key Laboratory of Non-Clinical Drug Metabolism and PK/PD Study, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, No.1 Xiannongtan Street, Beijing 100050, China; Beijing Key Laboratory of Active Substances Discovery and Drug Ability Evaluation, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, No.1 Xiannongtan Street, Beijing 100050, China; State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, No.1 Xiannongtan Street, Beijing 100050, China.
  • Hong Chang
  • Jian Cheng
  • Xiaoping Liao
    Biodesign Centre, Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, PR China.
  • Yuwan Liu
    Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China.
  • Yajian Song
    School of Biological Engineering, Tianjin University of Science & Technology, Tianjin 300457, China.
  • Huifeng Jiang
    Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China.