Enhancing Machine-Learning Prediction of Enzyme Catalytic Temperature Optima through Amino Acid Conservation Analysis.

Journal: International Journal of Molecular Sciences
PMID:

Abstract

Enzymes play a crucial role in industrial production and pharmaceutical development, serving as catalysts for numerous biochemical reactions. Determining the optimal catalytic temperature (Topt) of an enzyme is essential for optimizing reaction conditions, enhancing catalytic efficiency, and accelerating industrial processes. However, experimentally determined Topt data are limited and existing computational methods predict Topt with insufficient accuracy, so a computational approach that predicts enzyme Topt values accurately is urgently needed. In this study, using phosphatases (EC 3.1.3.X) as an example, we constructed a machine learning model that uses amino acid frequencies and protein molecular weight as features and the K-nearest neighbors regression algorithm to predict enzyme Topt. When engineering enzymes for thermostability, researchers usually avoid modifying conserved amino acids. We therefore used this model to predict the Topt of phosphatase sequences after removing conserved amino acids. The model's mean coefficient of determination (R²) increased from 0.599 for the complete sequences to 0.755 for the sequences with conserved amino acids removed. Subsequent experimental validation on 10 phosphatases with previously undetermined optimal catalytic temperatures showed that, for most of them, the values predicted from sequences without conserved amino acids were closer to the experimentally measured optimum than those predicted from the complete sequences. This study lays the foundation for the rapid selection of enzymes suitable for industrial conditions.
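
The workflow described in the abstract (remove conserved residues, compute amino acid frequencies and molecular weight, then apply K-nearest neighbors regression) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the conservation threshold, the use of a pre-computed multiple sequence alignment, the approximate residue masses, and the unscaled Euclidean distance are all assumptions made for illustration; the abstract does not state how conservation was defined or how features were scaled.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Approximate average residue masses in daltons (assumed, rounded values).
RESIDUE_MASS = {
    "A": 71.08, "C": 103.14, "D": 115.09, "E": 129.12, "F": 147.18,
    "G": 57.05, "H": 137.14, "I": 113.16, "K": 128.17, "L": 113.16,
    "M": 131.19, "N": 114.10, "P": 97.12, "Q": 128.13, "R": 156.19,
    "S": 87.08, "T": 101.10, "V": 99.13, "W": 186.21, "Y": 163.18,
}
WATER_MASS = 18.02  # one water per chain (free N- and C-termini)


def strip_conserved(aligned, threshold=1.0):
    """Remove alignment columns where one residue occurs in at least
    `threshold` of all sequences; return the remaining ungapped sequences.

    `aligned` is a list of equal-length strings with '-' for gaps
    (hypothetical input format; the paper's criterion may differ).
    """
    keep = []
    for col in range(len(aligned[0])):
        residues = [s[col] for s in aligned if s[col] != "-"]
        if not residues:
            continue  # all-gap column carries no residues
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / len(aligned) < threshold:
            keep.append(col)
    return ["".join(s[c] for c in keep if s[c] != "-") for s in aligned]


def features(seq):
    """20 amino acid frequencies followed by molecular weight in kDa.

    Non-standard residues are ignored; `seq` must contain at least one
    standard residue.
    """
    seq = [a for a in seq if a in RESIDUE_MASS]
    counts = Counter(seq)
    freqs = [counts[a] / len(seq) for a in AMINO_ACIDS]
    mass_kda = (sum(RESIDUE_MASS[a] for a in seq) + WATER_MASS) / 1000.0
    return freqs + [mass_kda]


def knn_predict(train_X, train_y, x, k=3):
    """Predict a value for `x` as the mean label of its k nearest
    training points under Euclidean distance (plain KNN regression)."""
    ranked = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))
    nearest = [y for _, y in ranked[:k]]
    return sum(nearest) / len(nearest)
```

In use, one would align the training and query sequences together, pass the alignment through `strip_conserved`, compute `features` for each reduced sequence, and call `knn_predict` with the measured optimal temperatures of the training enzymes as `train_y`. Because molecular weight and residue frequencies sit on different scales, standardizing the features before computing distances is likely advisable; that step is omitted here for brevity.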

Authors

  • Yinyin Cao
    Cardiovascular Center, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai 201102, China. Electronic address: yinyin19881126@126.com.
  • Boyu Qiu
    Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China.
  • Xiao Ning
    Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China.
  • Lin Fan
    Department of Cardiovascular Surgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
  • Yanmei Qin
    Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China.
  • Dong Yu
    State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Hospital/Cancer Institute, Fudan University, Shanghai 200438, China.
  • Chunhe Yang
    College of Biotechnology, Tianjin University of Science and Technology, Tianjin 300457, China.
  • Hongwu Ma
    Biodesign Centre, Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, PR China. Electronic address: ma_hw@tib.cas.cn.
  • Xiaoping Liao
    Biodesign Centre, Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, PR China.
  • Chun You
    Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China. you_c@tib.cas.cn.