BioStructNet: Structure-Based Network with Transfer Learning for Predicting Biocatalyst Functions.

Journal: Journal of Chemical Theory and Computation
PMID:

Abstract

Enzyme-substrate interactions are essential to both biological processes and industrial applications. Advanced machine learning techniques have significantly accelerated biocatalysis research, revolutionizing the prediction of biocatalytic activities and facilitating the discovery of novel biocatalysts. However, the limited availability of data for specific enzyme functions, such as conversion efficiency and stereoselectivity, presents challenges for prediction accuracy. In this study, we developed BioStructNet, a structure-based deep learning network that integrates both protein and ligand structural data to capture the complexity of enzyme-substrate interactions. Benchmarking against different algorithms showed that BioStructNet achieves enhanced predictive accuracy. To further optimize prediction accuracy for small data sets, we implemented transfer learning in the framework, training a source model on a large data set and fine-tuning it on a small, function-specific data set, using the CalB data set as a case study. The model performance was validated by comparing the attention heat maps generated by the BioStructNet interaction module with the enzyme-substrate interactions revealed by molecular dynamics simulations of enzyme-substrate complexes. BioStructNet could accelerate the discovery of functional enzymes for industrial use, particularly in cases where the training data sets for machine learning are small.
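
The transfer-learning strategy described in the abstract (train a source model on a large data set, then fine-tune it on a small, function-specific one) can be illustrated with a minimal sketch. This is not the BioStructNet architecture or its training code: the tiny one-hidden-layer network, the synthetic data, and all names (init_params, train, freeze_encoder, Xs, Xt) are assumptions made purely for illustration. Here fine-tuning keeps the pretrained hidden layer frozen and updates only the output layer, which is one common variant among several.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_in, n_hidden):
    # Hypothetical toy model: one tanh hidden layer ("encoder") plus a linear head.
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(0.0, 0.5, n_hidden),
        "b2": np.zeros(1),
    }

def forward(p, X):
    H = np.tanh(X @ p["W1"] + p["b1"])   # hidden features
    return H, H @ p["w2"] + p["b2"]      # features, predictions

def mse(p, X, y):
    return float(np.mean((forward(p, X)[1] - y) ** 2))

def train(p, X, y, lr, steps, freeze_encoder=False):
    # Full-batch gradient descent on 0.5 * MSE; returns a trained copy of p.
    p = {k: v.copy() for k, v in p.items()}
    n = len(y)
    for _ in range(steps):
        H, pred = forward(p, X)
        err = (pred - y) / n
        grad_w2 = H.T @ err
        grad_b2 = err.sum()
        if not freeze_encoder:
            # Backpropagate through tanh into the encoder weights.
            dH = np.outer(err, p["w2"]) * (1.0 - H ** 2)
            p["W1"] -= lr * (X.T @ dH)
            p["b1"] -= lr * dH.sum(axis=0)
        p["w2"] -= lr * grad_w2
        p["b2"] -= lr * grad_b2
    return p

# Synthetic stand-ins: a large "source" task and a small, related "target" task.
a = rng.normal(size=5)
Xs = rng.normal(size=(2000, 5))
ys = np.tanh(Xs @ a) + 0.05 * rng.normal(size=2000)
Xt = rng.normal(size=(20, 5))
yt = 1.5 * np.tanh(Xt @ a) - 0.5 + 0.05 * rng.normal(size=20)

# Step 1: train the source model on the large data set.
source = train(init_params(5, 8), Xs, ys, lr=0.05, steps=500)

# Step 2: fine-tune on the small data set, reusing the pretrained encoder.
fine = train(source, Xt, yt, lr=0.1, steps=200, freeze_encoder=True)

print("target MSE before fine-tuning:", mse(source, Xt, yt))
print("target MSE after fine-tuning: ", mse(fine, Xt, yt))
```

Freezing the encoder keeps the number of trainable parameters small relative to the 20 target samples. A real workflow might instead unfreeze some layers with a reduced learning rate and would evaluate on held-out target data rather than the fine-tuning set.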

Authors

  • Xiangwen Wang
    College of Computer Science and Engineering, Northwest Normal University, Lanzhou, 730070, People's Republic of China.
  • Jiahui Zhou
    Key Laboratory of Food Quality and Safety of Guangdong Province, College of Food Science, South China Agricultural University, Guangzhou, 510642, China.
  • Jane Mueller
    Department of Biocatalysis and Isotope Chemistry, Almac Sciences, Craigavon BT63 5QD, Northern Ireland, U.K.
  • Derek Quinn
    Department of Biocatalysis and Isotope Chemistry, Almac Sciences, Craigavon BT63 5QD, Northern Ireland, U.K.
  • Alexandra Carvalho
    Department of Biocatalysis and Isotope Chemistry, Almac Sciences, Craigavon BT63 5QD, Northern Ireland, U.K.
  • Thomas S Moody
    Department of Biocatalysis and Isotope Chemistry, Almac Sciences, Craigavon BT63 5QD, Northern Ireland, U.K.
  • Meilan Huang
    School of Chemistry and Chemical Engineering, Queen's University Belfast, David Keir Building, Stranmillis Road, Belfast, Northern Ireland, United Kingdom.