ActiMut-XGB: Predicting thermodynamic stability of point mutations for CALB with protein language model.
Journal:
International Journal of Biological Macromolecules
Published Date:
Jun 1, 2025
Abstract
Predicting the functional impact of single-point mutations on protein residual activity, especially after high-temperature incubation, is critical in protein engineering. We present a machine learning model based on eXtreme Gradient Boosting (XGBoost) that uses protein sequence data to predict thermostability, circumventing the need for three-dimensional structural information. Our model integrates features from the ESM2 protein language model, physicochemical properties, evolutionary features, and positional features. A key advancement is the use of transfer learning with thermal stability data from various proteins, which enhances prediction accuracy and generalizability. To fine-tune and validate the model, we used experimental data from single-point mutants of Candida antarctica lipase B (CALB), a widely studied enzyme in biocatalysis and industrial applications. Although Gibbs free energy values may not capture all factors influencing thermostability, our model represents a significant improvement over traditional approaches, providing valuable insights for protein engineering, enzyme optimization, and therapeutic protein development.
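
The feature integration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`parse_mutation`, `mutation_features`), the choice of the Kyte-Doolittle hydropathy difference as the single physicochemical feature, and the relative-position feature are assumptions made for illustration. ESM2 embeddings and evolutionary features are not computed here; they are passed in as precomputed lists of numbers, and the resulting vector is what a gradient-boosting regressor could take as one input row.

```python
import re
from typing import List, Sequence, Tuple

# Kyte-Doolittle hydropathy scale, used here as an illustrative
# physicochemical property (an assumption, not the paper's feature set).
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Mutations are written as wild-type residue, 1-based position, mutant residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(mutation: str) -> Tuple[str, int, str]:
    """Split a string such as 'A3G' into ('A', 3, 'G')."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"not a point mutation: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if wild_type not in KD_HYDROPATHY or mutant not in KD_HYDROPATHY:
        raise ValueError(f"unknown amino acid in {mutation!r}")
    return wild_type, position, mutant


def mutation_features(
    sequence: str,
    mutation: str,
    embedding: Sequence[float] = (),     # hypothetical precomputed ESM2 features
    evolutionary: Sequence[float] = (),  # hypothetical precomputed evolutionary features
) -> List[float]:
    """Concatenate the four feature groups into one flat vector."""
    wild_type, position, mutant = parse_mutation(mutation)
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError("wild-type residue does not match the sequence")

    # Physicochemical group: change in hydropathy caused by the substitution.
    physicochemical = [KD_HYDROPATHY[mutant] - KD_HYDROPATHY[wild_type]]
    # Positional group: position scaled to [0, 1] along the chain.
    positional = [(position - 1) / max(len(sequence) - 1, 1)]

    return list(embedding) + physicochemical + list(evolutionary) + positional
```

A call such as `mutation_features("MKAL", "A3G", embedding=[0.1, 0.2])` returns a four-element vector: the two embedding values, the hydropathy change, and the relative position. For the transfer-learning step, one plausible route (again an assumption) is to train a booster on the multi-protein stability data first and then continue training on CALB data, which XGBoost supports through the `xgb_model` argument of its training functions.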