Accurate Predictions of Molecular Properties of Proteins via Graph Neural Networks and Transfer Learning.

Journal: Journal of chemical theory and computation
PMID:

Abstract

Machine learning has emerged as a promising approach for predicting molecular properties of proteins, as it addresses limitations of experimental and traditional computational methods. Here, we introduce GSnet, a graph neural network (GNN) trained to predict physicochemical and geometric properties, including solvation free energies, diffusion constants, and hydrodynamic radii, based on three-dimensional protein structures. By leveraging transfer learning, pretrained GSnet embeddings were adapted to predict solvent-accessible surface area (SASA) and residue-specific pKa values, achieving high accuracy and generalizability. Notably, GSnet outperformed existing protein embeddings for SASA prediction, and a locally charge-aware variant, aLCnet, approached the accuracy of simulation-based and empirical methods for pKa prediction. Our GNN framework demonstrated robustness across diverse data sets, including intrinsically disordered peptides, and scalability for high-throughput applications. These results highlight the potential of GNN-based embeddings and transfer learning to advance protein structure analysis, providing a foundation for integrating predictive models into proteome-wide studies and structural biology pipelines.
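
The transfer-learning idea described above can be illustrated with a minimal sketch. It is not the GSnet architecture: the encoder, its input features, the 10 Å cutoff, the ridge-regression head, and all function names are assumptions chosen for illustration, and the coordinates and labels are synthetic. The sketch only shows the general pattern of computing per-residue embeddings with a frozen graph encoder built from three-dimensional coordinates and then fitting a small head on those embeddings for a new per-residue target.

```python
import numpy as np

def contact_graph(coords, cutoff=10.0):
    """Adjacency matrix linking points closer than `cutoff` (assumed angstroms)."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    adj = (dists < cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj

def frozen_encoder(coords, weights, cutoff=10.0):
    """One round of mean-aggregation message passing with fixed weights.

    Hypothetical stand-in for a pretrained GNN; the weights are never updated.
    """
    adj = contact_graph(coords, cutoff)
    degree = adj.sum(axis=1, keepdims=True)
    # Illustrative node features: neighbour count and distance from the centroid.
    centroid_dist = np.linalg.norm(coords - coords.mean(axis=0), axis=1, keepdims=True)
    feats = np.hstack([degree, centroid_dist])
    # Average the neighbours' features (guarding against isolated nodes).
    neigh_mean = adj @ feats / np.maximum(degree, 1.0)
    return np.tanh(np.hstack([feats, neigh_mean]) @ weights)

def fit_ridge_head(embeddings, targets, alpha=1e-3):
    """Closed-form ridge regression on frozen embeddings (the only trained part)."""
    X = np.hstack([embeddings, np.ones((embeddings.shape[0], 1))])  # add bias column
    A = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def predict(embeddings, head):
    """Apply the fitted head to per-residue embeddings."""
    X = np.hstack([embeddings, np.ones((embeddings.shape[0], 1))])
    return X @ head

rng = np.random.default_rng(0)
coords = rng.normal(scale=8.0, size=(50, 3))   # synthetic "residue" coordinates
weights = rng.normal(scale=0.1, size=(4, 8))   # stands in for pretrained weights
emb = frozen_encoder(coords, weights)          # per-residue embeddings, shape (50, 8)
targets = rng.normal(size=50)                  # synthetic per-residue labels
head = fit_ridge_head(emb, targets)
preds = predict(emb, head)
```

Keeping the encoder frozen means that only the small head has to be fitted to the new target, which is the usual motivation for reusing pretrained embeddings when labeled data for the downstream property are scarce.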

Authors

  • Spencer Wozniak
    Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States.
  • Giacomo Janson
    Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, USA.
  • Michael Feig
    Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States.