LEGOLAS: A Machine Learning Method for Rapid and Accurate Predictions of Protein NMR Chemical Shifts.
Journal:
Journal of Chemical Theory and Computation
PMID:
40211504
Abstract
This work introduces LEGOLAS, a fully open-source, TorchANI-based neural network model designed to predict NMR chemical shifts for protein backbone atoms (N, Cα, Cβ, C', HN, Hα). LEGOLAS is designed to be fast without loss of accuracy: it predicts backbone chemical shifts with root-mean-square errors of 2.53 ppm for N, 0.91 ppm for Cα, 1.14 ppm for Cβ, 1.02 ppm for C', 0.49 ppm for amide protons, and 0.27 ppm for Hα. The program predicts chemical shifts an order of magnitude faster than the widely used SHIFTX2 model. This speed makes it practical to predict NMR chemical shifts for a very large number of input structures, such as frames from a molecular dynamics (MD) trajectory. In our simulation of the protein BBL from , we observe that averaging the chemical shift predictions over a set of frames from an MD trajectory substantially improves agreement with experiment compared with using a single frame. We also show that LEGOLAS can be successfully applied to the problem of recognizing the native state of a protein among a set of decoys.
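The frame-averaging comparison described in the abstract can be illustrated with a minimal sketch. It does not call LEGOLAS or TorchANI. The function names (`rmse`, `average_over_frames`) are hypothetical, and the shift values are made up purely to show the arithmetic, assuming per-frame predictions are already available as lists of ppm values in a fixed atom order.

```python
import math
from statistics import fmean


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length lists of shifts (ppm)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return math.sqrt(fmean((p - o) ** 2 for p, o in zip(predicted, observed)))


def average_over_frames(per_frame_shifts):
    """Average predicted shifts atom by atom across MD frames.

    per_frame_shifts is a list of frames; each frame is a list of
    predicted shifts in the same atom order.
    """
    return [fmean(atom_values) for atom_values in zip(*per_frame_shifts)]


# Made-up illustrative values (ppm) for three amide protons; NOT data from the paper.
experimental = [8.10, 8.35, 7.90]
frames = [
    [8.40, 8.00, 8.20],  # hypothetical prediction for frame 0
    [7.90, 8.60, 7.70],  # hypothetical prediction for frame 1
    [8.00, 8.50, 7.80],  # hypothetical prediction for frame 2
]

averaged = average_over_frames(frames)
single_frame_rmse = rmse(frames[0], experimental)
averaged_rmse = rmse(averaged, experimental)

print(f"single-frame RMSE: {single_frame_rmse:.3f} ppm")
print(f"frame-averaged RMSE: {averaged_rmse:.3f} ppm")
```

The same `rmse` function could, under the same assumptions, be used to rank candidate structures by agreement with experiment, which is the idea behind separating a native state from decoys. The toy numbers here were chosen so that averaging helps; they say nothing about how large the improvement is in practice.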