Accelerated Missense Mutation Identification in Intrinsically Disordered Proteins Using Deep Learning.

Journal: Biomacromolecules
PMID:

Abstract

We combine Brownian dynamics (BD) simulation results with deep learning (DL) strategies to rapidly identify large structural changes caused by missense mutations in intrinsically disordered proteins (IDPs). We used ∼6500 IDP sequences of length 20-300 from the MobiDB database to obtain gyration radii from BD simulations of a coarse-grained single-bead amino acid model (HPS2 model) used by us and others [Dignon, G. L. 2018, 14, e1005941; Tesei, G. 2021, 118, e2111696118; Seth, S. 2024, 160, 014902] to generate the training sets for the DL algorithm. Using the gyration radii ⟨R_g⟩ of the simulated IDPs as the training set, we develop a multilayer perceptron neural network (NN) architecture that predicts, with 97% accuracy, the gyration radii of 33 IDPs previously studied by BD simulation, taking as input the sequence and the corresponding parameters from the HPS model. We then use this NN to predict the gyration radii of every permutation of missense mutations in IDPs. Our approach successfully identifies mutation-prone regions where substitutions induce significant changes in the radius of gyration relative to the wild-type IDP sequence. We further validate the predictions by running BD simulations on a subset of the identified mutants. The neural network yields a (10-10)-fold faster computation in the search space for potentially harmful mutations. Our findings have substantial implications for the rapid identification and understanding of diseases related to missense mutations in IDPs and for the development of potential therapeutic interventions. The method can be extended to accurate predictions of other mutation effects in disordered proteins.
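
The mutation scan described above can be illustrated with a minimal Python sketch. It enumerates every single-residue substitution of a sequence and ranks the mutants by the relative change in predicted gyration radius with respect to the wild type. Everything named here is an assumption made for illustration: `single_missense_mutants`, `rank_mutations`, and `toy_predict_rg` are hypothetical names, and `toy_predict_rg` is a crude placeholder (a Flory-like scaling law nudged by net charge), not the trained multilayer perceptron or the HPS parameters used in the paper. Only single substitutions are scanned, which is one possible reading of "every permutation of missense mutations".

```python
from typing import Callable, Iterator, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_missense_mutants(seq: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (index, wild-type residue, new residue, mutated sequence)
    for every single substitution: 19 mutants per position."""
    for i, wt in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != wt:
                yield i, wt, aa, seq[:i] + aa + seq[i + 1:]


def rank_mutations(
    seq: str, predict_rg: Callable[[str], float], top: int = 5
) -> List[Tuple[str, float]]:
    """Return the `top` mutations with the largest |relative change| in
    predicted gyration radius compared with the wild-type sequence."""
    rg_wt = predict_rg(seq)
    scored = []
    for i, wt, aa, mutant in single_missense_mutants(seq):
        rel_change = (predict_rg(mutant) - rg_wt) / rg_wt
        scored.append((f"{wt}{i + 1}{aa}", rel_change))  # e.g. "K2E"
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top]


def toy_predict_rg(seq: str) -> float:
    """Hypothetical stand-in for the trained NN (assumption, not the
    authors' model): R_g ~ b * N**nu, with nu raised slightly by the
    absolute net charge per residue."""
    charge = {"K": 1, "R": 1, "D": -1, "E": -1}
    n = len(seq)
    ncpr = abs(sum(charge.get(a, 0) for a in seq)) / n
    return 0.38 * n ** (0.5 + 0.1 * ncpr)


if __name__ == "__main__":
    for label, change in rank_mutations("MKDEAGKKRDE", toy_predict_rg):
        print(f"{label}: {change:+.3%}")
```

In practice `predict_rg` would be replaced by the trained network, and the top-ranked mutants would be the candidates passed on to BD simulation for validation.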

Authors

  • Swarnadeep Seth
    Department of Physics, University of Central Florida, Orlando, Florida 32816-2385, United States.
  • Aniket Bhattacharya
    Department of Physics, University of Central Florida, Orlando, Florida 32816-2385, United States.