A hybrid variational autoencoder and WGAN with gradient penalty for tertiary protein structure generation.

Journal: Scientific Reports
PMID:

Abstract

Elucidating the tertiary structure of proteins is important for understanding their functions and interactions. While deep neural networks have advanced the prediction of a protein's native structure from its amino acid sequence, the focus on a single-structure view limits understanding of the dynamic nature of protein molecules. Acquiring a multi-structure view of protein molecules remains a broader challenge in computational structural biology. Alternative representations, such as distance matrices, offer a compact and effective way to explore and generate realistic tertiary protein structures. This paper presents TP-VWGAN, a hybrid model that generates more realistic distance matrix representations of tertiary protein structures. The model integrates the probabilistic representation learning of the Variational Autoencoder (VAE) with the realistic data generation strength of the Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP). Its main architectural modification is the incorporation of residual blocks into the VAE. Experimental results show that TP-VWGAN outperforms existing methods in generating realistic protein structures both with and without residual blocks, and that incorporating residual blocks further enhances its ability to capture key structural features. Benchmarking against existing methods also shows that the more accurately a model learns the symmetry of the generated distance matrices, the better it captures key structural features. This work is a step toward more advanced deep generative models that can explore a broader range of protein structures and be applied to drug design and protein engineering. The code and data are available at https://github.com/aalaa-sehsah/tp-vwgan.
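
To make the distance matrix representation and the symmetry property mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a structure is given as a list of 3D coordinates (for example one C-alpha atom per residue) and uses the mean absolute difference between a matrix and its transpose as one possible symmetry score. The function names, the toy coordinates, and the choice of score are all hypothetical and are not taken from the paper or its repository.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D points (assumed: one point per residue)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def symmetry_error(matrix):
    """Mean absolute difference between a square matrix and its transpose.

    A true distance matrix scores 0; a generated one may not.
    This particular score is an illustrative assumption.
    """
    n = len(matrix)
    total = sum(abs(matrix[i][j] - matrix[j][i]) for i in range(n) for j in range(n))
    return total / (n * n)


def symmetrize(matrix):
    """Average a matrix with its transpose and zero the diagonal."""
    n = len(matrix)
    return [
        [0.0 if i == j else 0.5 * (matrix[i][j] + matrix[j][i]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy coordinates, roughly 3.8 angstroms apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
real = distance_matrix(coords)

# Simulate an imperfect generated matrix by perturbing a single entry.
generated = [row[:] for row in real]
generated[0][1] += 0.4

print("real symmetry error:", symmetry_error(real))
print("generated symmetry error:", symmetry_error(generated))
print("after symmetrizing:", symmetry_error(symmetrize(generated)))
```

A matrix computed from actual coordinates is symmetric by construction, so a nonzero score on a generated matrix is a simple signal that it does not yet correspond to a consistent set of pairwise distances.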

Authors

  • Aalaa I Sehsah
    Department of Computer Science, Faculty of Computers and Information, Kafrelsheikh University, Kafr El Sheikh, 33516, Egypt. aalaa.sehsah@fci.kfs.edu.eg.
  • Afaf Mousa
    Department of Computer Science, Faculty of Computers and Information, Menoufia University, Shebin El Kom, 32511, Egypt.
  • Gamal Farouk
    Department of Computer Science, Faculty of Computers and Information, Menoufia University, Shebin El Kom, 32511, Egypt.