Protein Design with Deep Learning.

Journal: International Journal of Molecular Sciences
Published Date:

Abstract

Computational Protein Design (CPD) has produced impressive results for engineering new proteins, resulting in a wide variety of applications. In the past few years, various efforts have aimed at replacing or improving existing design methods using Deep Learning technology to leverage the large amount of publicly available protein data. Deep Learning (DL) is a very powerful tool for extracting patterns from raw data, provided that the data are formatted as mathematical objects and that the architecture processing them is well suited to the targeted problem. In the case of protein data, specific representations are needed for both the amino acid sequence and the protein structure in order to capture 1D and 3D information, respectively. As no consensus has been reached about the most suitable representations, this review describes the representations used so far, discusses their strengths and weaknesses, and details their associated DL architectures for design and related tasks.

Authors

  • Marianne Defresne
    Toulouse Biotechnology Institute, Université de Toulouse, CNRS, INRAE, INSA, ANITI, 31077 Toulouse, France.
  • Sophie Barbe
    Laboratoire d'Ingénierie des Systèmes Biologiques et des Procédés, Université de Toulouse, CNRS, INRA, INSA, Toulouse, France.
  • Thomas Schiex
    Unité de Mathématiques et Informatique Appliquées de Toulouse, INRA, Castanet Tolosan cedex, France.