A novel sequence alignment algorithm based on deep learning of the protein folding code.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: From evolutionary inference and function annotation to structural prediction, protein sequence comparison has provided crucial biological insights. While many sequence alignment algorithms have been developed, existing approaches often cannot detect hidden structural relationships in the 'twilight zone' of low sequence identity. To address this critical problem, we introduce a computational algorithm that performs protein Sequence Alignments from deep-Learning of Structural Alignments (SAdLSA, silent 'd'). The key idea is to implicitly learn the protein folding code from many thousands of structural alignments using experimentally determined protein structures.

Authors

  • Mu Gao
    Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, 30332, USA.
  • Jeffrey Skolnick
    School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.