Protein contact prediction using metagenome sequence data and residual neural networks.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Almost all protein residue contact prediction methods rely on the availability of deep multiple sequence alignments (MSAs). However, many proteins from poorly populated families do not have a sufficient number of homologs in the conventional UniProt database. Here we aim to solve this issue by exploring the rich sequence data from metagenome sequencing projects.

Authors

  • Qi Wu
    Endoscopy Center, Peking University Cancer Hospital and Institute, Beijing, China.
  • Zhenling Peng
    Center for Applied Mathematics, Tianjin University, Tianjin, China.
  • Ivan Anishchenko
    Computational Biology Program, The University of Kansas, Lawrence, Kansas.
  • Qian Cong
    Department of Biochemistry, Seattle, WA 98105, USA.
  • David Baker
    Department of Biochemistry, University of Washington, Seattle, Washington.
  • Jianyi Yang
    School of Mathematical Sciences, Nankai University, Tianjin, China.