DeepMSA: constructing deep multiple sequence alignment to improve contact prediction and fold-recognition for distant-homology proteins.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: The success of genome sequencing techniques has resulted in a rapid explosion of protein sequences. Collections of multiple homologous sequences can provide critical information for modeling the structure and function of unknown proteins. There is, however, no standard and efficient pipeline available for sensitive multiple sequence alignment (MSA) collection. Collecting such alignments is particularly challenging when large whole-genome and metagenome databases are involved.

Authors

  • Chengxin Zhang
    Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan.
  • Wei Zheng
    School of Computer Engineering, Jinling Institute of Technology, Nanjing, 211169, China. zhengwei@jit.edu.cn.
  • S M Mortuza
    Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
  • Yang Li
    Occupation of Chinese Center for Disease Control and Prevention, Beijing, China.
  • Yang Zhang
    Innovative Institute of Chinese Medicine and Pharmacy, Academy for Interdiscipline, Chengdu University of Traditional Chinese Medicine, Chengdu, China.