Accurate contact predictions using covariation techniques and machine learning.
Journal:
Proteins
Published Date:
Sep 1, 2016
Abstract
Here we present the results of residue-residue contact predictions achieved in CASP11 by the CONSIP2 server, which is based around our MetaPSICOV contact prediction method. On a set of 40 target domains with a median family size of around 40 effective sequences, our server achieved an average top-L/5 long-range contact precision of 27%. The MetaPSICOV method is based on a combination of classical contact prediction features, enhanced with three distinct covariation methods and embedded in a two-stage neural network predictor. Two unique features of our approach are (1) the tuning between the classical and covariation features depending on the depth of the input alignment and (2) a hybrid approach that generates the deepest possible multiple-sequence alignments by combining jackHMMer and HHblits. We discuss the CONSIP2 pipeline and our results, and show that where the method underperformed, the major factors were reliance on a fixed set of parameters for the initial sequence alignments and the absence of domain splitting as a preprocessing step. Proteins 2016; 84(Suppl 1):145-151. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
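For readers unfamiliar with the metric, the top-L/5 long-range contact precision quoted in the abstract can be sketched as follows. This is an illustrative sketch, not the CASP11 assessment code or part of CONSIP2: the function name, the input layout, and the default sequence-separation threshold of 24 residues (the cutoff commonly used for "long-range" contacts) are assumptions made here. It ranks the predicted long-range pairs by score, keeps the top L/5 for a domain of length L, and reports the fraction that are true contacts.

```python
from typing import Iterable, Set, Tuple


def top_l5_long_range_precision(
    predictions: Iterable[Tuple[int, int, float]],
    true_contacts: Set[Tuple[int, int]],
    length: int,
    min_separation: int = 24,  # assumed "long-range" cutoff, not taken from the abstract
) -> float:
    """Precision of the top L/5 highest-scoring long-range predicted contacts.

    predictions   -- (i, j, score) triples with residue indices i and j
    true_contacts -- observed contacts stored as (i, j) pairs with i < j
    length        -- domain length L
    """
    # Keep only pairs far enough apart in sequence, normalised so that i < j.
    long_range = [
        (min(i, j), max(i, j), score)
        for i, j, score in predictions
        if abs(i - j) >= min_separation
    ]
    # Rank by predicted score, highest first.
    long_range.sort(key=lambda pair: pair[2], reverse=True)

    # Evaluate the top L/5 pairs (at least one).
    n_top = max(1, length // 5)
    top = long_range[:n_top]
    if not top:
        return 0.0

    # Dividing by the number of pairs actually evaluated is a choice made here;
    # some evaluations divide by L/5 even when fewer pairs were predicted.
    hits = sum(1 for i, j, _ in top if (i, j) in true_contacts)
    return hits / len(top)
```

Averaging this value over all target domains would give a figure comparable in kind to the 27% reported above, provided the same contact definition and separation threshold are used.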
Authors
Keywords
Amino Acid Sequence
Bacteria
Computational Biology
Computer Simulation
Databases, Protein
Humans
Internet
Machine Learning
Models, Molecular
Models, Statistical
Neural Networks, Computer
Protein Folding
Protein Interaction Domains and Motifs
Protein Structure, Secondary
Proteins
Sequence Alignment
Software
Viruses