ZoomQA: residue-level protein model accuracy estimation with machine learning on sequential and 3D structural features.

Journal: Briefings in bioinformatics
PMID:

Abstract

MOTIVATION: Estimation of Model Accuracy is a cornerstone problem in bioinformatics. As of CASP14, there are 79 global QA methods but only 39 residue-level QA methods, very few of which work on protein complexes. Here, we introduce ZoomQA, a novel single-model method for assessing the accuracy of a predicted tertiary protein structure or complex at the residue level, a task with many applications such as drug discovery. ZoomQA differs from other methods by considering how the chemical and physical features of a fragment structure (the portion of a protein within a radius $r$ of the target amino acid) change as the radius of contact increases. Fourteen physical and chemical properties of amino acids are used to build a comprehensive representation of every residue within a protein and to grade its placement within the protein as a whole. Moreover, we show the potential of ZoomQA to identify problematic regions of the SARS-CoV-2 protein complex.
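
The "zooming" idea of tracking how a fragment's features change as the radius grows can be illustrated with a minimal sketch. This is not ZoomQA's implementation: the abstract does not specify the radii, the fourteen properties, or how they are aggregated, so the function names (`zoom_profile`, `profile_deltas`), the use of a simple mean, the coordinates, and the property values below are all assumptions chosen for illustration.

```python
import math

def zoom_profile(coords, values, target, radii):
    """Mean of a per-residue property over the fragment around `target`.

    The fragment at radius r is every residue whose coordinate lies within
    r of the target residue (the target itself is always included).
    Averaging is an assumed aggregation, not taken from the paper.
    """
    center = coords[target]
    dists = [math.dist(center, c) for c in coords]
    profile = []
    for r in radii:
        members = [v for d, v in zip(dists, values) if d <= r]
        profile.append(sum(members) / len(members))
    return profile

def profile_deltas(profile):
    """Change in the aggregated property between consecutive radii."""
    return [b - a for a, b in zip(profile, profile[1:])]

# Toy input: four residues on a line, 3 angstroms apart (hypothetical data).
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
hydrophobicity = [1.0, 3.0, -2.0, 2.0]  # made-up values, one per residue
radii = [4.0, 7.0, 10.0]                # assumed radii, non-negative

profile = zoom_profile(coords, hydrophobicity, target=0, radii=radii)
deltas = profile_deltas(profile)
print(profile)
print(deltas)
```

In a setting like the one described, one such profile per property per residue could serve as input features for a learned model; which features and model are actually used is not stated in the abstract.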

Authors

  • Kyle Hippe
    Department of Computer Science, Pacific Lutheran University, Tacoma, WA 98447, USA.
  • Cade Lilley
    Department of Computer Science, Pacific Lutheran University, Tacoma, WA 98447, USA.
  • Joshua William Berkenpas
    Department of Computer Science, Pacific Lutheran University, Tacoma, WA 98447, USA.
  • Ciri Chandana Pocha
    Saint Louis University, USA.
  • Kiyomi Kishaba
    Department of Computer Science, Pacific Lutheran University, Tacoma, WA 98447, USA.
  • Hui Ding
    Medical School, Huanghe Science & Technology University, Zhengzhou 450063, PR China.
  • Jie Hou
    Department of Computer Science, University of Missouri, Columbia, MO 65211, USA.
  • Dong Si
  • Renzhi Cao
    Department of Computer Science, Pacific Lutheran University, Tacoma, WA 98447, USA.