Sequence-Based Prediction of Protein-Carbohydrate Binding Sites Using Support Vector Machines.

Journal: Journal of Chemical Information and Modeling
Published Date:

Abstract

Carbohydrate-binding proteins play significant roles in many diseases, including cancer. Here, we established a machine-learning-based method, SPRINT-CBH (sequence-based prediction of residue-level interaction sites of carbohydrates), to predict carbohydrate-binding sites in proteins using support vector machines (SVMs). We found that integrating evolution-derived sequence profiles with additional information on sequence and predicted solvent accessible surface area leads to a reasonably accurate, robust, and predictive method. Without balancing binding and nonbinding residues, it achieves an area under the receiver operating characteristic curve (AUC) of 0.78 and 0.77 and a Matthews correlation coefficient of 0.34 and 0.29 for 10-fold cross-validation and the independent test, respectively. The quality of the method is further demonstrated by statistically significantly more binding residues being predicted for carbohydrate-binding proteins than for presumptive nonbinding proteins in the human proteome, and by the bias of rare alleles toward predicted carbohydrate-binding sites for nonsynonymous mutations from the 1000 Genomes Project. SPRINT-CBH is available as an online server at http://sparks-lab.org/server/SPRINT-CBH.
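
The two metrics reported in the abstract, AUC and the Matthews correlation coefficient (MCC), can be illustrated with a minimal sketch on imbalanced residue labels. Everything below is an assumption made for illustration: the function names, the toy labels and scores, and the 0.5 decision threshold are hypothetical and are not taken from SPRINT-CBH or its data sets.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC from binary labels (1 = binding residue, 0 = nonbinding residue)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 binding and 8 nonbinding residues (unbalanced).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1, 0.35, 0.05, 0.15, 0.25]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

print(matthews_corrcoef(y_true, y_pred))  # 0.375
print(roc_auc(y_true, scores))            # 0.9375
```

MCC uses all four cells of the confusion matrix, so it stays informative when nonbinding residues greatly outnumber binding ones. AUC is independent of the chosen threshold, which is why the two are often reported together.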

Authors

  • Ghazaleh Taherzadeh
    School of Information and Communication Technology, Griffith University, Parklands Drive, Southport, Queensland, 4215, Australia.
  • Yaoqi Zhou
    Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518106, China. Electronic address: zhouyq@szbl.ac.cn.
  • Alan Wee-Chung Liew
    School of Information and Communication Technology, Griffith University, Parklands Drive, Southport, Queensland, 4215, Australia.
  • Yuedong Yang
    Institute for Glycomics and School of Information and Communication Technique, Griffith University, Parklands Dr. Southport, QLD 4222, Australia.