Using support vector machines to identify protein phosphorylation sites in viruses.

Journal: Journal of molecular graphics & modelling
PMID:

Abstract

Phosphorylation of viral proteins plays an important role in enhancing viral replication and inhibiting normal host-cell functions. Given this biological importance, identifying viral protein phosphorylation sites is an important task. However, experimental methods for identifying phosphorylation sites are resource intensive. Hence, there is significant interest in developing computational methods that reliably predict viral phosphorylation sites from amino acid sequences. In this study, a new method based on a support vector machine is proposed to identify protein phosphorylation sites in viruses. We apply an encoding scheme based on attribute grouping and position weight amino acid composition to extract physicochemical properties and sequence information of viral proteins around phosphorylation sites. In 10-fold cross-validation with a window size of 23, the prediction accuracies for phosphoserine, phosphothreonine and phosphotyrosine are 88.8%, 95.2% and 97.1%, respectively. Furthermore, compared with the existing methods Musite and MDD-clustered HMMs, our method achieves high sensitivity and accuracy, demonstrating its effectiveness for predicting phosphorylation sites in viral proteins.
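
The windowing and position-weighted encoding mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `X` padding character, and the specific weighting formula (a commonly cited form of position weight amino acid composition) are assumptions made for illustration, and the attribute-grouping features and the support vector machine itself are omitted.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def extract_windows(sequence, targets="STY", window_size=23, pad="X"):
    """Return (1-based position, residue, window) for each candidate site.

    Hypothetical helper: sequence ends are padded with `pad` so that every
    window has the same length, with the candidate residue at the center.
    """
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd so the site is centered")
    half = window_size // 2
    padded = pad * half + sequence + pad * half
    windows = []
    for i, residue in enumerate(sequence):
        if residue in targets:
            # padded[i:i + window_size] is centered on sequence[i]
            windows.append((i + 1, residue, padded[i:i + window_size]))
    return windows


def position_weighted_composition(window, alphabet=AMINO_ACIDS):
    """Encode a window as one position-weighted value per amino acid.

    Assumed formula (not confirmed by the abstract):
        C_a = 1 / (L * (L + 1)) * sum_j x_{a,j} * (j + |j| / L)
    where L is the number of flanking residues on each side, j runs from
    -L to L, and x_{a,j} is 1 if residue a occupies offset j, else 0.
    """
    flank = len(window) // 2
    features = []
    for aa in alphabet:
        total = 0.0
        for j in range(-flank, flank + 1):
            if window[j + flank] == aa:
                total += j + abs(j) / flank
        features.append(total / (flank * (flank + 1)))
    return features
```

With the default window size of 23, each serine, threonine, or tyrosine yields a 23-residue window and a 20-dimensional feature vector; such vectors could then be passed to any SVM implementation and evaluated by 10-fold cross-validation.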

Authors

  • Shu-Yun Huang
    Department of Chemical Engineering, Pingxiang College, Pingxiang 337055, China.
  • Shao-Ping Shi
    Department of Chemistry, Nanchang University, Nanchang 330031, China; Department of Mathematics, Nanchang University, Nanchang 330031, China.
  • Jian-Ding Qiu
    Department of Chemical Engineering, Pingxiang College, Pingxiang 337055, China; Department of Chemistry, Nanchang University, Nanchang 330031, China. Electronic address: jdqiu@ncu.edu.cn.
  • Ming-Chu Liu
    Department of Chemical Engineering, Pingxiang College, Pingxiang 337055, China.