SVM-Prot 2016: A Web-Server for Machine Learning Prediction of Protein Functional Families from Sequence Irrespective of Similarity.

Journal: PloS one
Published Date:

Abstract

Knowledge of protein function is important for biological, medical and therapeutic studies, but the functions of many proteins remain unknown, and improved functional prediction methods are needed. Our SVM-Prot web-server employs a machine learning method to predict protein functional families from protein sequences irrespective of sequence similarity, complementing similarity-based and other methods in predicting diverse classes of proteins, including distantly-related proteins and homologous proteins of different functions. Since its publication in 2003, we have made major improvements to SVM-Prot: (1) expanded coverage from 54 to 192 functional families, (2) more diverse protein descriptors for protein representation, (3) improved predictive performance owing to more enriched training datasets and a greater variety of protein descriptors, (4) a newly integrated BLAST analysis option for assessing which proteins in the SVM-Prot predicted functional families are similar in sequence to a query protein, and (5) a newly added batch submission option for classifying multiple proteins. Moreover, two additional machine learning approaches, K nearest neighbor and probabilistic neural networks, were added to facilitate collective assessment of protein functions by multiple methods. SVM-Prot can be accessed at http://bidd2.nus.edu.sg/cgi-bin/svmprot/svmprot.cgi.
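
To illustrate the general idea of classifying proteins from sequence-derived descriptors rather than from sequence alignment, the following is a minimal sketch, not SVM-Prot's actual implementation. It assumes a single simple descriptor (amino acid composition) and a plain K nearest neighbor vote; the abstract does not specify which descriptors or model settings the server uses. The function names, the toy sequences and the family labels "basic" and "acidic" are all hypothetical.

```python
from collections import Counter
import math

# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional amino acid composition of a sequence.

    Non-standard residue codes (e.g. X) are ignored. This is an assumed,
    illustrative descriptor, not the descriptor set used by SVM-Prot.
    """
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def knn_predict(query, training, k=3):
    """Predict a family label by majority vote among the k nearest
    training sequences, using Euclidean distance between descriptors."""
    q = composition(query)
    ranked = sorted(
        (math.dist(q, composition(seq)), label) for seq, label in training
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: made-up sequences with made-up family labels.
toy_training = [
    ("KKKKRRKK", "basic"),
    ("KRKRKRKK", "basic"),
    ("DDEEDDEE", "acidic"),
    ("EEDDEDED", "acidic"),
]

predicted = knn_predict("KKRRKKRR", toy_training, k=3)
```

Because the descriptor vector has a fixed length regardless of sequence length, two proteins can be compared without being aligned, which is what allows this style of method to operate irrespective of sequence similarity.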

Authors

  • Ying Hong Li
    Innovative Drug Research and Bioinformatics Group, Innovative Drug Research Centre and School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China.
  • Jing Yu Xu
    Innovative Drug Research and Bioinformatics Group, Innovative Drug Research Centre and School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China.
  • Lin Tao
    Innovative Drug Research and Bioinformatics Group, Innovative Drug Research Centre and School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China.
  • Xiao Feng Li
    Innovative Drug Research and Bioinformatics Group, Innovative Drug Research Centre and School of Pharmaceutical Sciences, Chongqing University, Chongqing, 401331, China.
  • Shuang Li
    Clinical and Research Center for Infectious Diseases, Beijing Youan Hospital, Capital Medical University, Beijing, China.
  • Xian Zeng
    The College of Biomedical Engineering and Instrument Science, Zhejiang University, 310027 Hangzhou, Zhejiang, China.
  • Shang Ying Chen
    Bioinformatics and Drug Discovery group, Department of Pharmacy, National University of Singapore, Singapore, 117543, Singapore.
  • Peng Zhang
    Key Laboratory of Macromolecular Science of Shaanxi Province, School of Chemistry & Chemical Engineering, Shaanxi Normal University, Xi'an, Shaanxi 710062, China.
  • Chu Qin
    Bioinformatics and Drug Discovery group, Department of Pharmacy, National University of Singapore, Singapore, 117543, Singapore.
  • Cheng Zhang
    College of Forestry, Jiangxi Agricultural University, Nanchang, Jiangxi Province, China.
  • Zhe Chen
    Evidence-based Medicine Center, Tianjin University of Traditional Chinese Medicine, Tianjin, China.
  • Feng Zhu
    Department of Critical Care Medicine, Shanghai East Hospital, Tongji University School of Medicine, Shanghai, 200120, People's Republic of China.
  • Yu Zong Chen
    Bioinformatics and Drug Discovery group, Department of Pharmacy, National University of Singapore, Singapore, 117543, Singapore.