A sequence-based multiple kernel model for identifying DNA-binding proteins.

Journal: BMC bioinformatics
Published Date:

Abstract

BACKGROUND: DNA-binding proteins (DBPs) play a pivotal role in biological systems. A growing number of researchers are studying their mechanisms and detection methods. Traditional experimental methods for detecting DBPs are time-consuming and resource-intensive. In recent years, machine learning methods have been used to detect DBPs. However, it remains difficult to adequately describe protein information when predicting DNA-binding proteins. In this study, we extract six features from the protein sequence and use Multiple Kernel Learning based on Centered Kernel Alignment to integrate these features. The integrated feature is fed into a Support Vector Machine to build a predictive model and detect new DBPs.
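
The kernel-integration step can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes RBF kernels, uses two synthetic feature views in place of the six sequence features, and sets each kernel's weight with a simple heuristic (proportional to its centered alignment with the ideal label kernel yy^T) instead of whatever optimization the paper uses. All function names, the gamma value, and the data are hypothetical.

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix in feature space: K_c = H K H, H = I - 11^T/n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment(K1, K2):
    """Centered kernel alignment: Frobenius cosine of the centered kernels."""
    K1c, K2c = center_kernel(K1), center_kernel(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def rbf_kernel(X, gamma):
    """RBF kernel matrix exp(-gamma * ||x_i - x_j||^2) for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def cka_weights(kernels, y):
    """Heuristic weights (an assumption): alignment with yy^T, clipped at 0."""
    target = np.outer(y, y)
    scores = np.array([max(alignment(K, target), 0.0) for K in kernels])
    total = scores.sum()
    if total == 0.0:
        # No kernel aligns with the labels: fall back to uniform weights.
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / total

def combine(kernels, weights):
    """Weighted sum of kernel matrices (still a valid kernel for weights >= 0)."""
    return sum(w * K for w, K in zip(weights, kernels))

# Synthetic stand-ins for two feature views of 40 labelled sequences.
rng = np.random.default_rng(0)
y = np.array([1] * 20 + [-1] * 20)
X_informative = rng.normal(size=(40, 5)) + y[:, None]  # class-dependent shift
X_noise = rng.normal(size=(40, 5))                      # unrelated to labels

kernels = [rbf_kernel(X_informative, 0.1), rbf_kernel(X_noise, 0.1)]
weights = cka_weights(kernels, y)
K_combined = combine(kernels, weights)
```

The combined matrix can then be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's SVC with kernel="precomputed". Kernels for test sequences would have to be built between test and training samples and combined with the same weights.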

Authors

  • Yuqing Qian
    School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China.
  • Limin Jiang
    School of Computer Science and Technology, Tianjin University, Tianjin 300350, China; School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, China.
  • Yijie Ding
    School of Computer Science and Technology, Tianjin University, Tianjin 300350, China. wuxi_dyj@tju.edu.cn.
  • Jijun Tang
    School of Computer Science and Engineering, Tianjin University, Tianjin, 300072, China. jtang@cse.sc.edu.
  • Fei Guo
    School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China. Electronic address: gfjy001@yahoo.com.