Identification of DNA-binding proteins via Multi-view LSSVM with independence criterion.

Journal: Methods (San Diego, Calif.)
Published Date:

Abstract

DNA-binding proteins play a prominent role in life activities such as DNA replication, recombination, and gene expression and regulation. As the number of known DNA-binding proteins continues to grow, an efficient and accurate identification tool is needed. Traditional experimental techniques are time-consuming and expensive, and computational methods based on structural information suffer from an insufficient number of samples. We therefore propose a sequence-based machine learning algorithm for identifying DNA-binding proteins, named multi-view Least Squares Support Vector Machine via Hilbert-Schmidt Independence Criterion (multi-view LSSVM via HSIC). The method takes 6 feature sets as multi-view input and trains a submodel on each single view with the LSSVM algorithm. HSIC is integrated into LSSVM as a regularization term to reduce the dependence between views and to exploit the complementary information of multiple views. The submodels are then trained in coordination and combined with weights to obtain the final prediction model. On the training set PDB1075, our model outperformed most existing methods. In independent tests on the datasets PDB186 and PDB2272, the prediction accuracy was 85.5% and 79.36%, respectively. These results exceeded those of the current state-of-the-art methods, showing that the multi-view LSSVM via HSIC can be used as an efficient predictor.
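
The building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the function names (rbf_kernel, lssvm_fit, hsic, weighted_vote), the regularization value and the fixed view weights are all assumptions made for illustration. The sketch trains a standard LSSVM per view by solving its linear system, computes the empirical HSIC between two view kernels (the quantity the paper uses as a regularization term), and combines per-view scores with weights. The joint HSIC-regularized training described in the abstract is not reproduced here.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel matrix between rows of A and rows of B (kernel choice is an assumption)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def lssvm_fit(K, y, reg=10.0):
    """Standard LSSVM: solve [[0, 1^T], [1, K + I/reg]] [b; alpha] = [0; y]."""
    n = K.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / reg
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]  # alpha, b


def hsic(K, L):
    """Empirical HSIC between two kernel matrices: tr(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


def weighted_vote(scores, weights):
    """Combine per-view decision scores with normalized weights; return +1/-1 labels."""
    weights = np.asarray(weights, dtype=float)
    combined = np.tensordot(weights / weights.sum(), np.asarray(scores), axes=1)
    return np.where(combined >= 0, 1, -1)


if __name__ == "__main__":
    # Synthetic stand-in for two feature views of the same 40 proteins (hypothetical data).
    rng = np.random.default_rng(0)
    y = np.repeat([1.0, -1.0], 20)
    views = [y[:, None] * 2.0 + rng.normal(size=(40, d)) for d in (3, 5)]

    kernels = [rbf_kernel(X, X) for X in views]
    models = [lssvm_fit(K, y) for K in kernels]
    scores = [K @ alpha + b for K, (alpha, b) in zip(kernels, models)]

    print("HSIC between views:", hsic(kernels[0], kernels[1]))
    print("Training accuracy:", np.mean(weighted_vote(scores, [0.5, 0.5]) == y))
```

A lower HSIC value between two view kernels indicates less statistical dependence between the views, which is why it can serve as a penalty when the goal is for the submodels to contribute complementary information.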

Authors

  • Shulin Zhao
    State Key Laboratory for the Chemistry and Molecular Engineering of Medicinal Resources, Key Laboratory of Ecology of Rare and Endangered Species and Environmental Protection of Ministry Education, Guangxi Normal University, Guilin 541004, China. Electronic address: zhaoshulin001@163.com.
  • Yu Zhang
    College of Marine Electrical Engineering, Dalian Maritime University, Dalian, China.
  • Yijie Ding
    School of Computer Science and Technology, Tianjin University, Tianjin 300350, China. Electronic address: wuxi_dyj@tju.edu.cn.
  • Quan Zou
  • Lijia Tang
    Southwest Medical University, Luzhou, China.
  • Qing Liu
    School of Chemistry and Chemical Engineering, Shandong University of Technology, 255049, Zibo, PR China.
  • Ying Zhang
    Department of Nephrology, Nanchong Central Hospital Affiliated to North Sichuan Medical College, Nanchong, China.