Laplacian Regularized Sparse Representation Based Classifier for Identifying DNA N4-Methylcytosine Sites via L-Matrix Norm.

Journal: IEEE/ACM transactions on computational biology and bioinformatics
Published Date:

Abstract

N4-methylcytosine (4mC) is one of the important epigenetic modifications in DNA sequences. Detecting 4mC sites is time-consuming, and computational methods based on machine learning have provided effective help for identifying them. To further improve prediction performance, we propose a Laplacian Regularized Sparse Representation based Classifier with L-matrix norm (LapRSRC). We also use the kernel trick to derive a kernel LapRSRC for nonlinear modeling. Matrix factorization is employed to solve for the sparse representation coefficients of all test samples with respect to the training set, and an efficient iterative algorithm is proposed to solve the objective function. We evaluate our model on six benchmark datasets of 4mC and eight UCI datasets. The results show that the performance of our method is better than or comparable to that of the methods compared.
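
For readers unfamiliar with sparse representation based classification, the general idea can be sketched as follows. This is a minimal illustration of a plain sparse representation classifier, not the LapRSRC method itself: it omits the Laplacian regularizer, the L-matrix norm, the kernel extension, and the matrix factorization step, none of which are specified in this abstract. The solver (ISTA on an L1-penalized least-squares problem), the function names, the parameter values, and the toy data are all assumptions made for illustration.

```python
# Minimal sketch of a plain sparse representation based classifier (SRC).
# Assumption: ISTA solves an L1-penalized least-squares problem; this is NOT
# the LapRSRC objective or solver described in the abstract.

def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of the L1 norm)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def reconstruct(train, coef):
    """Return sum_j coef[j] * train[j], with training samples stored as rows."""
    dim = len(train[0])
    return [sum(c * s[k] for c, s in zip(coef, train)) for k in range(dim)]

def sparse_code(train, x, lam=0.01, step=0.1, iters=500):
    """ISTA for min_a 0.5 * ||x - sum_j a_j * train_j||^2 + lam * ||a||_1."""
    coef = [0.0] * len(train)
    for _ in range(iters):
        recon = reconstruct(train, coef)
        resid = [r - xi for r, xi in zip(recon, x)]
        # Gradient of the squared-error term with respect to each coefficient.
        grad = [sum(sk * rk for sk, rk in zip(s, resid)) for s in train]
        coef = [soft_threshold(c - step * g, step * lam)
                for c, g in zip(coef, grad)]
    return coef

def src_predict(train, labels, x, lam=0.01):
    """Assign x to the class whose training samples reconstruct it best."""
    coef = sparse_code(train, x, lam=lam)
    best_label, best_err = None, float("inf")
    for label in sorted(set(labels)):
        # Keep only the coefficients that belong to the current class.
        masked = [c if l == label else 0.0 for c, l in zip(coef, labels)]
        recon = reconstruct(train, masked)
        err = sum((xi - ri) ** 2 for xi, ri in zip(x, recon))
        if err < best_err:
            best_label, best_err = label, err
    return best_label

# Toy data (hypothetical): two classes of unit-norm 3-dimensional samples.
train = [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0], [0.0, 0.0, 1.0], [0.0, 0.6, 0.8]]
labels = [0, 0, 1, 1]
print(src_predict(train, labels, [0.9, 0.3, 0.0]))
print(src_predict(train, labels, [0.0, 0.2, 0.9]))
```

In this sketch each test sample is coded independently and labeled by the smallest class-wise reconstruction error. The step size of 0.1 is chosen to be small enough for the toy data; a different training set may need a smaller value.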

Authors

  • Yijie Ding
    School of Computer Science and Technology, Tianjin University, Tianjin 300350, China. wuxi_dyj@tju.edu.cn.
  • Wenying He
    School of Computer Science and Technology, Tianjin University, Tianjin, China.
  • Jijun Tang
    School of Computer Science and Engineering, Tianjin University, Tianjin, 300072, China. jtang@cse.sc.edu.
  • Quan Zou
  • Fei Guo
    School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China. Electronic address: gfjy001@yahoo.com.