iRSpot-PDI: Identification of recombination spots by incorporating dinucleotide property diversity information into Chou's pseudo components.

Journal: Genomics
Published Date:

Abstract

Identifying recombination spots plays an important role in revealing genome evolution and in advancing the study of DNA function. Although several computational methods have been proposed, extracting the discriminatory information embedded in DNA properties has not received enough attention. These properties include dinucleotide flexibility, structure, and thermodynamic parameters, which are significant for genome evolution research. To explore the potential effect of DNA properties, a novel feature extraction method, called iRSpot-PDI, is proposed. A wrapper feature selection method with best-first search is used to identify the best feature set. To verify the effectiveness of the proposed method, a support vector machine is applied to the resulting features. Prediction results are reported on two benchmark datasets. Compared with recently reported methods, iRSpot-PDI achieves the highest individual specificity, Matthews correlation coefficient, and overall accuracy. The experimental results confirm that iRSpot-PDI is effective for accurate identification of recombination spots. The datasets can be downloaded from http://stxy.neuq.edu.cn/info/1095/1157.htm.
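
The evaluation measures named in the abstract (specificity, Matthews correlation coefficient, overall accuracy) can be computed from the counts of a binary confusion matrix. The sketch below assumes the standard textbook definitions; the abstract does not state the exact formulas or evaluation protocol used in the paper. The function name `binary_metrics` and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Return sensitivity, specificity, accuracy and MCC from confusion counts.

    tp/fn: recombination hotspots predicted correctly/incorrectly.
    tn/fp: coldspots predicted correctly/incorrectly.
    Standard definitions are assumed; ratios with a zero denominator return 0.0.
    """
    total = tp + tn + fp + fn
    sn = tp / (tp + fn) if (tp + fn) else 0.0   # sensitivity
    sp = tn / (tn + fp) if (tn + fp) else 0.0   # specificity
    acc = (tp + tn) / total if total else 0.0   # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Illustrative counts only (assumed, not taken from the paper).
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC is useful alongside accuracy because it uses all four confusion counts and stays informative when the hotspot and coldspot classes differ in size.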

Authors

  • Lichao Zhang
    School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao 066004, PR China. Electronic address: zhanglichaoouc@neuq.edu.cn.
  • Liang Kong
    School of Mathematics and Information Science & Technology, Hebei Normal University of Science & Technology, Qinhuangdao 066004, PR China. Electronic address: kongliangouc@hevttc.edu.cn.