CNN-Pred: Prediction of single-stranded and double-stranded DNA-binding protein using convolutional neural networks.

Journal: Gene
PMID:

Abstract

DNA-binding proteins play a vital role in biological activities including DNA replication, DNA packing, and DNA repair. DNA-binding proteins can be classified into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). Determining whether a protein is a DSB or an SSB helps determine its function. Many studies have therefore been conducted in recent years to identify DSBs and SSBs accurately. Despite these efforts, DSB and SSB prediction performance remains limited. In this study, we propose a new method called CNN-Pred to accurately predict DSBs and SSBs. To build CNN-Pred, we first extract evolutionary-based features in the form of mono-gram and bi-gram profiles from the position-specific scoring matrix (PSSM). We then apply a 1D convolutional neural network (CNN) as the classifier to the extracted features. Our results demonstrate that CNN-Pred improves DSB and SSB prediction accuracy by more than 4% on the independent test set compared to previous studies in the literature. CNN-Pred is publicly available as a standalone tool, together with all of its source code, at: https://github.com/MLBC-lab/CNN-Pred.
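
The mono-gram and bi-gram profiles mentioned above can be illustrated with a minimal sketch. It assumes the definitions commonly used for PSSM-based features: the mono-gram is the per-column mean of an L x 20 PSSM, and the bi-gram entry (m, n) sums the products of column m at position i and column n at position i + 1. The abstract does not state the exact normalization or preprocessing used in CNN-Pred, so the function names and details below are hypothetical and are not taken from the published source code.

```python
def monogram(pssm):
    """Mono-gram profile: mean of each PSSM column (assumed definition)."""
    length = len(pssm)
    width = len(pssm[0])
    return [sum(row[j] for row in pssm) / length for j in range(width)]


def bigram(pssm):
    """Bi-gram profile (assumed definition, no length normalization).

    Entry (m, n) is the sum over consecutive positions i of
    pssm[i][m] * pssm[i + 1][n], flattened row by row.
    """
    length = len(pssm)
    width = len(pssm[0])
    profile = []
    for m in range(width):
        for n in range(width):
            total = sum(pssm[i][m] * pssm[i + 1][n] for i in range(length - 1))
            profile.append(total)
    return profile


def pssm_features(pssm):
    """Concatenate mono-gram and bi-gram profiles into one feature vector.

    For a standard L x 20 PSSM this gives 20 + 400 = 420 values.
    """
    if len(pssm) < 2:
        raise ValueError("PSSM needs at least two rows to form bi-grams")
    width = len(pssm[0])
    if width == 0 or any(len(row) != width for row in pssm):
        raise ValueError("all PSSM rows must have the same non-zero length")
    return monogram(pssm) + bigram(pssm)


# Toy 3 x 20 matrix standing in for a real PSSM (illustrative values only).
toy_pssm = [[0.0] * 20 for _ in range(3)]
toy_pssm[0][0] = 1.0
toy_pssm[1][1] = 2.0
toy_pssm[2][0] = 3.0
toy_features = pssm_features(toy_pssm)
```

Under these assumptions, a feature vector of this kind would be the input to the 1D CNN classifier. The network architecture itself is not described in the abstract, so it is not sketched here.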

Authors

  • Farnoush Manavi
    Computer Science and Engineering and Information Technology Department, Shiraz University, Shiraz, Iran.
  • Alok Sharma
    Center for Integrative Medical Sciences, RIKEN Yokohama, Yokohama, 230-0045, Japan.
  • Ronesh Sharma
  • Tatsuhiko Tsunoda
    Center for Integrative Medical Sciences, RIKEN Yokohama, Yokohama, 230-0045, Japan. tatsuhiko.tsunoda@riken.jp.
  • Swakkhar Shatabda
    Department of Computer Science and Engineering, United International University, House 80, Road 8A, Dhanmondi, Dhaka-1209, Bangladesh. Electronic address: swakkhar@cse.uiu.ac.bd.
  • Iman Dehzangi
    Department of Computer Science, Rutgers University, Camden, NJ, United States.