EvoStruct-Sub: An accurate Gram-positive protein subcellular localization predictor using evolutionary and structural features.

Journal: Journal of Theoretical Biology
Published Date:

Abstract

Determining the subcellular localization of proteins is considered an important step towards understanding their functions. Previous studies have mainly relied on Gene Ontology (GO) as the main source of features for this problem. However, features extracted from GO have been shown to be hard to use for new proteins whose GO annotations are unknown. At the same time, evolutionary information extracted from the Position Specific Scoring Matrix (PSSM) has been shown to be another effective source of features. Despite considerable progress using these sources for feature extraction, the problem remains unsolved. In this study we propose EvoStruct-Sub, which combines predicted structural information with evolutionary information extracted directly from the protein sequence. To do this, we use several feature extraction methods that have shown promise in subcellular localization and similar studies to extract effective local and global discriminatory information. We then use a Support Vector Machine (SVM) as the classifier to build EvoStruct-Sub. As a result, we improve Gram-positive subcellular localization prediction accuracy by up to 5.6% over previous studies, including those that used GO for feature extraction.
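
The abstract does not name the individual feature extraction methods, so the following is only an illustrative sketch of how global and local information might be derived from a PSSM. It assumes two descriptors that are common in this literature, a column-wise composition (global) and a bigram over consecutive positions (local); the function names, the toy 3-column matrix (a real PSSM has 20 columns), and the choice of descriptors are assumptions for illustration, not the EvoStruct-Sub pipeline itself.

```python
from typing import List

Matrix = List[List[float]]  # rows = sequence positions, columns = residue types


def pssm_composition(pssm: Matrix) -> List[float]:
    """Global descriptor: mean of each PSSM column over all positions."""
    length = len(pssm)
    width = len(pssm[0])
    return [sum(row[k] for row in pssm) / length for k in range(width)]


def pssm_bigram(pssm: Matrix) -> List[float]:
    """Local descriptor: B[k][l] = sum_i P[i][k] * P[i+1][l], flattened row by row."""
    width = len(pssm[0])
    feats = []
    for k in range(width):
        for l in range(width):
            feats.append(
                sum(pssm[i][k] * pssm[i + 1][l] for i in range(len(pssm) - 1))
            )
    return feats


def feature_vector(pssm: Matrix) -> List[float]:
    """Concatenate the global and local descriptors into one vector."""
    return pssm_composition(pssm) + pssm_bigram(pssm)


# Hypothetical toy PSSM: 3 positions x 3 columns (assumption, for readability).
toy_pssm = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 1.0],
    [2.0, 1.0, 0.0],
]
features = feature_vector(toy_pssm)
```

A fixed-length vector of this kind, computed per protein, is the sort of input an SVM classifier could be trained on; descriptors derived from predicted structural information could be concatenated in the same way.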

Authors

  • Md Raihan Uddin
    Department of Computer Science and Engineering, United International University, Bangladesh.
  • Alok Sharma
    Center for Integrative Medical Sciences, RIKEN Yokohama, Yokohama, 230-0045, Japan.
  • Dewan Md Farid
    Department of Computer Science and Engineering, United International University, Bangladesh.
  • Md Mahmudur Rahman
    Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
  • Abdollah Dehzangi
    [1] Signal Processing Laboratory, School of Engineering, Griffith University, Brisbane, Australia. [2] Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, Australia.
  • Swakkhar Shatabda
    Department of Computer Science and Engineering, United International University, House 80, Road 8A, Dhanmondi, Dhaka-1209, Bangladesh. Electronic address: swakkhar@cse.uiu.ac.bd.