Efficient utilization on PSSM combining with recurrent neural network for membrane protein types prediction.

Journal: Computational biology and chemistry
Published Date:

Abstract

The Position-Specific Scoring Matrix (PSSM) is a well-established feature representation that was adopted early in protein classification. Because the shape of a PSSM depends on sequence length, researchers have made many attempts to post-process it so that it can be fed to traditional machine learning algorithms. These processing steps discard part of the information the PSSM provides, which limits the resulting feature representation. Moreover, the high dimensionality of the PSSM makes it difficult to combine with other feature extraction methods. We instead use the PSSM as the input to a Recurrent Neural Network (RNN) without any post-processing, treating each amino acid in the protein sequence as one time step. Because the PSSM is input to the model directly, its internal information is fully used. The result is an end-to-end solution that achieves state-of-the-art performance. Finally, we explore how to combine the PSSM with traditional feature extraction methods, which yields a slight further improvement. Our network architecture is implemented in Python and is available at https://github.com/YellowcardD/RNN-for-membrane-protein-types-prediction.
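
The idea of treating each residue's PSSM row as one RNN time step can be illustrated with a minimal sketch. This is not the authors' architecture, which the abstract does not specify; it assumes a plain (Elman) RNN with untrained random weights, and the names `rnn_classify` and `init_params`, the hidden size, and the class count of 8 are hypothetical placeholders. It only shows that PSSMs of different lengths can be consumed as-is, with no reduction to a fixed-length vector.

```python
import numpy as np


def init_params(n_features=20, n_hidden=16, n_classes=8, seed=0):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    return (
        rng.normal(0, 0.1, (n_hidden, n_features)),  # input -> hidden
        rng.normal(0, 0.1, (n_hidden, n_hidden)),    # hidden -> hidden
        np.zeros(n_hidden),                          # hidden bias
        rng.normal(0, 0.1, (n_classes, n_hidden)),   # hidden -> output
        np.zeros(n_classes),                         # output bias
    )


def rnn_classify(pssm, params):
    """Run a vanilla RNN over a PSSM of shape (L, 20), one residue per step."""
    W_xh, W_hh, b_h, W_hy, b_y = params
    h = np.zeros(W_hh.shape[0])
    for row in pssm:  # row holds the 20 substitution scores of one residue
        h = np.tanh(W_xh @ row + W_hh @ h + b_h)
    logits = W_hy @ h + b_y          # classify from the final hidden state
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()           # softmax over the protein types


# Synthetic stand-ins for real PSSMs; two proteins of different lengths.
rng = np.random.default_rng(1)
params = init_params()
short_pssm = rng.normal(size=(50, 20))
long_pssm = rng.normal(size=(300, 20))

probs_short = rnn_classify(short_pssm, params)
probs_long = rnn_classify(long_pssm, params)
print(probs_short.shape, probs_long.shape)
```

Both calls return a probability vector of the same size even though the inputs differ in length, which is the property that removes the need for post-processing. A trained model would typically use a gated cell (LSTM or GRU) and a deep learning framework rather than this hand-written loop.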

Authors

  • Shunfang Wang
  • Mingyuan Li
  • Lei Guo
    Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
  • Zicheng Cao
  • Yu Fei
    School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming 650221, PR China. Electronic address: feiyukm@aliyun.com.