Convolutional neural networks with image representation of amino acid sequences for protein function prediction.

Journal: Computational Biology and Chemistry
Published Date:

Abstract

Proteins are among the most important molecules governing cellular processes in most living organisms, and understanding their functions is fundamental to understanding the basics of life. Several supervised learning approaches have been applied to predict protein function. In this paper, we propose ProtConv, a convolutional neural network based approach that predicts protein function by converting amino acid sequences into two-dimensional images. We use a protein embedding technique based on transfer learning to generate a feature vector for each sequence. The feature vector is then converted into a square, single-channel image and fed into a convolutional network. The network architecture combines convolutional filters and average pooling layers, followed by dense fully connected layers, to predict a binary function label. We performed experiments on standard benchmark datasets from two important protein function prediction tasks: proinflammatory cytokines and anticancer peptides. Our experiments show that the proposed method, ProtConv, achieves state-of-the-art performance on both datasets. Implementation details, source code, and datasets are available at: https://github.com/swakkhar/ProtConv.
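The vector-to-image step and the average pooling operation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). The function names, the row-major layout, the zero padding used when the vector length is not a perfect square, and the 2x2 pooling window are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def vector_to_square_image(features, pad_value=0.0):
    """Reshape a 1-D feature vector into a square, single-channel grid.

    Assumption: values are laid out row by row, and the tail is padded
    with pad_value when len(features) is not a perfect square.
    """
    n = len(features)
    side = math.isqrt(n)
    if side * side < n:
        side += 1  # smallest square that can hold all n values
    padded = list(features) + [pad_value] * (side * side - n)
    return [padded[r * side:(r + 1) * side] for r in range(side)]


def average_pool_2x2(image):
    """Non-overlapping 2x2 average pooling (hypothetical window size).

    A trailing odd row or column is dropped.
    """
    rows = len(image) // 2
    cols = len(image[0]) // 2 if image else 0
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy stand-in for a protein embedding; a real embedding would be far longer.
embedding = [float(i) for i in range(10)]
image = vector_to_square_image(embedding)   # 4 x 4 grid, last 6 cells padded
pooled = average_pool_2x2(image)            # 2 x 2 grid
```

In a full model, the pooled feature maps would come from learned convolutional filters rather than the raw image, and would be flattened and passed to dense layers ending in a single binary output.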

Authors

  • Samia Tasnim Sara
    Department of Computer Science and Engineering, United International University, Plot-2, United City, Madani Avenue, Badda, Dhaka 1212, Bangladesh.
  • Md Mehedi Hasan
    Nutrition and Clinical Services Division, International Center for Diarrheal Disease and Research, Bangladesh (icddr,b), Dhaka, Bangladesh.
  • Ahsan Ahmad
    Department of Computer Science and Engineering, United International University, Plot 2, United City, Madani Avenue, Satarkul, Badda, Dhaka, 1212, Bangladesh. Electronic address: ahsan1037@gmail.com.
  • Swakkhar Shatabda
    Department of Computer Science and Engineering, United International University, House 80, Road 8A, Dhanmondi, Dhaka-1209, Bangladesh. Electronic address: swakkhar@cse.uiu.ac.bd.