DeepSol: a deep learning framework for sequence-based protein solubility prediction.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Protein solubility plays a vital role in pharmaceutical research and production yield. For a given protein, the extent of its solubility can represent the quality of its function, and is ultimately defined by its sequence. Thus, it is imperative to develop novel, highly accurate in silico sequence-based protein solubility predictors. In this work, we propose DeepSol, a novel deep learning-based protein solubility predictor. The backbone of our framework is a convolutional neural network that exploits k-mer structure and additional sequence and structural features extracted from the protein sequence.
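
To illustrate what "k-mer structure" means for a protein sequence, the following is a minimal sketch, not the DeepSol implementation. It lists the overlapping length-k windows of a sequence, which are the local patterns a one-dimensional convolution of width k would slide over. The function names `kmers` and `kmer_counts` and the toy sequence are assumptions made for illustration only.

```python
from collections import Counter


def kmers(sequence, k):
    """Return all overlapping k-mers of a sequence, in left-to-right order."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence of length n has n - k + 1 windows; none if n < k.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_counts(sequence, k):
    """Count how often each k-mer occurs in the sequence."""
    return Counter(kmers(sequence, k))


if __name__ == "__main__":
    toy_sequence = "MKTAYIAK"  # hypothetical amino-acid string, not real data
    print(kmers(toy_sequence, 3))
    print(kmer_counts(toy_sequence, 2))
```

Counting k-mers discards position, whereas a convolutional layer keeps it; the sketch is only meant to make the notion of a k-mer concrete.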

Authors

  • Sameer Khurana
    Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA.
  • Reda Rawi
    Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA.
  • Khalid Kunji
    Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar.
  • Gwo-Yu Chuang
    Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA.
  • Halima Bensmail
    Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar.
  • Raghvendra Mall
    Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar.