Robust principal component analysis-based prediction of protein-protein interaction hot spots.

Journal: Proteins
Published Date:

Abstract

Proteins often exert their function by binding to other cellular partners. Hot spots are the key residues for protein-protein binding. Their identification may shed light on the impact of disease-associated mutations on protein complexes and help design protein-protein interaction inhibitors for therapy. Unfortunately, current machine learning methods for predicting hot spots suffer from limitations caused by gross errors in the data matrices. Here, we present a novel data pre-processing pipeline that overcomes this problem by using Robust Principal Component Analysis to recover a low-rank matrix with reduced noise. Application to existing databases shows the predictive power of the method.
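
The abstract does not say which Robust Principal Component Analysis solver is used, so the following is only a minimal sketch of one common formulation, principal component pursuit, which splits a data matrix M into a low-rank part L and a sparse part S that absorbs gross errors. The function names (`shrink`, `svd_threshold`, `rpca`), the default values of `lam` and `mu`, and the stopping rule are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def shrink(X, tau):
    """Soft-threshold every entry of X toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svd_threshold(X, tau):
    """Soft-threshold the singular values of X by tau."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt


def rpca(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split M into low-rank L and sparse S with M ~= L + S.

    Illustrative sketch: minimizes ||L||_* + lam * ||S||_1 subject to
    L + S = M by alternating updates with a Lagrange multiplier Y.
    The defaults for lam and mu are common heuristics (assumed here).
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = m * n / (4.0 * np.abs(M).sum())

    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    norm_M = np.linalg.norm(M)  # Frobenius norm

    for _ in range(max_iter):
        # Low-rank update: shrink singular values.
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        # Sparse update: shrink individual entries.
        S = shrink(M - L + Y / mu, lam / mu)
        # Multiplier update on the constraint residual.
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S
```

In a pipeline of the kind described, M would be a residues-by-features matrix; the recovered L, rather than M itself, would then be passed to the downstream classifier, while S collects the entries treated as gross errors.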

Authors

  • Divya Sitani
    JARA-Institute: Molecular Neuroscience and Neuroimaging, Institute for Neuroscience and Medicine INM-11/JARA-BRAIN Institute JBI-2, Forschungszentrum Jülich GmbH, Jülich, Germany.
  • Alejandro Giorgetti
    Institute for Advanced Simulations IAS-5 / Institute for Neuroscience and Medicine INM-9, Computational Biomedicine, Forschungszentrum Jülich GmbH, Jülich, Germany.
  • Mercedes Alfonso-Prieto
    Institute for Advanced Simulations IAS-5 / Institute for Neuroscience and Medicine INM-9, Computational Biomedicine, Forschungszentrum Jülich GmbH, Jülich, Germany.
  • Paolo Carloni
    JARA-Institute: Molecular Neuroscience and Neuroimaging, Institute for Neuroscience and Medicine INM-11/JARA-BRAIN Institute JBI-2, Forschungszentrum Jülich GmbH, Jülich, Germany.