pyRBDome: a comprehensive computational platform for enhancing RNA-binding proteome data.

Journal: Life Science Alliance
PMID:

Abstract

High-throughput proteomics approaches have revolutionised the identification of RNA-binding proteins (RBPome) and RNA-binding sequences (RBDome) across organisms. Yet the extent of noise, including false positives, associated with these methodologies is difficult to quantify because experimental approaches for validating the results are generally low throughput. To address this, we introduce pyRBDome, a pipeline for enhancing RNA-binding proteome data in silico. It aligns the experimental results with RNA-binding site (RBS) predictions from distinct machine-learning tools and integrates high-resolution structural data when available. Its statistical evaluation of RBDome data enables quick identification of likely genuine RNA-binders in experimental datasets. Furthermore, by leveraging the pyRBDome results, we have enhanced the sensitivity and specificity of RBS detection by training new ensemble machine-learning models. pyRBDome analysis of a human RBDome dataset, compared with known structural data, revealed that although UV-cross-linked amino acids were more likely to contain predicted RBSs, they infrequently bind RNA in high-resolution structures. This discrepancy underscores the limitations of structural data as benchmarks, positioning pyRBDome as a valuable alternative for increasing confidence in RBDome datasets.

Authors

  • Liang-Cui Chu
    Centre for Engineering Biology, University of Edinburgh, Edinburgh, UK.
  • Niki Christopoulou
    Centre for Engineering Biology, University of Edinburgh, Edinburgh, UK.
  • Hugh McCaughan
    Centre for Engineering Biology, University of Edinburgh, Edinburgh, UK.
  • Sophie Winterbourne
    Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, UK.
  • Davide Cazzola
    Centre for Engineering Biology, University of Edinburgh, Edinburgh, UK.
  • Shichao Wang
    Centre for Engineering Biology, University of Edinburgh, Edinburgh, UK.
  • Ulad Litvin
    Centre for Engineering Biology, University of Edinburgh, Edinburgh, UK.
  • Salomé Brunon
    Centre for Engineering Biology, University of Edinburgh, Edinburgh, UK.
  • Patrick JB Harker
    Centre for Engineering Biology, University of Edinburgh, Edinburgh, UK.
  • Iain McNae
    Institute of Quantitative Biology, Biochemistry and Biotechnology, University of Edinburgh, Edinburgh, UK.
  • Sander Granneman
    Centre for Engineering Biology, University of Edinburgh, Edinburgh, UK. Sander.Granneman@ed.ac.uk