Challenges and Opportunities for Bayesian Statistics in Proteomics.

Journal: Journal of Proteome Research
Published Date:

Abstract

Proteomics is a data-rich science with complex experimental designs and an intricate measurement process. To obtain insights from the large data sets produced, statistical methods, including machine learning, are routinely applied. For a quantity of interest, many of these approaches only produce a point estimate, such as a mean, leaving little room for more nuanced interpretations. By contrast, Bayesian statistics allows quantification of uncertainty through the use of probability distributions. These probability distributions enable scientists to ask complex questions of their proteomics data. Bayesian statistics also offers a modular framework for data analysis by making dependencies between data and parameters explicit. Hence, specifying complex hierarchies of parameter dependencies is straightforward in the Bayesian framework. This allows us to use a statistical methodology that matches, rather than neglects, the sophistication of experimental design and instrumentation present in proteomics. Here, we review Bayesian methods applied to proteomics, demonstrating their potential power, alongside the challenges posed by adopting this new statistical framework. To illustrate our review, we give a walk-through of the development of a Bayesian model for dynamic orthogonal organic phase separation (OOPS) data.
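
The contrast the abstract draws between a point estimate and a full probability distribution can be illustrated with a minimal Python sketch. It is not taken from the paper: the replicate values are hypothetical, and the model (a normal likelihood with known noise standard deviation and a normal prior on the mean) is an assumption chosen only because its posterior has a closed form. The function name posterior_normal_mean and the 0.5 threshold are likewise illustrative.

```python
from statistics import NormalDist, fmean


def posterior_normal_mean(data, prior_mean, prior_sd, noise_sd):
    """Conjugate normal-normal update for an unknown mean (noise sd assumed known)."""
    n = len(data)
    prior_precision = 1.0 / prior_sd**2
    data_precision = n / noise_sd**2
    post_var = 1.0 / (prior_precision + data_precision)
    # Posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = post_var * (prior_precision * prior_mean + data_precision * fmean(data))
    return NormalDist(post_mean, post_var**0.5)


# Hypothetical log2 fold changes for one protein across four replicates.
log_fold_changes = [0.8, 1.1, 0.6, 1.3]

# A point estimate summarises the data with a single number.
point_estimate = fmean(log_fold_changes)

# The posterior is a whole distribution over the unknown mean.
posterior = posterior_normal_mean(
    log_fold_changes, prior_mean=0.0, prior_sd=1.0, noise_sd=0.5
)

# 95% credible interval for the mean log2 fold change.
lower, upper = posterior.inv_cdf(0.025), posterior.inv_cdf(0.975)

# A question a point estimate cannot answer directly:
# how probable is it that the true mean exceeds 0.5?
prob_above_threshold = 1.0 - posterior.cdf(0.5)

print(f"point estimate: {point_estimate:.3f}")
print(f"posterior mean: {posterior.mean:.3f}, 95% interval: ({lower:.3f}, {upper:.3f})")
print(f"P(mean > 0.5): {prob_above_threshold:.3f}")
```

In this sketch the posterior mean is pulled from the sample mean toward the prior mean, and the interval and exceedance probability come directly from the posterior. Realistic proteomics models, such as the hierarchical ones the review discusses, generally have no closed form and would need sampling or approximation instead.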

Authors

  • Oliver M Crook
    Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom.
  • Chun-Wa Chung
    Structural and Biophysical Sciences, GlaxoSmithKline R&D, Stevenage SG1 2NY, United Kingdom.
  • Charlotte M Deane
    Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, United Kingdom.