A systematic exploration of ΔΔG cutoff ranges in machine learning models for protein mutation stability prediction.

Journal: Journal of bioinformatics and computational biology
Published Date:

Abstract

Discerning how a mutation affects the stability of a protein is central to the study of a wide range of diseases. Mutagenesis experiments on physical proteins provide precise insights into the effects of amino acid substitutions, but such studies are time- and cost-prohibitive. Computational approaches that inform experimentalists where to allocate wet-lab resources are available, including a variety of machine learning models. Assessing the accuracy of machine learning models for predicting the effects of mutations depends on in vitro experiments on amino acid substitutions. When similar experiments on physical proteins have been performed by multiple laboratories, the use of the data near the juncture of stabilizing and destabilizing mutations is questionable. In this work, we explore a systematic and principled alternative to discarding experimental data close to that juncture. We model the inconclusive range of experimental ΔΔG values via 3- and 5-way classifiers, and systematically explore potential boundaries for the range of inconclusive experimental values. We demonstrate the effectiveness of potential boundaries through confusion matrices and heat map visualizations. We explore two novel metrics for assessing viable cutoff ranges, and find that under these metrics, a lower cutoff near [Formula: see text] and an upper cutoff near [Formula: see text] are optimal across multiple machine learning models.
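
To make the 3-way setup concrete, the following is a minimal sketch, not the authors' implementation. It assigns each ΔΔG value to one of three classes using a lower and an upper cutoff, then tallies a confusion matrix of measured versus predicted classes. The function names, the sign convention (negative values treated as destabilizing), and the cutoffs of -0.5 and 0.5 are assumptions chosen purely for illustration; they are not the cutoffs reported in the abstract.

```python
from collections import Counter

# Class order used for the rows and columns of the confusion matrix.
LABELS = ("destabilizing", "inconclusive", "stabilizing")


def label_ddg(ddg, lower, upper):
    """Map a ddG value to one of three classes given two cutoffs.

    Assumption: values below `lower` are destabilizing and values above
    `upper` are stabilizing. Some datasets use the opposite sign convention.
    """
    if lower > upper:
        raise ValueError("lower cutoff must not exceed upper cutoff")
    if ddg < lower:
        return "destabilizing"
    if ddg > upper:
        return "stabilizing"
    return "inconclusive"


def confusion_matrix(measured, predicted, lower, upper):
    """Return a 3x3 matrix: rows are measured classes, columns predicted."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have equal length")
    counts = Counter(
        (label_ddg(m, lower, upper), label_ddg(p, lower, upper))
        for m, p in zip(measured, predicted)
    )
    return [[counts[(row, col)] for col in LABELS] for row in LABELS]


# Hypothetical values, invented for illustration only.
measured = [-1.2, -0.3, 0.1, 0.9, 2.0]
predicted = [-0.8, 0.7, -0.1, 0.4, 1.5]
matrix = confusion_matrix(measured, predicted, lower=-0.5, upper=0.5)
for name, row in zip(LABELS, matrix):
    print(f"{name:>13}: {row}")
```

Sweeping `lower` and `upper` over a grid and recomputing the matrix for each pair is one way to explore candidate cutoff ranges; the metrics used to compare them would need to be taken from the paper itself.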

Authors

  • Richard Olney
    Western Washington University, Bellingham, WA, USA.
  • Aaron Tuor
    Pacific Northwest National Laboratory, Seattle, WA, USA.
  • Filip Jagodzinski
    Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA. filip.jagodzinski@wwu.edu.
  • Brian Hutchinson
    Department of Computer Science, Western Washington University, Bellingham, WA 98225, USA. Brian.Hutchinson@wwu.edu.