fastISM: performant in silico saturation mutagenesis for convolutional neural networks.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Deep-learning models, such as convolutional neural networks, are able to accurately map biological sequences to associated functional readouts and properties by learning predictive de novo representations. In silico saturation mutagenesis (ISM) is a popular feature attribution technique for inferring contributions of all characters in an input sequence to the model's predicted output. The main drawback of ISM is its runtime, as it requires a separate forward propagation through the trained model for every possible mutation of each character in the input sequence to predict the effect on the output.
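
To make the runtime cost concrete, the following is a minimal sketch of naive ISM, not the fastISM method itself. It assumes a DNA alphabet, a one-hot encoding, and a hypothetical `predict` callable that maps a batch of shape (N, L, 4) to N scalar outputs; `one_hot`, `toy_model`, and `naive_ism` are illustrative names that do not come from the paper or its software.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed DNA alphabet; other sequence types would differ


def one_hot(seq):
    """Encode a sequence as an (L, 4) one-hot matrix."""
    idx = np.array([ALPHABET.index(c) for c in seq])
    x = np.zeros((len(seq), len(ALPHABET)), dtype=np.float32)
    x[np.arange(len(seq)), idx] = 1.0
    return x


def toy_model(batch, weights):
    """Hypothetical stand-in for a trained model: a linear scorer.

    batch has shape (N, L, 4), weights has shape (L, 4); returns N scores.
    """
    return np.tensordot(batch, weights, axes=([1, 2], [0, 1]))


def naive_ism(seq, predict):
    """Return an (L, 4) matrix of output changes relative to the reference.

    Entry [pos, a] is predict(sequence with character a at pos) minus
    predict(reference). Every (position, character) pair gets its own
    full forward pass, which is the runtime cost described above.
    """
    x = one_hot(seq)
    length, n_chars = x.shape
    ref = predict(x[None])[0]

    # One copy of the full sequence per (position, character) pair.
    mutants = np.repeat(x[None], length * n_chars, axis=0)
    for pos in range(length):
        for a in range(n_chars):
            k = pos * n_chars + a
            mutants[k, pos, :] = 0.0
            mutants[k, pos, a] = 1.0

    preds = predict(mutants)  # length * n_chars forward passes
    return preds.reshape(length, n_chars) - ref
```

For a sequence of length L this sketch evaluates the model on 4L full-length inputs, L of which simply reproduce the reference, so the cost grows linearly with sequence length on top of the cost of a single forward pass. With a real convolutional network, `predict` would wrap the trained model's batch prediction call.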

Authors

  • Surag Nair
    Department of Computer Science, Stanford University, Stanford, CA 94305, USA.
  • Avanti Shrikumar
    Department of Computer Science, Stanford University, Stanford, CA, USA.
  • Jacob Schreiber
    Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA.
  • Anshul Kundaje
    Department of Computer Science, Stanford University, Stanford, CA, USA.