Enhanced Integrated Gradients: improving interpretability of deep learning models using splicing codes as a case study.

Journal: Genome biology
Published Date:

Abstract

Despite the success and rapid adoption of deep learning models in biomedical domains, their lack of interpretability remains an issue. Here, we introduce Enhanced Integrated Gradients (EIG), a method to identify significant features associated with a specific prediction task. Using RNA splicing prediction and digit classification as case studies, we demonstrate that EIG improves upon the original Integrated Gradients method and produces sets of informative features. We then apply EIG to identify A1CF as a key regulator of liver-specific alternative splicing, supporting this finding with subsequent analysis of relevant A1CF functional (RNA-seq) and binding (PAR-CLIP) data.
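
For readers unfamiliar with the baseline method, the following is a minimal sketch of the original Integrated Gradients attribution that the abstract says EIG improves upon. It is not the EIG method and not the authors' code. The toy logistic model, its weights `W` and `B`, the all-zeros baseline, and the midpoint approximation with `steps=200` are all assumptions chosen only for illustration. Each feature's attribution is its difference from the baseline multiplied by the model gradient averaged along the straight-line path from baseline to input.

```python
import numpy as np

# Hypothetical toy model (not from the paper): one logistic unit with fixed weights.
W = np.array([1.5, -2.0, 0.5])
B = 0.1


def model(x):
    """Return the model output F(x) for a single input vector x."""
    return 1.0 / (1.0 + np.exp(-(x @ W + B)))


def model_grad(x):
    """Return the analytic gradient dF/dx of the logistic unit at x."""
    p = model(x)
    return p * (1.0 - p) * W


def integrated_gradients(x, baseline, steps=200):
    """Approximate Integrated Gradients along the straight line baseline -> x."""
    # Midpoint rule: evaluate the gradient at the centre of each sub-interval of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)  # shape (steps, n_features)
    grads = np.stack([model_grad(point) for point in path])
    # Average gradient along the path, scaled by the input-baseline difference.
    return (x - baseline) * grads.mean(axis=0)


x = np.array([1.0, -0.5, 2.0])
baseline = np.zeros_like(x)  # assumed reference point; other choices are possible

attributions = integrated_gradients(x, baseline)

# Completeness check: attributions should sum to F(x) - F(baseline),
# up to the numerical error of the path approximation.
completeness_gap = abs(attributions.sum() - (model(x) - model(baseline)))
print(attributions, completeness_gap)
```

The completeness check at the end is a useful sanity test for any implementation of this kind: if the attributions do not sum to approximately the difference between the model output at the input and at the baseline, the number of path steps is likely too small or the gradient is wrong.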

Authors

  • Anupama Jha
    Department of Computer and Information Science, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, USA.
  • Joseph K Aicher
    Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA.
  • Matthew R Gazzara
    Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA.
  • Deependra Singh
    Department of Computer and Information Science, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, USA.
  • Yoseph Barash
    Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario M5S 3G4, Canada. Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada. School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.