XOmiVAE: an interpretable deep learning model for cancer classification using high-dimensional omics data.

Journal: Briefings in Bioinformatics
Published Date:

Abstract

The lack of explainability is one of the most prominent disadvantages of deep learning applications in omics. This 'black box' problem can undermine the credibility of biomedical deep learning models and limit their practical implementation. Here we present XOmiVAE, a variational autoencoder (VAE)-based interpretable deep learning model for cancer classification using high-dimensional omics data. XOmiVAE can reveal the contribution of each gene and each latent dimension to every classification prediction, as well as the correlation between each gene and each latent dimension. We also demonstrate that XOmiVAE can explain not only the supervised classification but also the unsupervised clustering results produced by the deep learning network. To the best of our knowledge, XOmiVAE is one of the first activation level-based interpretable deep learning models able to explain novel clusters generated by a VAE. The explanations produced by XOmiVAE were validated both by the performance of downstream tasks and by established biomedical knowledge. In our experiments, XOmiVAE's explanations of deep learning-based cancer classification and clustering aligned with current domain knowledge, including biological annotations and the academic literature, which shows great potential for novel biomedical knowledge discovery from deep learning models.
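
For a concrete picture of what such gene-level explanations look like in practice, the sketch below shows one way attributions could be computed for a VAE-based omics classifier. It is a minimal illustration only: the OmicsVAEClassifier architecture, the layer sizes, and the use of SHAP's DeepExplainer are assumptions made for demonstration and should not be read as the paper's actual implementation.

```python
import torch
import torch.nn as nn
import shap  # assumed attribution library; the paper's own method may differ

class OmicsVAEClassifier(nn.Module):
    """Hypothetical VAE-style encoder with a classification head on the latent mean."""
    def __init__(self, n_genes: int, latent_dim: int = 128, n_classes: int = 33):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_genes, 512), nn.ReLU())
        self.fc_mu = nn.Linear(512, latent_dim)      # latent mean
        self.fc_logvar = nn.Linear(512, latent_dim)  # latent log-variance (used in the VAE loss)
        self.classifier = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, n_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.encoder(x)
        mu = self.fc_mu(h)            # classify from the deterministic latent mean
        return self.classifier(mu)

# Toy data standing in for omics profiles: rows = samples, columns = genes.
n_genes = 1000
model = OmicsVAEClassifier(n_genes).eval()
background = torch.randn(50, n_genes)    # reference samples for the explainer
tumour_samples = torch.randn(5, n_genes) # samples whose predictions we want to explain

# DeepExplainer yields per-gene contributions to each class prediction for every sample.
explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(tumour_samples)
```

In this sketch the attributions are computed from the latent mean, which keeps the explanation deterministic; the same idea extends to latent dimensions by treating them, rather than the class logits, as the model output being explained.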

Authors

  • Eloise Withnell
    Data Science Institute, Imperial College London, London SW7 2AZ, UK.
  • Xiaoyu Zhang
    Data Science Institute, Imperial College London, London SW7 2AZ, UK.
  • Kai Sun
    Data Science Institute, Imperial College London, London SW7 2AZ, UK.
  • Yike Guo
    Department of Computing, Imperial College London, London SW7 2AZ, UK. y.guo@imperial.ac.uk.