A deep learning approach reveals unexplored landscape of viral expression in cancer.

Journal: Nature Communications
Published Date:

Abstract

About 15% of human cancer cases are attributed to viral infections. To date, virus expression in tumor tissues has mostly been studied by aligning tumor RNA sequencing reads to databases of known viruses. To enable identification of divergent viruses and rapid characterization of the tumor virome, we develop viRNAtrap, an alignment-free pipeline that identifies viral reads and assembles viral contigs. The pipeline is built on a deep learning model trained to discriminate viral RNAseq reads; we apply it to explore viral expression in 14 cancer types from The Cancer Genome Atlas (TCGA). Using viRNAtrap, we uncover expression of unexpected and divergent viruses that have not previously been implicated in cancer, and we identify human endogenous viruses whose expression is associated with poor overall survival. The viRNAtrap pipeline provides a way forward for studying viral infections associated with different clinical conditions.
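
The two stages named in the abstract, scoring reads without alignment and then assembling contigs from the reads that score as viral, can be illustrated with a minimal Python sketch. This is not the viRNAtrap implementation: the function names (one_hot, select_seeds, extend_right), the threshold, the k-mer length, and the greedy extension strategy are all assumptions chosen for illustration, and the scoring model is left as a caller-supplied function standing in for a trained neural network.

```python
from typing import Callable, List


def one_hot(read: str) -> List[List[int]]:
    """Encode a read as a len(read) x 4 matrix over A, C, G, T.

    Bases outside ACGT (for example N) become all-zero rows.
    This is a common input encoding for sequence models (assumed here).
    """
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    rows = []
    for base in read.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def select_seeds(
    reads: List[str],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep reads whose model score meets the threshold.

    `score` is a hypothetical stand-in for a trained classifier that
    returns a viral probability; the threshold value is arbitrary.
    """
    return [read for read in reads if score(read) >= threshold]


def extend_right(seed: str, reads: List[str], k: int = 6, max_len: int = 500) -> str:
    """Greedily extend a seed to the right.

    At each step, find a read containing the contig's last k bases and
    append whatever that read has after the match. Stops when no read
    extends the contig or the contig reaches max_len.
    """
    contig = seed
    extended = True
    while extended and len(contig) < max_len:
        extended = False
        tail = contig[-k:]
        for read in reads:
            pos = read.find(tail)
            if pos != -1 and pos + k < len(read):
                contig += read[pos + k:]
                extended = True
                break
    return contig
```

In this sketch a caller would pass its own scoring function to select_seeds and then call extend_right on each seed against the full read set. A practical assembler would also extend leftward, handle reverse complements, and guard against repeats more carefully than the max_len cap does here.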

Authors

  • Abdurrahman Elbasir
    College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar.
  • Ying Ye
    The Wistar Institute, Philadelphia, PA, 19104, USA.
  • Daniel E Schäffer
    The Wistar Institute, Philadelphia, PA, 19104, USA.
  • Xue Hao
    The Wistar Institute, Philadelphia, PA, 19104, USA.
  • Jayamanna Wickramasinghe
    The Wistar Institute, Philadelphia, PA, 19104, USA.
  • Konstantinos Tsingas
    The Wistar Institute, Philadelphia, PA, 19104, USA.
  • Paul M Lieberman
    The Wistar Institute, Philadelphia, PA, 19104, USA.
  • Qi Long
    Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, USA.
  • Quaid Morris
    Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario M5S 3G4, Canada. Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada. Program on Genetic Networks and Program on Neural Computation & Adaptive Perception, Canadian Institute for Advanced Research, Toronto, Ontario M5G 1Z8, Canada. Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada.
  • Rugang Zhang
    The Wistar Institute, Philadelphia, PA, 19104, USA.
  • Alejandro A Schäffer
    Cancer Data Science Laboratory (CDSL), National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, USA.
  • Noam Auslander
    National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.