Accurate somatic variant detection using weakly supervised deep learning.

Journal: Nature Communications
Published Date:

Abstract

Identification of somatic mutations in tumor samples is commonly based on statistical methods in combination with heuristic filters. Here we develop VarNet, an end-to-end deep learning approach for identification of somatic variants from aligned tumor and matched normal DNA reads. VarNet is trained using image representations of 4.6 million high-confidence somatic variants annotated in 356 tumor whole genomes. We benchmark VarNet across a range of publicly available datasets, demonstrating performance often exceeding current state-of-the-art methods. Overall, our results demonstrate how a scalable deep learning approach could augment and potentially supplant human-engineered features and heuristic filters in somatic variant calling.

Authors

  • Kiran Krishnamachari
    Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore.
  • Dylan Lu
    Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore.
  • Alexander Swift-Scott
    Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore.
  • Anuar Yeraliyev
    Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore.
  • Kayla Lee
    Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore.
  • Weitai Huang
    Computational and Systems Biology 3, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore. huangwt@gis.a-star.edu.sg.
  • Sim Ngak Leng
    Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore.
  • Anders Jacobsen Skanderup
    Computational and Systems Biology 3, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore.