A universal SNP and small-indel variant caller using deep neural networks.

Journal: Nature Biotechnology
Published Date:

Abstract

Despite rapid advances in sequencing technologies, accurately calling genetic variants present in an individual genome from billions of short, errorful sequence reads remains challenging. Here we show that a deep convolutional neural network can call genetic variation in aligned next-generation sequencing read data by learning statistical relationships between images of read pileups around putative variant sites and true genotype calls. The approach, called DeepVariant, outperforms existing state-of-the-art tools. The learned model generalizes across genome builds and mammalian species, allowing nonhuman sequencing projects to benefit from the wealth of human ground-truth data. We further show that DeepVariant can learn to call variants in a variety of sequencing technologies and experimental designs, including deep whole genomes from 10X Genomics and Ion Ampliseq exomes, highlighting the benefits of using more automated and generalizable techniques for variant calling.
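
To make the idea of an "image of a read pileup" concrete, the following is a minimal sketch of how aligned reads around a candidate site might be turned into a fixed-size grid that a convolutional network could consume. It is an illustration only and not DeepVariant's actual encoding: the function name `encode_pileup`, the two-value pixel layout (base code, mismatch flag), the window width, and the assumption of gapless alignments are all hypothetical choices made for this example.

```python
BASE_CODE = {"A": 1, "C": 2, "G": 3, "T": 4}  # 0 is reserved for "no coverage"


def encode_pileup(ref, reads, center, width=5, max_reads=4):
    """Encode a window of gapless alignments as rows of (base, mismatch) pixels.

    ref:    reference sequence (str)
    reads:  list of (start, sequence) tuples; start is 0-based on ref
    center: 0-based position of the candidate variant site
    All of these names and the pixel layout are assumptions for illustration.
    """
    half = width // 2
    window = range(center - half, center - half + width)

    def pixel(base, pos):
        in_ref = 0 <= pos < len(ref)
        mismatch = int(in_ref and base is not None and base != ref[pos])
        return (BASE_CODE.get(base, 0), mismatch)

    # Row 0 carries the reference bases themselves.
    rows = [[pixel(ref[p] if 0 <= p < len(ref) else None, p) for p in window]]

    # One row per read, with empty pixels where the read does not cover the window.
    for start, seq in reads[:max_reads]:
        row = []
        for p in window:
            offset = p - start
            base = seq[offset] if 0 <= offset < len(seq) else None
            row.append(pixel(base, p))
        rows.append(row)

    # Pad with empty rows so every image has the same height.
    while len(rows) < max_reads + 1:
        rows.append([(0, 0)] * len(window))
    return rows


# Two toy reads over a toy reference; the second read mismatches at position 4.
image = encode_pileup("ACGTACGTAC", [(2, "GTACG"), (3, "TGCGT")], center=4)
```

In a supervised setup of the kind the abstract describes, each such grid would be paired with a known genotype label at the candidate site, and the network would be trained to predict the label from the grid.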

Authors

  • Ryan Poplin
    Google Research, Google, Mountain View, CA, USA.
  • Pi-Chuan Chang
    Google Inc., Mountain View, CA 94043, USA.
  • David Alexander
    Department of Biomolecular Engineering, The University of California at Santa Cruz, Santa Cruz, California, United States of America.
  • Scott Schwartz
    Google Inc., Mountain View, California, USA.
  • Thomas Colthurst
    Google Inc., Mountain View, California, USA.
  • Alexander Ku
    Google Inc., Mountain View, California, USA.
  • Dan Newburger
    Verily Life Sciences, Mountain View, California, USA.
  • Jojo Dijamco
    Verily Life Sciences, Mountain View, California, USA.
  • Nam Nguyen
    Verily Life Sciences, Mountain View, California, USA.
  • Pegah T Afshar
    Verily Life Sciences, Mountain View, California, USA.
  • Sam S Gross
    Verily Life Sciences, Mountain View, California, USA.
  • Lizzie Dorfman
    Verily Life Sciences, Mountain View, California, USA.
  • Cory Y McLean
    Google Brain, Cambridge, Massachusetts 02142, USA.
  • Mark A DePristo
    Verily Life Sciences, Mountain View, California, USA.