Validation of genetic variants from NGS data using deep convolutional neural networks.

Journal: BMC Bioinformatics
Published Date:

Abstract

Accurate somatic variant calling from next-generation sequencing data is one of the most important tasks in personalised cancer therapy. The sophistication of the available technologies is ever-increasing, yet manual candidate refinement is still a necessary step in state-of-the-art processing pipelines. This limits reproducibility and introduces a bottleneck with respect to scalability. We demonstrate that the validation of genetic variants can be improved using a machine learning approach based on a Convolutional Neural Network trained on existing human annotations. In contrast to existing approaches, we introduce a way to incorporate contextual data from sequencing tracks into the automated assessment. A rigorous evaluation shows that the resulting model is robust and performs on par with trained researchers following a published standard operating procedure.

Authors

  • Marc Vaisband
    Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria. vaisband@uni-bonn.de.
  • Maria Schubert
    Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria.
  • Franz Josef Gassner
    Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center; Salzburg Cancer Research Institute - Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR); Cancer Cluster Salzburg, Paracelsus Medical University, Salzburg, Austria.
  • Roland Geisberger
    Salzburg Cancer Research Institute-Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR), 5020 Salzburg, Austria.
  • Richard Greil
    Department of Internal Medicine III with Haematology, Medical Oncology, Haemostaseology, Infectiology and Rheumatology, Oncologic Center, Paracelsus Medical University, 5020 Salzburg, Austria.
  • Nadja Zaborsky
    Salzburg Cancer Research Institute-Laboratory for Immunological and Molecular Cancer Research (SCRI-LIMCR), 5020 Salzburg, Austria.
  • Jan Hasenauer
    Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, 85764, Neuherberg, Germany. jan.hasenauer@uni-bonn.de.