Using Machine Learning to Identify True Somatic Variants from Next-Generation Sequencing.

Journal: Clinical Chemistry
Published Date:

Abstract

BACKGROUND: Molecular profiling has become essential for tumor risk stratification and treatment selection. However, cancer genome complexity and technical artifacts make it challenging to identify real variants. Currently, clinical laboratories rely on manual screening, which is costly, subjective, and not scalable. We present a machine learning-based method to distinguish artifacts from bona fide single-nucleotide variants (SNVs) detected by next-generation sequencing in non-formalin-fixed paraffin-embedded tumor specimens.

Authors

  • Chao Wu
  • Xiaonan Zhao
    Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA.
  • Mark Welsh
    Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA.
  • Kellianne Costello
    College of Science and Technology, Temple University, Philadelphia, PA.
  • Kajia Cao
    State Key Laboratory of Oncology in South China, Collaborative Innovation Center of Cancer Medicine, Guangzhou, 510060, P. R. China.
  • Ahmad Abou Tayoun
    Department of Genetics, Al Jalila Children's Specialty Hospital, Dubai, UAE.
  • Marilyn Li
    Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA.
  • Mahdi Sarmady
    Division of Genomic Diagnostics, The Children's Hospital of Philadelphia, Philadelphia, PA.