Ensemble-Based Somatic Mutation Calling in Cancer Genomes.

Journal: Methods in molecular biology (Clifton, N.J.)
Published Date:

Abstract

Identification of somatic mutations in tumor tissue is challenged by technical artifacts, diverse somatic mutational processes, and genetic heterogeneity within tumors. Indeed, recent independent benchmark studies have revealed low concordance between different somatic mutation callers. Here, we describe the Somatic Mutation calling method using a Random Forest (SMuRF), a portable ensemble method that combines the predictions and auxiliary features from individual mutation callers using supervised machine learning. SMuRF improves prediction accuracy for both somatic point mutations (single nucleotide variants; SNVs) and small insertions/deletions (indels) in cancer genomes and exomes. We also provide a tutorial on the installation and application of SMuRF.
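
As an illustration of the idea of combining predictions and auxiliary features from individual callers, the following minimal Python sketch assembles one feature row per candidate variant and measures pairwise concordance between call sets. It is not the SMuRF implementation: the caller names, the variant coordinates, and the single auxiliary feature (a variant allele fraction, `vaf`) are hypothetical placeholders, and the classifier itself is omitted. Rows of this kind are what a supervised model such as a random forest could be trained on.

```python
from itertools import combinations

# Hypothetical calls: caller name -> {variant key: auxiliary features}.
# Variant keys are (chrom, pos, ref, alt); every value here is made up.
calls = {
    "caller_a": {
        ("chr1", 1000, "A", "T"): {"vaf": 0.31},
        ("chr1", 2000, "G", "C"): {"vaf": 0.05},
    },
    "caller_b": {
        ("chr1", 1000, "A", "T"): {"vaf": 0.29},
        ("chr2", 500, "C", "CT"): {"vaf": 0.12},
    },
    "caller_c": {
        ("chr1", 1000, "A", "T"): {"vaf": 0.30},
        ("chr1", 2000, "G", "C"): {"vaf": 0.04},
    },
}


def jaccard(a, b):
    """Concordance of two call sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def build_feature_rows(calls):
    """Return one row per candidate variant: a 0/1 flag and a feature per caller."""
    callers = sorted(calls)
    variants = sorted(set().union(*(calls[c].keys() for c in callers)))
    rows = []
    for v in variants:
        row = {"variant": v}
        for c in callers:
            hit = calls[c].get(v)
            row[f"{c}_called"] = int(hit is not None)
            # None marks a feature that is absent because the caller
            # did not report this site.
            row[f"{c}_vaf"] = hit["vaf"] if hit else None
        rows.append(row)
    return rows


# Pairwise concordance between callers, keyed by (caller, caller).
concordance = {
    (a, b): jaccard(set(calls[a]), set(calls[b]))
    for a, b in combinations(sorted(calls), 2)
}
rows = build_feature_rows(calls)
```

In this toy input, the per-caller flags and features form the columns, so disagreement between callers becomes information a downstream classifier can weigh rather than something resolved by a fixed voting rule.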

Authors

  • Weitai Huang
    Computational and Systems Biology 3, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore. huangwt@gis.a-star.edu.sg.
  • Yu Amanda Guo
    Computational and Systems Biology 3, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore.
  • Mei Mei Chang
    Computational and Systems Biology 3, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore.
  • Anders Jacobsen Skanderup
    Computational and Systems Biology 3, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore.