AncestryGeni: A novel genetic ancestry classification pipeline for small and noisy sequence data.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Efforts to address health disparities are often limited by the lack of robust computational tools for inferring genetic ancestry by calculating an individual's genetic similarity to continental groups. We have previously shown that a preferred alternative to self-described race is to use ancestry informative markers (AIMs): the markers are classified into ancestral components, and an individual's components are compared with those of known populations to identify continental groups. However, real-world genomic data present challenges that limit the application of existing solutions, including limited availability of germline DNA, a small number of AIMs per sample, and the use of different variant-calling software.

Authors

  • Eran Elhaik
    Department of Biology, Lund University, Sölvegatan 35, 22362 Lund, Sweden.
  • Sara Behnamian
    Centre for GeoGenetics, Globe Institute, University of Copenhagen, Denmark.
  • Michael Howe
    Division of Hematology, Department of Internal Medicine, Mayo Clinic, Rochester, MN.
  • Hongwei Tang
    State Key Laboratory of ASIC and System, School of Microelectronics, Fudan University, Shanghai 200433, China.
  • Huihuang Yan
    Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN.
  • Shulan Tian
    Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN.
  • Suganti Shivaram
    Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA.
  • Cinthya Zepeda Mendoza
    Division of Hematopathology, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN.
  • Kylee MacLachlan
    Myeloma Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY.
  • Saad Usmani
    Department of Internal Medicine, Adult Bone Marrow Transplant Service, Memorial Sloan Kettering Cancer Center, New York, NY.
  • Mehdi Pirooznia
    Bioinformatics and Computational Biology Core, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA. mehdi.pirooznia@nih.gov.
  • Gareth Morgan
    Multiple Myeloma Research Program, Perlmutter Cancer Center, NYU Langone Medical Center, New York, NY.
  • Patrick Blaney
    Multiple Myeloma Research Program, Perlmutter Cancer Center, NYU Langone Medical Center, New York, NY.
  • Francesco Maura
    Myeloma Program, Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL, USA.
  • Linda B Baughn
    Division of Hematopathology, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA.
