Expanding biobank pharmacogenomics through machine learning calls of structural variation.

Journal: Genetics
Published Date:

Abstract

Biobanks linking genetic data with clinical health records provide exciting opportunities for pharmacogenomic (PGx) research on genetic variation and drug response. Designed as central and multi-use resources, biobanks can facilitate diverse PGx research efforts, including the study of drug efficacy and adverse effects. Specialized PGx alleles and phenotypes are critical for such studies and can be conveniently called from existing array-based genotypes routinely collected in most biobanks. We describe a central callset of PGx alleles and phenotypes in over 80,000 participants of the Michigan Genomics Initiative (MGI) biobank, created using the PyPGx software on TOPMed imputed genotypes. The array-based PGx allele calls demonstrate concordance (>92%) with a set of PCR-validated alleles collected during clinical care, but do not identify PGx alleles dependent on structural variation, including the clinically important CYP2D6*5 deletion. To address this, we developed a support vector machine trained on genotype array SNV probe intensities to classify CYP2D6*5 carriers. This method had >99% accuracy and reclassified ∼7% of African American and ∼4% of White MGI participants to lower activity metabolizer phenotypes, predicting higher risks of adverse drug reactions. We demonstrate that central PGx callsets created with existing tools and genetic data can be augmented by customized calls for challenging alleles based on structural variants to broaden the research potential and clinical utility of biobanks. These PGx callsets can be created in biobanks with existing array-based genotype data and highlight the utility of advanced computational methods in PGx allele identification.
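
As an illustration of the classification idea described in the abstract, the following is a minimal sketch of a linear support vector machine trained on probe intensities to separate deletion carriers from non-carriers. It is not the authors' implementation: the abstract does not state the kernel, features, normalization, or hyperparameters used. The data here are synthetic, and the number of probes, the intensity distributions, and every function name (simulate_intensities, train_linear_svm, predict) are assumptions made for illustration only. The sketch assumes that losing one gene copy lowers the normalized intensity at probes within the deleted region.

```python
import random

N_PROBES = 8  # hypothetical number of SNV probes spanning the gene region


def simulate_intensities(n_per_class, seed):
    """Simulate normalized probe intensities (synthetic, illustrative only).

    Label +1 = deletion carrier (lower mean intensity), -1 = non-carrier.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label, mean in ((-1, 1.0), (1, 0.5)):
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 0.15) for _ in range(N_PROBES)])
            y.append(label)
    return X, y


def probe_means(X):
    """Per-probe mean intensity, used to center the features."""
    return [sum(col) / len(col) for col in zip(*X)]


def center(X, means):
    """Subtract the training-set probe means from every row."""
    return [[x - m for x, m in zip(row, means)] for row in X]


def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos-style stochastic subgradient descent on the
    L2-regularized hinge loss.

    A constant 1.0 is appended to each row so the last weight acts as a bias.
    """
    rng = random.Random(seed)
    rows = [row + [1.0] for row in X]
    w = [0.0] * len(rows[0])
    order = list(range(len(rows)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, rows[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:  # hinge loss active: step toward this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, rows[i])]
    return w


def predict(w, row):
    """Return +1 (predicted carrier) or -1 (predicted non-carrier)."""
    score = sum(wj * xj for wj, xj in zip(w, row + [1.0]))
    return 1 if score >= 0 else -1


# Train on one synthetic sample and evaluate on an independent one.
X_train, y_train = simulate_intensities(200, seed=0)
X_test, y_test = simulate_intensities(100, seed=1)
means = probe_means(X_train)
w = train_linear_svm(center(X_train, means), y_train)
predictions = [predict(w, row) for row in center(X_test, means)]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The synthetic classes are deliberately well separated, so the accuracy printed here says nothing about performance on real array data. In practice, an established library implementation and validation against orthogonally confirmed carriers (such as the PCR-validated alleles mentioned in the abstract) would be the more likely route.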

Authors

  • Brett Vanderwerff
    Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
  • Amy L Pasternak
    Department of Clinical Pharmacy, University of Michigan College of Pharmacy, University of Michigan, Ann Arbor, MI 48109, USA.
  • Lars G Fritsche
    Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
  • Emily Bertucci-Richter
    Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
  • Snehal Patil
    Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
  • Michael Boehnke
    Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
  • Xiang Zhou
    Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
  • Sebastian Zöllner
    Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
  • Daniel L Hertz
    Department of Clinical Pharmacy, University of Michigan College of Pharmacy, University of Michigan, Ann Arbor, MI 48109, USA.
  • Matthew Zawistowski
    Department of Biostatistics and Center for Statistical Genetics, University of Michigan School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA.
