Transfer learning enables prediction of CYP2D6 haplotype function.

Journal: PLoS computational biology
PMID:

Abstract

Cytochrome P450 2D6 (CYP2D6) is a highly polymorphic gene whose protein product metabolizes more than 20% of clinically used drugs. Genetic variation in CYP2D6 is responsible for interindividual heterogeneity in drug response that can lead to drug toxicity and ineffective treatment, making CYP2D6 one of the most important pharmacogenes. Prediction of CYP2D6 phenotype relies on curation of literature-derived functional studies to assign a functional status to CYP2D6 haplotypes. As the number of large-scale sequencing efforts grows, new haplotypes continue to be discovered, and keeping function assignments up to date is challenging. To address this challenge, we have trained a convolutional neural network, called Hubble.2D6, to predict the functional status of CYP2D6 haplotypes. Hubble.2D6 predicts haplotype function from sequence data and was trained using two pre-training steps with a combination of real and simulated data. We find that Hubble.2D6 predicts CYP2D6 haplotype functional status with 88% accuracy in a held-out test set and explains 47.5% of the variance in in vitro functional data among star alleles with unknown function. Hubble.2D6 may be a useful tool for assigning function to haplotypes whose function has not been curated, and for screening individuals who are at risk of being poor metabolizers.
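
The abstract does not say how sequence data is presented to the network. As a generic illustration only, the sketch below shows one-hot encoding, a common way to turn a DNA sequence into numeric input for a convolutional model. The function name `one_hot_encode` and the all-zero handling of ambiguous bases are assumptions made for this example, not a description of Hubble.2D6's actual preprocessing.

```python
# Illustrative sketch only: one-hot encoding of a DNA sequence.
# This is an assumed, generic encoding, not the published Hubble.2D6 pipeline.

BASES = "ACGT"  # column order of the indicator vector


def one_hot_encode(sequence):
    """Return one [A, C, G, T] indicator row per base.

    Bases outside A/C/G/T (for example N) become an all-zero row.
    """
    rows = []
    for base in sequence.upper():
        rows.append([1 if base == b else 0 for b in BASES])
    return rows


if __name__ == "__main__":
    example = "ACGTN"
    for base, row in zip(example, one_hot_encode(example)):
        print(base, row)
```

Each row of the result corresponds to one position in the sequence, so a sequence of length n yields an n-by-4 matrix that a convolutional layer can scan along the position axis.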

Authors

  • Gregory McInnes
    Biomedical Informatics Training Program, Stanford University, Stanford, California, United States of America.
  • Rachel Dalton
    Department of Biomedical and Pharmaceutical Sciences, University of Montana, Missoula, Montana, United States of America.
  • Katrin Sangkuhl
    Department of Biomedical Data Science, Stanford University, Stanford, California, United States of America.
  • Michelle Whirl-Carrillo
    Department of Biomedical Data Science, Stanford University, Stanford, California, United States of America.
  • Seung-Been Lee
    Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America.
  • Philip S Tsao
    VA Palo Alto Epidemiology Research and Information Center for Genomics, VAPAHCS, Palo Alto, California, United States of America.
  • Andrea Gaedigk
    Division of Clinical Pharmacology, Toxicology, & Therapeutic Innovation, Children's Mercy Kansas City, Kansas City, Missouri, United States of America.
  • Russ B Altman
    Departments of Medicine, Genetics and Bioengineering, Stanford University, Stanford, California, United States of America.
  • Erica L Woodahl
    Department of Biomedical and Pharmaceutical Sciences, University of Montana, Missoula, Montana, United States of America.