Modeling islet enhancers using deep learning identifies candidate causal variants at loci associated with T2D and glycemic traits.

Journal: Proceedings of the National Academy of Sciences of the United States of America
PMID:

Abstract

Genetic association studies have identified hundreds of independent signals associated with type 2 diabetes (T2D) and related traits. Despite these successes, the identification of specific causal variants underlying a genetic association signal remains challenging. In this study, we describe a deep learning (DL) method to analyze the impact of sequence variants on enhancers. Focusing on pancreatic islets, a T2D-relevant tissue, we show that our model learns islet-specific transcription factor (TF) regulatory patterns and can be used to prioritize candidate causal variants. At 101 genetic signals associated with T2D and related glycemic traits where multiple variants occur in linkage disequilibrium, our method nominates a single causal variant for each association signal, including three variants previously shown to alter reporter activity in islet-relevant cell types. For another signal associated with blood glucose levels, we test all candidate causal variants from statistical fine-mapping in a pancreatic islet beta cell line and show biochemical evidence of allelic effects on TF binding for the model-prioritized variant. To aid in future research, we publicly distribute our model and islet enhancer perturbation scores across ~67 million genetic variants. We anticipate that DL methods like the one presented in this study will enhance the prioritization of candidate causal variants for functional studies.
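
The abstract does not specify how a perturbation score is computed, so the following is only a minimal sketch of one common approach: score the reference sequence and the variant-carrying sequence with the same model, take the difference, and nominate the variant with the largest absolute change within a signal. Every name here (toy_enhancer_score, perturbation_score, prioritize, the motif, the example sequence, and the variant labels) is hypothetical, and the toy scoring function merely stands in for a trained model; none of it is taken from the authors' released code.

```python
from typing import Callable, List, Tuple

# A variant is (label, 0-based position in the sequence, alternate base).
Variant = Tuple[str, int, str]


def toy_enhancer_score(seq: str) -> float:
    """Hypothetical stand-in for a trained enhancer model.

    Counts occurrences of a made-up motif; a real model would return a
    predicted enhancer activity for the sequence.
    """
    motif = "TGACG"  # assumption: arbitrary motif chosen for illustration
    width = len(motif)
    return float(sum(seq[i:i + width] == motif
                     for i in range(len(seq) - width + 1)))


def perturbation_score(seq: str, pos: int, alt: str,
                       model: Callable[[str], float]) -> float:
    """Model score of the alternate-allele sequence minus the reference."""
    alt_seq = seq[:pos] + alt + seq[pos + 1:]
    return model(alt_seq) - model(seq)


def prioritize(seq: str, variants: List[Variant],
               model: Callable[[str], float]) -> Tuple[str, float]:
    """Return the variant with the largest absolute perturbation score."""
    scored = [(label, perturbation_score(seq, pos, alt, model))
              for label, pos, alt in variants]
    return max(scored, key=lambda item: abs(item[1]))


# Toy signal: three variants in one region, only var_b disrupts the motif.
SEQ = "AATGACGTTCCA"
VARIANTS: List[Variant] = [("var_a", 0, "C"), ("var_b", 4, "T"), ("var_c", 10, "G")]

if __name__ == "__main__":
    print(prioritize(SEQ, VARIANTS, toy_enhancer_score))
```

Ranking by absolute change treats gain and loss of predicted activity alike; whether the published method does the same is not stated in the abstract.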

Authors

  • Sanjarbek Hudaiberdiev
    Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20892.
  • D Leland Taylor
    Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892.
  • Wei Song
    School of Pharmaceutical Science, Jiangnan University, Wuxi, 214122, Jiangsu, China.
  • Narisu Narisu
    National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA.
  • Redwan M Bhuiyan
    The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032.
  • Henry J Taylor
    Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892.
  • Xuming Tang
    Department of Surgery, Weill Cornell Medicine, New York, NY 10065.
  • Tingfen Yan
    Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892.
  • Amy J Swift
    Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892.
  • Lori L Bonnycastle
    Center for Precision Health Research, National Human Genome Research Institute, NIH, Bethesda, MD 20892.
  • DIAMANTE Consortium
  • Shuibing Chen
    Department of Surgery, Weill Cornell Medicine, New York, NY 10065.
  • Michael L Stitzel
    The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
  • Michael R Erdos
    National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA.
  • Ivan Ovcharenko
    Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20892.
  • Francis S Collins
    National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA. Electronic address: collinsf@od.nih.gov.