Systematic analysis of binding of transcription factors to noncoding variants.

Journal: Nature
Published Date:

Abstract

Many sequence variants have been linked to complex human traits and diseases, but deciphering their biological functions remains challenging, as most of them reside in noncoding DNA. Here we have systematically assessed the binding of 270 human transcription factors to 95,886 noncoding variants in the human genome using an ultra-high-throughput multiplex protein-DNA binding assay, termed single-nucleotide polymorphism evaluation by systematic evolution of ligands by exponential enrichment (SNP-SELEX). The resulting 828 million measurements of transcription factor-DNA interactions enable estimation of the relative affinity of these transcription factors to each variant in vitro and evaluation of the current methods to predict the effects of noncoding variants on transcription factor binding. We show that the position weight matrices of most transcription factors lack sufficient predictive power, whereas the support vector machine combined with the gapped k-mer representation shows much improved performance when assessed on results from independent SNP-SELEX experiments involving a new set of 61,020 sequence variants. We report highly predictive models for 94 human transcription factors and demonstrate their utility in genome-wide association studies and understanding of the molecular pathways involved in diverse human traits and diseases.
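To make concrete the kind of position weight matrix (PWM) prediction the abstract evaluates, the following is a minimal sketch of scoring the two alleles of a variant with a PWM and taking the difference. It is not the authors' pipeline: the 4-bp matrix, the uniform background, the function names, and the forward-strand-only scan are all assumptions chosen for illustration.

```python
import math

# Hypothetical 4-position PWM (per-position base probabilities).
# The values are illustrative only and do not describe a real transcription factor.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def log_odds(window, pwm=PWM):
    """Sum of per-position log2 odds for a window as long as the PWM."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(pwm, window))


def best_window_score(seq, pwm=PWM):
    """Highest log-odds score over all forward-strand windows of the sequence."""
    width = len(pwm)
    return max(log_odds(seq[i:i + width], pwm) for i in range(len(seq) - width + 1))


def pwm_delta(ref_seq, alt_seq, pwm=PWM):
    """Predicted allelic effect: best alt-allele score minus best ref-allele score."""
    return best_window_score(alt_seq, pwm) - best_window_score(ref_seq, pwm)


# Reference allele carries the consensus ACGT; the alternate allele changes G to C.
delta = pwm_delta("TTACGTTT", "TTACCTTT")
print(f"PWM delta score: {delta:.3f}")
```

A negative delta predicts weaker binding to the alternate allele. A real analysis would also scan the reverse strand and use matrices derived from binding data.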

Authors

  • Jian Yan
    School of Medicine, Northwest University, Xi'an, China. jian.yan@cityu.edu.hk.
  • Yunjiang Qiu
    Ludwig Institute for Cancer Research, La Jolla, CA, USA.
  • André M Ribeiro Dos Santos
    Ludwig Institute for Cancer Research, La Jolla, CA, USA.
  • Yimeng Yin
    Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Solna, Sweden.
  • Yang E Li
    Ludwig Institute for Cancer Research, La Jolla, CA, USA.
  • Nick Vinckier
    Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
  • Naoki Nariai
    Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
  • Paola Benaglio
    Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
  • Anugraha Raman
    Ludwig Institute for Cancer Research, La Jolla, CA, USA.
  • Xiaoyu Li
    Department of Gastroenterology, The Affiliated Hospital of Qingdao University, Qingdao, China.
  • Shicai Fan
    School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, Sichuan, China. shicaifan@uestc.edu.cn.
  • Joshua Chiou
    Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
  • Fulin Chen
    School of Medicine, Northwest University, Xi'an, China.
  • Kelly A Frazer
    Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
  • Kyle J Gaulton
    Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
  • Maike Sander
    Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA.
  • Jussi Taipale
    Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Solna, Sweden. ajt208@cam.ac.uk.
  • Bing Ren
    Ludwig Institute for Cancer Research, La Jolla, CA, 92093, USA.