Identification of active transcriptional regulatory elements from GRO-seq data.

Journal: Nature Methods
PMID:

Abstract

Modifications to the global run-on and sequencing (GRO-seq) protocol that enrich for 5'-capped RNAs can be used to reveal active transcriptional regulatory elements (TREs) with high accuracy. Here, we introduce discriminative regulatory-element detection from GRO-seq (dREG), a sensitive machine learning method that uses support vector regression to identify active TREs from GRO-seq data without requiring cap-based enrichment (https://github.com/Danko-Lab/dREG/). This approach allows TREs to be assayed together with gene expression levels and other transcriptional features in a single experiment. Predicted TREs are more enriched for several marks of transcriptional activation—including expression quantitative trait loci, disease-associated polymorphisms, acetylated histone 3 lysine 27 (H3K27ac) and transcription factor binding—than those identified by alternative functional assays. Using dREG, we surveyed TREs in eight human cell types and provide new insights into global patterns of TRE function.

Authors

  • Charles G Danko
    [1] Baker Institute for Animal Health, Cornell University, Ithaca, New York, USA. [2] Department of Biomedical Sciences, Cornell University, Ithaca, New York, USA. [3] Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA.
  • Stephanie L Hyland
    Tri-Institutional Training Program in Computational Biology and Medicine, New York, New York, USA.
  • Leighton J Core
    Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA.
  • Andre L Martins
    Graduate Field in Computational Biology, Cornell University, Ithaca, New York, USA.
  • Colin T Waters
    Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA.
  • Hyung Won Lee
    Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA.
  • Vivian G Cheung
    [1] Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, USA. [2] Howard Hughes Medical Institute, Chevy Chase, Maryland, USA.
  • W Lee Kraus
    [1] Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, USA. [2] Division of Basic Research, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, Texas, USA.
  • John T Lis
    Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, USA.
  • Adam Siepel
    Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA.