Interpretable predictive models of genome-wide aryl hydrocarbon receptor-DNA binding reveal tissue-specific binding determinants.

Journal: Toxicological Sciences: an official journal of the Society of Toxicology
PMID:

Abstract

The aryl hydrocarbon receptor (AhR) is an inducible transcription factor whose ligands include the potent environmental contaminant 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). Ligand-activated AhR binds to DNA at dioxin response elements (DREs) containing the core motif 5'-GCGTG-3'. However, AhR binding is highly tissue specific. Most DREs in accessible chromatin are not bound by TCDD-activated AhR, and DREs accessible in multiple tissues can be bound in some and unbound in others. In this respect, AhR functions similarly to many nuclear receptors. Because AhR possesses a strong core motif, it is well suited to a motif-centered analysis of its binding. We developed interpretable machine learning models predicting the AhR binding status of DREs in MCF-7, GM17212, and HepG2 cells, as well as primary human hepatocytes. Cross-tissue models predicting transcription factor (TF)-DNA binding generally perform poorly, but the reasons for this low performance remain unexplored. By interpreting the results of individual within-tissue models and by examining the features leading to low cross-tissue performance, we identified sequence and chromatin context patterns correlated with AhR binding. We conclude that AhR binding is driven by a complex interplay of tissue-agnostic DRE flanking DNA sequence and tissue-specific local chromatin context. Additionally, we demonstrate that interpretable machine learning models can provide novel and experimentally testable mechanistic insights into DNA binding by inducible TFs.
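
As an illustration of what a motif-centered analysis starts from, the sketch below locates every occurrence of the DRE core motif 5'-GCGTG-3' in a DNA sequence, on either strand. It is a minimal example and not the authors' pipeline: the function names `reverse_complement` and `find_dre_cores`, the 0-based coordinates, and the exact-match rule are all assumptions made for this illustration.

```python
# Illustrative sketch only (not the authors' method): exact-match scan for the
# DRE core motif 5'-GCGTG-3' on both strands of a DNA sequence.

CORE = "GCGTG"  # core motif as stated in the abstract
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_dre_cores(seq: str, core: str = CORE):
    """Return a list of (start, strand) tuples for every core motif match.

    `start` is the 0-based position of the match on the forward strand
    (an assumed convention). A "-" strand hit means the forward strand
    carries the reverse complement of the core (CACGC for GCGTG).
    """
    seq = seq.upper()
    rc_core = reverse_complement(core)
    hits = []
    for i in range(len(seq) - len(core) + 1):
        window = seq[i:i + len(core)]
        if window == core:
            hits.append((i, "+"))
        if window == rc_core:
            hits.append((i, "-"))
    return hits
```

For example, `find_dre_cores("TTGCGTGAA")` reports one forward-strand hit at position 2, and `find_dre_cores("AACACGCTT")` reports one reverse-strand hit at position 2. In a setting like the one described above, the flanking sequence and local chromatin context around each hit would then be the candidate features for predicting binding status.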

Authors

  • David Filipovic
    Department of Biomedical Engineering, Michigan State University, East Lansing, Michigan 48824, USA.
  • Wenjie Qi
    Department of Biomedical Engineering, Michigan State University, East Lansing, Michigan 48824, USA.
  • Omar Kana
    Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803, USA.
  • Daniel Marri
    Department of Biomedical Engineering, Michigan State University, East Lansing, Michigan 48824, USA.
  • Edward L LeCluyse
    LifeSciences Division, LifeNet Health, Research Triangle Park, North Carolina 27709, USA.
  • Melvin E Andersen
    ScitoVation LLC, Durham, North Carolina 27713, USA.
  • Suresh Cuddapah
    Division of Environmental Medicine, Department of Medicine, New York University School of Medicine, New York, New York 10010, USA.
  • Sudin Bhattacharya
    Department of Biomedical Engineering, Michigan State University, East Lansing, Michigan 48824, USA.