ChromatinHD connects single-cell DNA accessibility and conformation to gene expression through scale-adaptive machine learning.

Journal: Nature Communications
PMID:

Abstract

Gene regulation is inherently multiscale, but scale-adaptive machine learning methods that fully exploit this property in single-nucleus accessibility data are still lacking. Here, we develop ChromatinHD, a pair of scale-adaptive models that use the raw accessibility data, without peak-calling or windows, to link regions to gene expression and determine differentially accessible chromatin. We show that ChromatinHD consistently outperforms existing peak- and window-based approaches and find that this is due to a large number of uniquely captured, functional accessibility changes within and outside of putative cis-regulatory regions. Furthermore, ChromatinHD can delineate collaborating regulatory regions, including their preferential genomic conformations, that drive gene expression. Finally, our models also use changes in ATAC-seq fragment lengths to identify dense binding of transcription factors, a feature not captured by footprinting methods. Altogether, ChromatinHD, available at https://chromatinhd.org, is a suite of computational tools that enables a data-driven understanding of chromatin accessibility at various scales and how it relates to gene expression.

Authors

  • Wouter Saelens
    Laboratory of Systems Biology and Genetics, Institute of Bioengineering and Global Health Institute, School of Life Sciences, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland. wouter.saelens@ugent.be.
  • Olga Pushkarev
    Laboratory of Systems Biology and Genetics, Institute of Bioengineering and Global Health Institute, School of Life Sciences, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland.
  • Bart Deplancke
    Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland.