Graph embedding and unsupervised learning predict genomic sub-compartments from HiC chromatin interaction data.

Journal: Nature communications
Published Date:

Abstract

Chromatin interaction studies can reveal how the genome is organized into spatially confined sub-compartments in the nucleus. However, accurately identifying sub-compartments from chromatin interaction data remains a challenge in computational biology. Here, we present Sub-Compartment Identifier (SCI), an algorithm that uses graph embedding followed by unsupervised learning to predict sub-compartments from Hi-C chromatin interaction data. We find that the network topological centrality and clustering performance of SCI sub-compartment predictions are superior to those of hidden Markov model (HMM) sub-compartment predictions. Using orthogonal Chromatin Interaction Analysis by in-situ Paired-End Tag Sequencing (ChIA-PET) data, we further confirm that SCI sub-compartment prediction outperforms HMM. We show that SCI-predicted sub-compartments have distinct epigenetic marks, transcriptional activities, and transcription factor enrichment. In addition, we present a deep neural network to predict sub-compartments using epigenome, replication timing, and sequence data. Our neural network produces more accurate sub-compartment predictions when SCI-determined sub-compartments are used as labels for training.
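
The general pattern named in the abstract, graph embedding followed by unsupervised learning, can be illustrated with a minimal sketch. This is not the SCI implementation: the abstract does not state which embedding or clustering method SCI uses, so spectral embedding and k-means are assumed here as stand-ins. The contact matrix is synthetic toy data, and the names `spectral_embedding` and `kmeans` are hypothetical helpers written for this example.

```python
import numpy as np

def spectral_embedding(contacts, dim):
    """Embed each genomic bin (graph node) with the leading eigenvectors
    of the symmetrically normalized contact matrix D^-1/2 A D^-1/2."""
    degree = contacts.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    norm_adj = contacts * inv_sqrt[:, None] * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the last `dim` vectors.
    _, eigvecs = np.linalg.eigh(norm_adj)
    top = eigvecs[:, -dim:]
    # Row-normalize so clustering depends on direction, not magnitude.
    return top / np.linalg.norm(top, axis=1, keepdims=True)

def kmeans(points, k, n_iter=20):
    """Plain k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    for _ in range(k - 1):
        # Squared distance from every point to its nearest chosen center.
        nearest = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[nearest.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Synthetic "Hi-C-like" contact matrix (assumption: 30 bins in 3 planted groups,
# with strong within-group and weak between-group contact counts).
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], 10)
contacts = np.where(groups[:, None] == groups[None, :], 10.0, 1.0)
noise = rng.uniform(0.0, 0.5, size=contacts.shape)
contacts += (noise + noise.T) / 2  # keep the matrix symmetric

embedding = spectral_embedding(contacts, dim=3)
labels = kmeans(embedding, k=3)
print(labels)
```

On this toy matrix the clustering recovers the three planted groups; real Hi-C data would first need normalization and a principled choice of the number of sub-compartments, neither of which is covered here.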

Authors

  • Haitham Ashoor
    King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Saudi Arabia.
  • Xiaowen Chen
    The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
  • Wojciech Rosikiewicz
    The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
  • Jiahui Wang
    School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing 102488, China.
  • Albert Cheng
    The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
  • Ping Wang
    School of Chemistry and Chemical Engineering, Shandong University of Technology, 255049, Zibo, PR China. Electronic address: wangping876@163.com.
  • Yijun Ruan
    The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
  • Sheng Li
    School of Data Science, University of Virginia, Charlottesville, VA, United States.