DIRECTION: a machine learning framework for predicting and characterizing DNA methylation and hydroxymethylation in mammalian genomes.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: 5-Methylcytosine and 5-hydroxymethylcytosine in DNA are major epigenetic modifications known to significantly alter mammalian gene expression. High-throughput assays to detect these modifications are expensive, labor-intensive, infeasible in some contexts, and leave a portion of the genome unqueried. Hence, we devised a novel, supervised, integrative learning framework to perform whole-genome methylation and hydroxymethylation predictions in CpG dinucleotides. Our framework can also impute missing or low-quality data in existing sequencing datasets. Additionally, we developed infrastructure to perform in silico, high-throughput hypothesis testing on such predicted methylation or hydroxymethylation maps.

Authors

  • Milos Pavlovic
    Department of Biological Sciences, Center for Systems Biology.
  • Pradipta Ray
    Department of Biological Sciences, Center for Systems Biology.
  • Kristina Pavlovic
    Department of Biological Sciences, Center for Systems Biology.
  • Aaron Kotamarti
    Department of Biological Sciences, Center for Systems Biology.
  • Min Chen
    School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China.
  • Michael Q Zhang
    Department of Biological Sciences, Center for Systems Biology.