CMImpute: cross-species and tissue imputation of species-level DNA methylation samples across mammalian species.

Journal: Genome Biology
Published Date:

Abstract

The large-scale application of the mammalian methylation array has substantially expanded the availability of DNA methylation data in mammalian species. However, this data captures only a small portion of species-tissue combinations. To address this, we develop CMImpute (Cross-species Methylation Imputation), a method based on a conditional variational autoencoder, to impute DNA methylation representing species-tissue combinations. We demonstrate that CMImpute achieves strong sample-wise correlation between imputed and observed values. Using CMImpute and data from 348 species and 59 tissue types, we impute methylation data for 19,786 new species-tissue combinations. We expect CMImpute will be a useful resource for DNA methylation analyses.
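
The sample-wise correlation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the metric is a Pearson correlation computed per sample across probes, which the abstract does not state. The function names `pearson` and `samplewise_correlation` and the synthetic data are hypothetical illustrations, not part of CMImpute.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def samplewise_correlation(imputed, observed):
    """One correlation per sample (row), computed across probes (columns)."""
    return [pearson(i, o) for i, o in zip(imputed, observed)]


# Synthetic stand-in data (assumption): 3 samples x 500 probes of
# methylation-like values in [0, 1]; "imputed" is observed plus small noise.
rng = random.Random(0)
observed = [[rng.random() for _ in range(500)] for _ in range(3)]
imputed = [
    [min(1.0, max(0.0, v + rng.gauss(0.0, 0.05))) for v in row]
    for row in observed
]

correlations = samplewise_correlation(imputed, observed)
print(correlations)
```

Each value in `correlations` summarizes how closely one imputed sample tracks its observed counterpart across all probes; values near 1 indicate close agreement.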

Authors

  • Emily Maciejewski
    Computer Science Department, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
  • Steve Horvath
    Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA.
  • Jason Ernst
    Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA. jason.ernst@ucla.edu.