Current genomic deep learning models display decreased performance in cell type-specific accessible regions.

Journal: Genome Biology
PMID:

Abstract

BACKGROUND: Several deep learning models have been developed to predict epigenetic features, such as chromatin accessibility, from DNA sequence. Model evaluations commonly report performance genome-wide; however, cis-regulatory elements (CREs), which play critical roles in gene regulation, make up only a small fraction of the genome. Furthermore, cell type-specific CREs account for a large proportion of complex disease heritability.

Authors

  • Pooja Kathail
    Center for Computational Biology, University of California Berkeley, Berkeley, CA, USA.
  • Richard W Shuai
    Department of Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, CA, USA.
  • Ryan Chung
    Department of Radiology, NYU Langone Health, New York, NY, USA.
  • Chun Jimmie Ye
    Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA.
  • Gabriel B Loeb
    Division of Nephrology, Department of Medicine, University of California, San Francisco, CA, USA. gabriel.loeb@ucsf.edu.
  • Nilah M Ioannidis
    Department of Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, CA, USA. nilah@berkeley.edu.