HAETAE: A highly accurate and efficient epigenome transformer for tissue-specific histone modification prediction
Journal: bioRxiv
Published Date: Mar 11, 2026
Abstract
Genomic models trained on the four canonical bases often fail to capture cell-type specificity. We introduce HAETAE, which integrates 5-methylcytosine from long-read sequencing into a 5-base framework. By explicitly modeling epigenetic context, HAETAE achieves state-of-the-art accuracy (>0.95) with orders of magnitude fewer parameters, challenging the prevailing scaling-law paradigm. HAETAE also deciphers tissue-specific regulatory logic, which we demonstrate by revealing the distinct, context-dependent functional impact of the TERT promoter mutation across diverse tissues.
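
To make the idea of a 5-base input concrete, the following is a minimal sketch of one way a sequence with methylation calls could be one-hot encoded over five symbols. The abstract does not describe HAETAE's actual input encoding, so everything here is an assumption for illustration: the symbol `M` for 5-methylcytosine, the function name `encode_5base`, the `methylated_positions` argument, and the reduction of methylation to a binary call per cytosine (long-read callers often report a probability instead).

```python
# Hypothetical 5-letter alphabet: A, C, G, T plus M for 5-methylcytosine.
# This is an illustrative assumption, not the encoding used by HAETAE.
ALPHABET = "ACGTM"
INDEX = {base: i for i, base in enumerate(ALPHABET)}


def encode_5base(sequence, methylated_positions):
    """One-hot encode a DNA sequence over five symbols.

    Cytosines at the 0-based indices in methylated_positions are encoded
    as M. Symbols outside the alphabet (e.g. N) become an all-zero row.
    """
    methylated = set(methylated_positions)
    rows = []
    for pos, base in enumerate(sequence.upper()):
        if pos in methylated:
            if base != "C":
                raise ValueError(f"position {pos} is {base!r}, not a cytosine")
            base = "M"
        row = [0] * len(ALPHABET)
        if base in INDEX:
            row[INDEX[base]] = 1
        rows.append(row)
    return rows


# The second cytosine (index 3) is methylated, the first (index 1) is not.
encoded = encode_5base("ACGCT", methylated_positions=[3])
for base, row in zip("ACGCT", encoded):
    print(base, row)
```

In this sketch the two cytosines receive different rows, so a downstream model can tell a methylated site from an unmethylated one at the same sequence position, which a 4-base encoding cannot express.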