Hybrid CNN and Multi-Head Attention Model for Analyzing Epigenetic Mechanisms and Gene Expression Across Fungal Phylogenetic Distances
Journal: bioRxiv
Published Date: Apr 30, 2026
Abstract
Understanding gene expression is crucial for optimizing biological processes in bioeconomic applications, human health, and environmental regulation. Epigenetic modifications significantly influence gene expression by altering chromatin structure and DNA accessibility. However, little is known about how well these mechanisms are conserved across species, especially in non-model organisms. This study predicts gene expression levels from epigenetic modifications across fungal species, facilitating knowledge transfer from well-characterized species to less well-understood ones. We developed a deep learning model, MAPLE (Model predictions Across Phylogenetic distances by Learning Expression from Epigenetics), which integrates convolutional layers and multi-head attention to capture dependencies in epigenetic data. MAPLE performs strongly in fungi, achieving up to 80% accuracy and 89% AUROC in intra-species validation and up to 77% accuracy and 83% AUROC in cross-species tasks, outperforming benchmarks. SHAP analysis reveals the key epigenetic features driving gene expression, providing insights for future experimental design. Our findings highlight MAPLE's potential to generalize across fungal species, offering a versatile tool for optimizing gene expression.
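
The abstract reports two metrics, accuracy and AUROC. As a reminder of how they differ, the sketch below computes both for a toy binary task. It assumes genes are labeled 1 (high expression) or 0 (low expression) and that the model outputs a score per gene; the abstract does not state the label definition, and the labels, scores, and 0.5 threshold here are invented for illustration, not taken from MAPLE.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of genes whose thresholded score matches the label.

    The 0.5 threshold is an illustrative assumption.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    return correct / len(labels)


def auroc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking definition.

    AUROC equals the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative
    one, with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = high expression) and model scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.6, 0.1]

print(f"accuracy = {accuracy(labels, scores):.3f}")
print(f"AUROC    = {auroc(labels, scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUROC summarizes ranking quality over all thresholds, which is why the two figures can diverge for the same model.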