Diff-SE: A Diffusion-Augmented Contrastive Learning Framework for Super-Enhancer Prediction.

Journal: Journal of chemical information and modeling
Published Date:

Abstract

Super-enhancers (SEs) are cis-regulatory elements that play crucial roles in gene expression and are implicated in diseases such as cancer and Alzheimer's disease. Traditional identification methods rely on ChIP-seq experiments, which are costly and time-consuming. While recent computational approaches have leveraged sequence features for SE prediction, they often suffer from severe class imbalance and poor generalization across species. To address these limitations, we propose Diff-SE, a deep learning framework that integrates diffusion-based data augmentation with contrastive learning. The diffusion module models the continuous distribution of SEs to generate biologically meaningful synthetic positive samples, effectively balancing the training data. A contrastive learning strategy is then used to enhance feature representation by maximizing intraclass similarity and interclass separation. Experimental results across eight data sets demonstrate that Diff-SE consistently outperforms the baseline model, achieving 10%-30% improvements in precision (PRE), Matthews correlation coefficient (MCC), and F1-score. Furthermore, Diff-SE exhibits superior generalization in cross-species validation between human and mouse cell lines. The code and data sets are available at https://github.com/15831959673/Diff-SE, enabling further research and applications in SE prediction.
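
For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how PRE, MCC, and F1-score are conventionally computed from binary predictions. It uses the standard textbook definitions and is not taken from the Diff-SE repository; the function name `binary_metrics` and the toy labels are hypothetical, chosen only to illustrate an imbalanced setting in which SEs are the minority (positive) class.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return PRE, F1, and MCC for binary labels (1 = SE, 0 = non-SE).

    Hypothetical helper using the standard definitions; not the authors' code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Precision and recall fall back to 0.0 when their denominators are zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # MCC uses all four confusion-matrix cells, so it stays informative
    # when negatives greatly outnumber positives.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"PRE": precision, "F1": f1, "MCC": mcc}


# Toy example (assumed data): 2 positives among 8 samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

In this toy case accuracy would be 0.75 even though only one of the two positives is recovered, which is why class-imbalanced tasks such as SE prediction are usually reported with precision, F1, and MCC instead.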

Authors

  • Haolu Zhou
    School of Artificial Intelligence, Hebei University of Technology, Tianjin 300400, China.
  • Yu Han
    Department of Neurology, The First Affiliated Hospital, Dalian Medical University, Dalian, China.
  • Yude Bai
    School of Software, Tiangong University, Tianjin 300387, China.
  • Yun Zuo
    Department of Mathematics, Dalian Maritime University, No. 1 Linghai Road, Dalian 116026, China.
  • Wenying He
    School of Computer Science and Technology, Tianjin University, Tianjin, China.
  • Fei Guo
    School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China. Electronic address: gfjy001@yahoo.com.