A multi-scale neighbor topology guided transformer and Kolmogorov-Arnold network enhanced feature learning model for disease-related circRNA prediction.

Journal: IEEE Journal of Biomedical and Health Informatics
Published Date:

Abstract

Because circular non-coding RNAs (circRNAs) are closely associated with various human diseases, identifying disease-related circRNAs can deepen our understanding of the mechanisms underlying disease pathogenesis. Advanced circRNA-disease association prediction methods mainly rely on graph learning techniques such as graph convolutional networks and graph attention networks. However, these methods do not fully encode the multi-scale neighbor topologies of each node or the dependencies among pairwise attributes. We propose MKCD, a multi-scale neighbor topology-guided transformer with Kolmogorov-Arnold network (KAN) enhanced feature learning for circRNA-disease association prediction. The model integrates multi-scale neighbor topology, complex relationships among multiple nodes, and the global and local dependencies of pairwise attributes. First, MKCD incorporates an adaptive multi-scale neighbor topology embedding construction strategy (AMNE), which generates neighbor topologies covering varying scopes of neighbors by performing random walks on a circRNA-disease-miRNA heterogeneous graph. Second, we design a dynamic multi-scale neighbor topology-guided transformer (DMTT) that uses the multi-scale neighbor topologies to guide the learning of relationships among circRNA, miRNA, and disease nodes. The multi-scale neighbor topology evolves dynamically, providing adaptive guidance to the transformer's learning process. Third, we establish a feature-gated network (FGN) to evaluate the importance of the topological features derived from DMTT and of the original node attributes. Finally, we propose an adaptive joint convolutional neural network and KAN learning strategy (ACK) to learn the global and local dependencies of circRNA-disease node pair features. Comprehensive comparison experiments show that MKCD outperforms six state-of-the-art methods, improving AUC and AUPR by at least 14.1% and 7.6%, respectively. Ablation experiments confirm the effectiveness of the AMNE, DMTT, FGN, and ACK components. Case studies on three diseases further demonstrate the value of our method in discovering reliable circRNA candidates for the diseases of interest. The source code and datasets are freely available at https://github.com/pingxuan-hlju/MKCD.
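
The abstract does not give the formulas behind AMNE, so the following is only a minimal sketch of the general idea of deriving neighbor topologies at several scales from random walks. It assumes that the scale-k topology can be approximated by the k-step transition probabilities of a simple random walk; the adaptive part of AMNE is not modeled. The function names and the toy four-node graph are hypothetical and are not taken from the MKCD repository.

```python
# Illustrative sketch only; NOT the MKCD implementation.
# Assumption: "scale k" topology = k-step random-walk transition matrix P^k.

def row_normalize(adj):
    """Turn an adjacency matrix into a random-walk transition matrix."""
    out = []
    for row in adj:
        total = sum(row)
        # An isolated node keeps an all-zero row (the walker cannot move).
        out.append([v / total if total else 0.0 for v in row])
    return out


def matmul(a, b):
    """Plain-Python matrix product, adequate for a toy graph."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def multi_scale_topologies(adj, max_steps):
    """Return [P^1, ..., P^max_steps].

    P^k[i][j] is the probability that a k-step walk starting at node i
    ends at node j, so larger k covers a wider scope of neighbors.
    """
    p = row_normalize(adj)
    scales = [p]
    for _ in range(max_steps - 1):
        scales.append(matmul(scales[-1], p))
    return scales


# Hypothetical toy heterogeneous graph:
# nodes 0 and 1 are circRNAs, node 2 is a miRNA, node 3 is a disease.
# Edges: 0-2, 1-2, 1-3, 2-3 (undirected).
adj = [
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
]
scales = multi_scale_topologies(adj, max_steps=3)
```

In this toy graph, circRNA 0 has no direct edge to disease 3, so the one-step topology assigns that pair zero weight, while the two-step topology connects them through the shared miRNA. This is the sense in which wider scales expose indirect circRNA-disease relationships.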

Authors

  • Ping Xuan
    School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China.
  • Haoyuan Li
    School of Computer Science, Yangtze University, Jingzhou, Hubei Province, China.
  • Hui Cui
    Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, 1278 Keyuan Road, Shanghai 201203, PR China; School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, PR China.
  • Zelong Xu
    School of Bioinformatics Sciences and Technology, Harbin Medical University, Harbin 150081, PR China.
  • Toshiya Nakaguchi
  • Tiangang Zhang
    School of Mathematical Science, Heilongjiang University, Harbin 150080, China. zhang@hlju.edu.cn.
