Deciphering lncRNA-disease associations based on multi-representation fusion and boosting with Gaussian process.

Journal: IEEE Journal of Biomedical and Health Informatics

Abstract

LncRNA-disease association (LDA) identification can provide valuable insights into disease pathogenesis. Most existing deep learning-based LDA prediction models remain limited in how effectively they fuse the various features of lncRNAs and diseases and in how accurately they classify unknown lncRNA-disease pairs (LDPs). Here, we introduce a deep learning-based LDA prediction framework named LDA-RMGPB, based on multi-representation fusion and boosting with Gaussian process. First, a randomized singular value decomposition model is presented to extract LDP linear features. Subsequently, a masked graph autoencoder is exploited to learn LDP nonlinear features. Finally, a boosting algorithm with Gaussian process takes the concatenation of the LDP linear and nonlinear features as input and classifies unlabeled LDPs. To measure the performance of LDA-RMGPB, we performed a series of experiments. Using six evaluation metrics under four different 5-fold cross-validation strategies (i.e., cross-validation on lncRNAs, diseases, LDPs, and independent lncRNAs and independent diseases), LDA-RMGPB substantially outperformed seven state-of-the-art prediction methods on two LDA datasets. Further analyses, including an ablation study, ceRNA theory analysis, lncRNA-related therapeutic drug analysis, and survival analysis, indicated that LDA-RMGPB achieved superior LDA identification ability. Moreover, we predicted that the lncRNAs ATP6V1G2-DDX39B and PSORS1C3 could be closely linked with breast cancer and prostatic neoplasms, respectively. We anticipate that LDA-RMGPB will contribute to the discovery of novel therapeutic molecular targets across diverse diseases. LDA-RMGPB is freely available at https://github.com/plhhnu/LDA-RMGPB.
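
The linear-feature step described above can be illustrated with a minimal sketch. It is not the LDA-RMGPB implementation (see the repository for that). It assumes that a binary lncRNA-disease association matrix is factorized with a generic randomized SVD and that each LDP is represented by concatenating its lncRNA and disease factors. The function names, the toy matrix, the rank, and the square-root scaling of the singular values are all hypothetical choices made for illustration.

```python
import numpy as np

def randomized_svd(A, rank, n_oversamples=5, n_iter=2, seed=0):
    """Approximate truncated SVD via random range finding (generic sketch)."""
    rng = np.random.default_rng(seed)
    k = min(rank + n_oversamples, min(A.shape))
    # Sample the range of A with a Gaussian test matrix.
    Q = np.linalg.qr(A @ rng.standard_normal((A.shape[1], k)))[0]
    # Power iterations sharpen the estimated subspace.
    for _ in range(n_iter):
        Q = np.linalg.qr(A.T @ Q)[0]
        Q = np.linalg.qr(A @ Q)[0]
    # Exact SVD of the small projected matrix, then map back.
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]

def ldp_linear_features(assoc, rank):
    """Build one feature row per (lncRNA, disease) pair (assumed scheme)."""
    U, s, Vt = randomized_svd(assoc, rank)
    lnc = U * np.sqrt(s)      # lncRNA factors, shape (n_lnc, rank)
    dis = Vt.T * np.sqrt(s)   # disease factors, shape (n_dis, rank)
    n_lnc, n_dis = assoc.shape
    # Row i * n_dis + j holds [lnc[i], dis[j]].
    return np.concatenate(
        [np.repeat(lnc, n_dis, axis=0), np.tile(dis, (n_lnc, 1))], axis=1
    )

# Toy association matrix: 6 hypothetical lncRNAs x 4 hypothetical diseases.
assoc = np.array([
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
], dtype=float)

features = ldp_linear_features(assoc, rank=2)
print(features.shape)
```

In the pipeline the abstract describes, features of this kind would be concatenated with the nonlinear features from the masked graph autoencoder before being passed to the boosting classifier. Those two stages are not sketched here.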
