Deep learning-based computational approach for predicting ncRNAs-disease associations in metaplastic breast cancer diagnosis.
Journal: BMC Cancer
PMID: 40329245
Abstract
Non-coding RNAs (ncRNAs) play a crucial role in breast cancer progression, necessitating advanced computational approaches for precise disease classification. This study introduces a Deep Reinforcement Learning (DRL)-based framework for predicting ncRNA-disease associations in metaplastic breast cancer (MBC) using a multi-dimensional descriptor system (ncRNADS) integrating 550 sequence-based features and 1,150 target gene descriptors (miRDB score ≥ 90). The model achieved 96.20% accuracy, 96.48% precision, 96.10% recall, and a 96.29% F1-score, outperforming traditional classifiers such as support vector machines (SVM) and neural networks. Feature selection and optimization reduced dimensionality by 42.5% (4,430 to 2,545 features) while maintaining high accuracy, demonstrating computational efficiency. External validation confirmed model specificity to breast cancer subtypes (87-96.5% accuracy) and minimal cross-reactivity with unrelated diseases such as Alzheimer's disease (8-9% accuracy), supporting robustness. SHAP analysis identified key sequence motifs (e.g., "UUG") and structural free energy (ΔG = -12.3 kcal/mol) as critical predictors, validated by PCA (82% variance) and t-SNE clustering. Survival analysis using TCGA data revealed prognostic significance for MALAT1, HOTAIR, and NEAT1 (associated with poor survival, HR = 1.76-2.71) and GAS5 (protective effect, HR = 0.60). The DRL model demonstrated rapid training (0.08 s/epoch) and cloud deployment compatibility, underscoring its scalability for large-scale applications. These findings establish ncRNA-driven classification as a cornerstone for precision oncology, enabling patient stratification, survival prediction, and therapeutic target identification in MBC.
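As a quick consistency check on two of the figures reported above, the following minimal sketch recomputes the F1-score from the stated precision and recall, and the dimensionality reduction from the stated feature counts. The function names are illustrative assumptions and are not taken from the study's code; only the numeric inputs come from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


def reduction_percent(before: int, after: int) -> float:
    """Percentage of features removed when going from `before` to `after`."""
    return 100 * (before - after) / before


# Inputs taken from the abstract (percentages and feature counts).
f1 = f1_score(96.48, 96.10)
reduction = reduction_percent(4430, 2545)

print(f"F1 = {f1:.2f}%")
print(f"Dimensionality reduction = {reduction:.2f}%")
```

Under these inputs the recomputed F1-score rounds to the reported 96.29%, and the reduction comes out at roughly 42.55%, in line with the reported 42.5%.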