Exploring graph-based models for predicting active compounds against triple-negative breast cancer.
Journal:
Molecular Diversity
Published Date:
Jul 9, 2025
Abstract
Breast cancer is among the most common and rapidly rising cancers, both in India and around the world. Triple-negative breast cancer (TNBC) is one of the most aggressive subtypes of breast cancer, distinguished by the absence of HER2, progesterone receptor, and estrogen receptor expression. This absence limits treatment options and underscores the urgent need to discover or design new drug candidates for TNBC. Integrating artificial intelligence and machine learning into computational modeling has significantly accelerated the analysis of large-scale biological data and improved the prediction of therapeutic outcomes. In this study, we curated a data set of 756 mutant-type compounds from three cell lines and developed four graph-based models to predict active compounds against TNBC. Validated using stratified nested tenfold cross-validation and optimized with the Optuna framework, the models achieved AUC values of 0.65-0.82, with the message passing neural network (MPNN) model outperforming the other three. Furthermore, key structural fragments associated with cell inhibition and with the model predictions were identified and interpreted using several explainability techniques. Validation on an external set of FDA-approved drugs yielded prediction accuracies ranging from 66% to 97%, highlighting the robustness of the models in identifying compounds with potential inhibitory activity against TNBC cells.
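The stratified nested cross-validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy scorer stands in for the graph-based models, an exhaustive search over a short candidate list stands in for Optuna, and every name here (`stratified_folds`, `roc_auc`, `nested_cv`, `fit_toy`) is hypothetical. The outer loop estimates AUC on held-out folds, while the inner loop selects a hyperparameter using only the outer training data.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, seed=0):
    """Split sample positions into k folds with similar class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal each class round-robin
    return folds

def roc_auc(y_true, scores):
    """AUC: chance that a random active outscores a random inactive."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes in the fold")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def nested_cv(features, labels, candidates, fit, k_outer=10, k_inner=10, seed=0):
    """Return one held-out AUC per outer fold."""
    outer_scores = []
    for test_idx in stratified_folds(labels, k_outer, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        train_labels = [labels[i] for i in train_idx]
        # Inner folds hold positions into train_idx, never test samples.
        inner = stratified_folds(train_labels, k_inner, seed + 1)

        def inner_score(param):
            aucs = []
            for val_pos in inner:
                val_set = set(val_pos)
                fit_idx = [train_idx[p] for p in range(len(train_idx))
                           if p not in val_set]
                val_idx = [train_idx[p] for p in val_pos]
                model = fit([features[i] for i in fit_idx],
                            [labels[i] for i in fit_idx], param)
                aucs.append(roc_auc([labels[i] for i in val_idx],
                                    [model(features[i]) for i in val_idx]))
            return sum(aucs) / len(aucs)

        # Stand-in for Optuna: try every candidate and keep the best.
        best = max(candidates, key=inner_score)
        model = fit([features[i] for i in train_idx], train_labels, best)
        outer_scores.append(roc_auc([labels[i] for i in test_idx],
                                    [model(features[i]) for i in test_idx]))
    return outer_scores

def fit_toy(xs, ys, sign):
    """Hypothetical one-feature scorer; `sign` is the tuned hyperparameter."""
    center = sum(xs) / len(xs)
    return lambda x: sign * (x - center)

# Toy data: 40 samples, alternating inactive (0) and active (1).
labels = [i % 2 for i in range(40)]
features = [labels[i] + 0.001 * i for i in range(40)]
scores = nested_cv(features, labels, candidates=[-1, 1], fit=fit_toy,
                   k_outer=5, k_inner=4)
```

Because the hyperparameter is chosen without ever seeing the outer test fold, the outer AUC values are not inflated by the tuning step, which is the reason for nesting the two loops.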