Machine learning analysis of survival outcomes in breast cancer patients treated with chemotherapy, hormone therapy, surgery, and radiotherapy.

Journal: Scientific Reports
Published Date:

Abstract

Breast cancer remains a leading cause of death among women worldwide. Predicting survival outcomes based on treatment modality (chemotherapy, hormone therapy, surgery, and radiation therapy) is an essential step towards personalized treatment planning. Machine learning (ML) models may improve these predictions by capturing intricate relationships between clinical variables and survival. This study investigates the performance of several ML models in predicting survival in patients undergoing these treatments, using multiple clinical parameters. The dataset consisted of 5000 samples and was downloaded from Kaggle. The models assessed included Support Vector Machines (SVM), K-Nearest Neighbors (KNN), AdaBoost, Gradient Boosting, Random Forest, Gaussian Naive Bayes, Logistic Regression, Extreme Gradient Boosting (XGBoost), and Decision Tree. Model performance was assessed using Accuracy, Precision, Recall, F1-Score, and Area Under the Receiver Operating Characteristic Curve (AUC-ROC). SHAP (SHapley Additive exPlanations) analysis was performed to explain the impact of each feature on model predictions, using Waterfall and Beeswarm plots. The expected baseline values (E(f(x))) were compared with the predictions (f(x)) for each therapy group. Gradient Boosting performed best, with Accuracy: 0.972, Precision: 0.973, Recall: 0.972, F1-Score: 0.973, and AUC-ROC: 0.997. Chemotherapy had a notably negative impact on survival, with an f(x) of -0.274 and an E(f(x)) of -0.025. Hormone therapy showed the most detrimental effect on survival, with an f(x) of -0.408. Surgery had a relatively neutral impact (f(x) = -0.041), while radiation therapy positively impacted survival outcomes, with an f(x) of 0.22.
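
The relationship between the baseline E(f(x)) and the prediction f(x) that a Waterfall plot displays can be illustrated with a minimal sketch. It computes exact Shapley values for a toy model in plain Python. The model, the background rows, and the instance below are hypothetical and chosen only for illustration; they are not the study's data or model, and the study presumably relied on a SHAP library rather than this brute-force enumeration. The point is the additivity property: the baseline plus the per-feature contributions equals the prediction.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one instance x (feasible only for few features)."""
    n = len(x)

    def value(subset):
        # Average prediction when features in `subset` take their value from x
        # and all other features take their values from each background row.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))

    baseline = value(set())  # plays the role of E(f(x))
    return baseline, phi


# Hypothetical toy model with one interaction term (not the study's model).
def toy_model(v):
    return 0.5 * v[0] - 0.25 * v[1] + 0.25 * v[0] * v[2]


# Hypothetical background rows and instance to explain.
background = [[0, 0, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]]
x = [1, 1, 1]

baseline, phi = shapley_values(toy_model, x, background)
prediction = toy_model(x)

print("E(f(x)) =", baseline)
print("contributions =", phi)
print("f(x) =", prediction)
```

A Waterfall plot draws exactly this decomposition for one instance: it starts at the baseline and stacks the contributions until it reaches f(x). A Beeswarm plot shows the same contributions for many instances at once.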
Gradient Boosting was the most predictive algorithm for breast cancer survival outcomes. This SHAP-based analysis provides a comprehensive understanding of how different treatments affect survival predictions in breast cancer patients. Radiation therapy shows the most positive effect on survival, while hormone therapy shows the most negative effect. Future studies should explore personalized treatment strategies that incorporate these insights to improve patient outcomes.
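
The model ranking above rests on Accuracy, Precision, Recall, F1-Score, and AUC-ROC. The following is a minimal sketch of how these metrics are defined, assuming a binary survival label (1 = positive class) and a 0.5 decision threshold. Both are assumptions, since the abstract does not state the label encoding or the averaging scheme used. The labels and scores are made-up toy values, not study data.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as one half. This pairwise form is O(P * N) and is meant for
    illustration, not for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, Precision, Recall, and F1 depend on the chosen threshold, whereas AUC-ROC depends only on how the scores rank the samples. This is why the two kinds of metric are reported separately.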

Authors

  • Eyachew Misganew Tegaw
    Department of Physics, College of Natural and Computational Sciences, Debre Tabor University, Debre Tabor, Ethiopia. eyachew2003@gmail.com.
  • Betelhem Bizuneh Asfaw
    Department of Health System Management and Health Economics, School of Public Health, College of Medicine and Health Sciences, Bahir Dar University, Bahir Dar, Ethiopia.