Advancing breast cancer prediction: Comparative analysis of ML models and deep learning-based multi-model ensembles on original and synthetic datasets.

Journal: PLOS ONE

Published Date: Jun 18, 2025

Abstract

Breast cancer is a significant global health concern with rising incidence and mortality rates. Current diagnostic methods face challenges that call for improved approaches. This study applies a range of machine learning (ML) algorithms, including KNN, SVM, ANN, RF, and XGBoost, together with ensemble models, AutoML, and deep learning (DL) techniques, to breast cancer diagnosis. The objective is to compare the efficiency and accuracy of these models on original and synthetic datasets. The methodology comprises three phases, each with two stages. In the first stage of each phase, stratified K-fold cross-validation was used to train and evaluate multiple ML models. In the second stage, DL-based and AutoML-based ensemble strategies were applied to improve prediction accuracy. The second and third phases used synthetic data generated with methods such as Gaussian Copula and TVAE. The KNN model outperformed the others on the original dataset, while the AutoML approach with H2OXGBoost also showed high accuracy on synthetic data. These findings underscore the effectiveness of traditional ML models and AutoML in predicting breast cancer. The study also demonstrated the potential of synthetic data generation methods to improve prediction performance, which may aid decision-making in the diagnosis and treatment of breast cancer.
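The first-stage evaluation described in the abstract, stratified K-fold cross-validation of a classifier such as KNN, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`stratified_folds`, `knn_predict`, `cross_validate`), the choice of 5 folds and 3 neighbours, and the toy two-cluster data are all assumptions made for illustration, and the code uses only the Python standard library rather than whatever tooling the study used.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds, preserving class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a similar share of it.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Majority vote among the n_neighbors closest training points (Euclidean)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in nearest[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, k=5, n_neighbors=3):
    """Return the per-fold accuracy of KNN under stratified k-fold CV."""
    scores = []
    for fold in stratified_folds(y, k):
        held_out = set(fold)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], n_neighbors) == y[i] for i in fold
        )
        scores.append(correct / len(fold))
    return scores


# Toy, made-up data (NOT the study's dataset): two Gaussian clusters, 2:1 imbalance.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(60)]
X += [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(30)]
y = [0] * 60 + [1] * 30

fold_scores = cross_validate(X, y, k=5)
print([round(s, 3) for s in fold_scores])
```

Stratifying the split keeps the class ratio of each held-out fold close to that of the full dataset, which matters when one diagnosis class is rarer than the other. In the sketch, each of the five folds holds 12 samples of class 0 and 6 of class 1.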

Authors

Kazi Arman Ahmed
Department of Industrial and Production Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh.

Israt Humaira
Department of Biomedical Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh.

Ashiqur Rahman Khan
Department of Industrial and Production Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh.

Md Shamim Hasan
Department of Industrial and Production Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh.

Mukitul Islam
Department of Industrial and Production Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh.

Anik Roy
Department of Industrial and Production Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh.

Mehrab Karim
Department of Industrial and Production Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh.

Mezbah Uddin
Department of Industrial and Production Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh.

Ashique Mohammad
Department of Computer Science and Engineering, University of Dhaka, Bangladesh.

Md Doulotuzzaman Xames
Department of Industrial and Production Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh.

Keywords

Algorithms; Breast Neoplasms; Deep Learning; Female; Humans; Machine Learning; Support Vector Machine

External Resources

PubMed ID: 40531928
