A multi-model machine learning framework for breast cancer risk stratification using clinical and imaging data.
Journal: Journal of X-ray science and technology
Published Date: Jan 27, 2025
Abstract
Purpose: This study presents a comprehensive machine learning framework for assessing breast cancer malignancy by integrating clinical features with imaging features derived from deep learning.
Methods: The dataset included 1668 patients with documented breast lesions, incorporating clinical data (e.g., age, BI-RADS category, lesion size, margins, and calcifications) alongside mammographic images processed using four CNN architectures: EfficientNet, ResNet, DenseNet, and InceptionNet. Three predictive configurations were developed: an imaging-only model, a hybrid model combining imaging and clinical data, and a stacking-based ensemble model that aggregates both data types to enhance predictive accuracy. Twelve feature selection techniques, including ReliefF and Fisher Score, were applied to identify key predictive features. Model performance was evaluated using accuracy and AUC, with 5-fold cross-validation and hyperparameter tuning to ensure robustness.
Results: The imaging-only models demonstrated strong predictive performance, with EfficientNet achieving an AUC of 0.76. The hybrid model combining imaging and clinical data reached the highest accuracy of 83% and an AUC of 0.87, underscoring the benefits of data integration. The stacking-based ensemble model further optimized accuracy, reaching a peak AUC of 0.94, demonstrating its potential as a reliable tool for malignancy risk assessment.
Conclusion: This study highlights the importance of integrating clinical and deep imaging features for breast cancer risk stratification, with the stacking-based model.
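The evaluation protocol described in the Methods (accuracy and AUC under 5-fold cross-validation) can be illustrated with a minimal, standard-library Python sketch. The abstract does not say how AUC was computed, whether folds were shuffled or stratified, or which decision threshold was used, so the function names (`auc_score`, `accuracy`, `k_fold_indices`), the 0.5 threshold, and the seeded shuffle below are all assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import List, Sequence, Tuple


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at an assumed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


def k_fold_indices(n_samples: int, k: int = 5,
                   seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices into k (train, test) pairs of near-equal size.

    Indices are shuffled with a fixed seed (an assumption; the abstract does
    not describe the splitting procedure). Each sample is tested exactly once.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = []
    start = 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds
```

In use, a model would be fitted on each `train` split and scored on the matching `test` split, with `auc_score` and `accuracy` averaged over the five folds. For a cohort of 1668 patients, `k_fold_indices(1668, 5)` gives test folds of 333 or 334 cases each.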