New hybrid features extracted from US images for breast cancer classification.
Journal:
Scientific Reports
Published Date:
Jul 16, 2025
Abstract
Artificial intelligence (AI) and image processing play a vital role in classifying breast cancer (BC) lesions as benign or malignant. The novelty of this paper lies in computing original hybrid features (HF) from textural and shape features of BC, integrated through a polynomial regression, and in classifying them with two different Automated Machine Learning (AutoML) frameworks. Because the dataset is original, it was first explored with violin plots. To compute the hybrid features, the Haralick textural features and the Hu moments were combined by polynomial regression. Two AutoML frameworks, PyCaret and TPOT (Tree-based Pipeline Optimization Tool), were used, and the optimal model for the hybrid features was identified during tuning. The experimental results indicated that, for the HF composed of entropy and Hu moments, PyCaret selected the AdaBoost Classifier (ADB) as the optimal classifier, achieving an accuracy of 91.4%. TPOT selected a Multilayer Perceptron Classifier, which achieved an accuracy of 90.6%. These findings identified the most effective features for classifying BC lesions as benign or malignant. Because enhancing computational efficiency can reduce the risk of overfitting, the bagging, boosting, and stacking Ensemble Machine Learning (EML) techniques were used to validate the obtained results. The study's originality lies in the HF's capacity to represent and capture the lesion's texture and shape accurately, much as a physician does when diagnosing BC.
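The abstract does not give the exact form of the hybrid features, so the following is only a minimal sketch of one plausible reading: a texture descriptor (entropy of a gray-level co-occurrence matrix, one of the Haralick features) and a shape descriptor (the first Hu moment) are computed for a patch and then expanded into degree-2 polynomial terms. The function names, the toy 4x4 patch, the horizontal distance-1 co-occurrence offset, and the choice of degree 2 are all assumptions made for illustration, not the authors' implementation.

```python
import math


def glcm_entropy(img, levels):
    """Entropy (in bits) of a horizontal, distance-1 co-occurrence matrix."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in img:
        # Count each pair (pixel, right-hand neighbour).
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            total += 1
    entropy = 0.0
    for count_row in counts:
        for c in count_row:
            if c:
                p = c / total
                entropy -= p * math.log2(p)
    return entropy


def hu_first_moment(img):
    """First Hu invariant, eta20 + eta02, from normalized central moments."""
    m00 = sum(sum(row) for row in img)
    xbar = sum(x * v for row in img for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(img) for v in row) / m00
    mu20 = sum((x - xbar) ** 2 * v for row in img for x, v in enumerate(row))
    mu02 = sum((y - ybar) ** 2 * v for y, row in enumerate(img) for v in row)
    # For second-order moments the normalization is mu_pq / m00**2.
    return (mu20 + mu02) / m00 ** 2


def polynomial_terms(e, h):
    """Degree-2 polynomial expansion of two descriptors (assumed form)."""
    return [1.0, e, h, e * e, e * h, h * h]


# Hypothetical 4x4 patch quantized to 4 gray levels.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

entropy = glcm_entropy(patch, levels=4)
hu1 = hu_first_moment(patch)
hybrid = polynomial_terms(entropy, hu1)
print(hybrid)
```

In a real pipeline the patch would be a segmented ultrasound lesion, all Haralick features and all seven Hu moments would typically be computed with an image-processing library, and the resulting vectors would be passed to the classifier search.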