Comparative Analysis of Feature Extraction Methods and Machine Learning Models for Predicting Osteoporosis Prevalence.

Journal: Journal of Medical Systems

Published Date: May 29, 2025

Abstract

This study systematically examined the impact of three feature selection techniques (Boruta, extreme gradient boosting (XGBoost), and Lasso) on four machine learning models (random forest (RF), XGBoost, logistic regression (LR), and support vector machine (SVM)) for predicting osteoporosis prevalence. Varying the data partitioning ratio (training:test sets of 0.6:0.4, 0.7:0.3, 0.8:0.2, and 0.9:0.1) had minimal impact on prediction accuracy across all four models, a conclusion reinforced by 10-fold cross-validation. In addition, principal component analysis (PCA) led to substantial accuracy degradation (accuracy in the 0.6-0.8 range), suggesting that it is incompatible with this study's requirements because of the inherently complex decision boundaries in the original high-dimensional data. Comparative analysis demonstrated that the Boruta-XGBoost combination achieved superior performance (accuracy: 0.9083 ± 0.0146), significantly outperforming the Lasso-LR combination (0.7480 ± 0.0157) across all evaluation frameworks. Regarding model evaluation metrics, the RF model exhibited enhanced discriminative capacity, with area under the receiver operating characteristic curve (AUROC) values of 0.85, 0.81, and 0.80 under the different feature selection approaches, surpassing the SVM model (0.78, 0.76, and 0.76). This advantage likely stems from RF's native capability to capture non-linear relationships and feature interactions.
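The two kinds of figures reported above, an accuracy written as "mean ± spread" and an AUROC, can be illustrated with a minimal sketch. This is not the authors' code: the function names `auroc` and `summarize` and all toy numbers are hypothetical, and the sketch assumes that the "±" value is a sample standard deviation over repeated evaluations, which the abstract does not state.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def summarize(fold_accuracies):
    """Return (mean, sample standard deviation) of per-fold accuracies.

    Assumption: the "±" in the abstract denotes a standard deviation.
    """
    return mean(fold_accuracies), stdev(fold_accuracies)


# Hypothetical toy data, not taken from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)

toy_fold_accuracies = [0.90, 0.92, 0.88, 0.91, 0.89]
toy_mean, toy_sd = summarize(toy_fold_accuracies)
print(f"AUROC = {toy_auroc:.2f}, accuracy = {toy_mean:.4f} ± {toy_sd:.4f}")
```

The pairwise definition used here is quadratic in the number of samples, which is fine for illustration; a rank-based implementation would be preferable for larger datasets.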

Authors

Danni Zhang
Department of Functional, Shaoxing Hospital of Traditional Chinese Medicine, Shaoxing, 312000, Zhejiang, China.

Xingyu Yang
School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, United States of America.

Fangying Wang
Department of Functional, Shaoxing Hospital of Traditional Chinese Medicine, Shaoxing, 312000, Zhejiang, China.

Cifang Qiu
Department of Functional, Shaoxing Hospital of Traditional Chinese Medicine, Shaoxing, 312000, Zhejiang, China. 3176894512@qq.com.

Yanfu Chai
School of Mechanical and Electrical Engineering, Shaoxing University, Shaoxing, 312000, China. 1073330140@qq.com.

Danruo Fang
Key Laboratory of Respiratory Disease of Zhejiang Province, Department of Respiratory and Critical Care Medicine, Second Affiliated Hospital of Zhejiang University School of Medicine, 88# Jiefang Road, Hangzhou, 310009, Zhejiang, China. fangdanruo123@163.com.

Keywords

Bone Density; Female; Humans; Logistic Models; Machine Learning; Osteoporosis; Prevalence; Principal Component Analysis; ROC Curve; Support Vector Machine

External Resources

PubMed ID (PMID): 40439990
