Enhancing osteoporosis risk prediction using machine learning: A holistic approach integrating biomarkers and clinical data.

Journal: Computers in biology and medicine
Published Date:

Abstract

Osteoporosis (OP) affects approximately 18 % of the global population, with osteoporosis-associated fractures impacting up to 37 million people annually. While dual-energy X-ray absorptiometry (DXA) remains the gold standard for diagnosis, its limitations, including restricted availability and radiation exposure, highlight the need for alternative screening methods. We developed a machine learning model to predict OP risk using routinely collected clinical data, deliberately excluding DXA measurements to ensure broad accessibility. Using data from NHANES cycles 2007-2014, we analyzed 7924 participants aged 50 years and older, identifying 1636 OP cases (20.6 %) and 6288 normal cases (79.4 %) through comprehensive criteria incorporating both WHO densitometric standards (T-scores ≤ -2.5) and anthropometric risk factors. We implemented a stacking ensemble model combining four specialized classifiers (Gradient Boosting, Random Forest, XGBoost, and LightGBM) with a logistic regression meta-classifier. The model achieved 93 % accuracy, an AUC of 0.94, and demonstrated robust performance through cross-validation (mean score: 0.929 ± 0.030). Feature importance analysis revealed age (6.04 %), arm muscle circumference (5.61 %), and body weight (5.30 %) as the most influential predictors, followed by gender (3.28 %), BMI (2.71 %), and calcium intake (2.42 %). Additional significant predictors included folate (2.28 %), height (2.23 %), hand grip strength (2.21 %), and alkaline phosphatase (2.16 %). These biologically plausible relationships align with established clinical knowledge of OP risk factors. The model's strong performance metrics and reliance on readily available clinical data suggest its potential as a practical screening tool, particularly in settings with limited DXA access. All code and implementation details are openly available on GitHub, facilitating integration into existing healthcare systems. This approach offers a promising pathway for enhancing early OP detection and risk assessment across diverse healthcare settings.

Authors

  • Filipe Ricardo Carvalho
    Faculty of Medicine and Biomedical Sciences, University of Algarve, Faro, Portugal; Centre of Marine Sciences (CCMAR/CIMAR LA), University of Algarve, Faro, Portugal; University of Algarve - Campus de Gambelas, Faro 8005-139, Portugal. Electronic address: frcarvalho@ualg.pt.
  • Paulo Jorge Gavaia
    Faculty of Medicine and Biomedical Sciences, University of Algarve, Faro, Portugal; Centre of Marine Sciences (CCMAR/CIMAR LA), University of Algarve, Faro, Portugal; University of Algarve - Campus de Gambelas, Faro 8005-139, Portugal.