Development and validation of a machine learning model for predicting the risk of current depression in physically inactive adults in the United States.
Journal:
Journal of Affective Disorders
Published Date:
Oct 24, 2025
Abstract
Depression is a major global public health concern, and physical inactivity is recognized as a key modifiable risk factor. However, tools for predicting depression risk among physically inactive adults are limited. This study aimed to develop and validate a machine learning model for identifying current depression risk in this population. Data from 6801 physically inactive adults in the National Health and Nutrition Examination Survey (NHANES) 2005-2020 were analyzed. Current depression was defined as a PHQ-9 score > 9 (that is, a total score of 10 or higher). LASSO regression followed by multivariable logistic regression identified seven key predictors: sleep disorders, poverty income ratio, waist circumference, neutrophil-to-lymphocyte ratio, sex, systemic immune-inflammation index, and age. Six machine learning models were constructed and compared: logistic regression, random forest, extreme gradient boosting (XGBoost), support vector machine (SVM), naïve Bayes, and k-nearest neighbors (KNN). The logistic regression model demonstrated the best performance (AUC = 0.769) and validated robustly across three external cohorts (AUC = 0.736-0.794). A clinically applicable nomogram was developed to facilitate risk estimation. This model provides an effective tool for early identification of depression risk in physically inactive adults, supporting targeted prevention and intervention strategies.
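As an illustration of the quantities named in the abstract, the following minimal Python sketch shows how the outcome label, the two blood-count predictors, and the AUC metric could be computed. It is not the authors' code. The function names are hypothetical, the neutrophil-to-lymphocyte ratio and systemic immune-inflammation index use their commonly cited definitions (the abstract does not state the formulas used in the study), and the AUC is computed by direct pairwise comparison rather than by any particular library.

```python
from typing import Sequence


def phq9_depressed(item_scores: Sequence[int], cutoff: int = 9) -> bool:
    """Outcome label as stated in the abstract: PHQ-9 total score > 9.

    The PHQ-9 has nine items, each scored 0-3 (total 0-27).
    """
    if len(item_scores) != 9 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("PHQ-9 needs nine item scores, each in 0-3")
    return sum(item_scores) > cutoff


def nlr(neutrophils: float, lymphocytes: float) -> float:
    """Neutrophil-to-lymphocyte ratio (assumed: neutrophil count / lymphocyte count)."""
    return neutrophils / lymphocytes


def sii(platelets: float, neutrophils: float, lymphocytes: float) -> float:
    """Systemic immune-inflammation index.

    Assumed to follow the common definition:
    platelet count * neutrophil count / lymphocyte count.
    """
    return platelets * neutrophils / lymphocytes


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve.

    Computed as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    print(phq9_depressed([2, 1, 1, 1, 1, 1, 1, 1, 1]))
    print(nlr(4.0, 2.0), sii(250.0, 4.0, 2.0))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise AUC is quadratic in sample size, which is acceptable for a sketch; on a cohort of several thousand participants a rank-based implementation would be the more practical choice.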