Machine learning based predictive model of the risk of Tourette syndrome with SHAP value interpretation: a retrospective observational study.
Journal:
Scientific Reports
Published Date:
May 26, 2025
Abstract
Tourette syndrome (TS) is a relatively prevalent neurological condition, particularly among children, characterized by sudden, involuntary, repetitive movements or vocalizations. Current diagnostic approaches for TS rely primarily on behavioral assessments, which are challenging because symptoms overlap with those of other psychiatric disorders and vary considerably between individuals. A machine learning-based predictive model of TS risk could potentially enhance diagnostic precision and treatment effectiveness. This retrospective observational study was conducted at the Department of Pediatrics, Affiliated Hospital of Jiangnan University, from January 2022 to October 2024. Clinical data were collected, encompassing complete blood counts, liver and kidney function assessments, blood glucose levels, and serum electrolyte analyses. Features were selected using Boruta and multivariable logistic regression analyses, and models were constructed with nine machine learning algorithms. Ten features were selected for model development, and the Gradient Boosting Machine (GBM) algorithm yielded the best-performing model. The study established a GBM-based predictive model of TS risk, and SHapley Additive exPlanations (SHAP) analysis highlighted the key roles of β2-microglobulin and serum 25-hydroxyvitamin D levels in predicting TS risk.
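The SHAP interpretation mentioned in the abstract builds on Shapley values, which split a single prediction into additive per-feature contributions. The following is a minimal sketch of that idea only, not the study's model or data: `toy_risk`, its coefficients, the three unnamed features, and the all-zero baseline are hypothetical and chosen purely for illustration. It enumerates every feature subset exactly, which is feasible only for a handful of features; for tree ensembles such as gradient boosting, SHAP tooling typically uses a faster tree-specific algorithm instead.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample `x` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_risk(features):
    """Hypothetical score with one interaction term (NOT the paper's model)."""
    a, b, c = features
    return 0.5 * a - 0.3 * b + 0.2 * a * c


x = [1.0, 2.0, 1.0]          # hypothetical sample
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_risk, x, baseline)

for name, value in zip(["feature_a", "feature_b", "feature_c"], phi):
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is shared equally between the two features involved. This additivity is what allows per-feature SHAP values to be ranked and compared across a model's inputs.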