An explainable non-invasive hybrid machine learning framework for accurate prediction of thyroid-stimulating hormone levels.
Journal: Computers in Biology and Medicine
PMID: 40058078
Abstract
Machine learning (ML) models are increasingly used in healthcare for biomarker prediction, including the prediction of thyroid biomarkers. These models offer the potential to enhance disease diagnosis through data-driven approaches that rely on non-invasive techniques. However, no studies have explored fully non-invasive methods for predicting thyroid-stimulating hormone (TSH) levels. This study therefore introduces a novel, fully non-invasive framework for predicting TSH levels. Seven ML models were evaluated, and the best-performing models were integrated into a hybrid model that balances performance, complexity, and interpretability. A dataset of 6190 instances from Jordan was used for model development. Four dimensions of non-invasive factors were incorporated into the model: demographics, symptoms, family history, and newly engineered symptom scores. The hybrid model achieved an R² of 94.2% and an RMSE of 0.015, demonstrating superior predictive performance. Model interpretability was assessed using LIME and SHAP explainers, which confirmed the critical role of aggregated symptom scores in enhancing prediction accuracy. A robust feature selection technique was implemented, reducing model complexity and enhancing performance. The top ten features for predicting TSH levels included the hypothyroidism and hyperthyroidism symptom scores, family history, cold intolerance, itchy-dry skin, sweating, hand tremors, and palpitations. The model can be employed to develop cost-effective diagnostic tools for thyroid disorders. It also offers a robust framework that can be generalized to predict other biomarkers and applied in diverse contexts.
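As a rough illustration of the ideas named in the abstract, the sketch below shows one way an aggregated symptom score could be computed, how two base models' predictions might be blended, and how goodness-of-fit metrics such as RMSE and the coefficient of determination (R²) are calculated. Everything here is an assumption made for illustration: the symptom groupings, the field names, the count-based scoring rule, and the weighted-average blending in `hybrid_predict` are hypothetical, since the abstract does not describe how the scores or the hybrid model were actually constructed.

```python
import math

# Hypothetical symptom groupings. The abstract names these symptoms among the
# top features but does not say which score each one contributes to.
HYPO_SYMPTOMS = ["cold_intolerance", "itchy_dry_skin"]
HYPER_SYMPTOMS = ["sweating", "hand_tremors", "palpitations"]


def symptom_score(record, symptoms):
    """Assumed scoring rule: count the listed symptoms marked present (1)."""
    return sum(int(record.get(name, 0)) for name in symptoms)


def hybrid_predict(pred_a, pred_b, weight_a=0.5):
    """Assumed blending rule: weighted average of two base models' outputs."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative record only; not drawn from the study's dataset.
    patient = {"cold_intolerance": 1, "sweating": 1, "palpitations": 1}
    print("hypothyroidism score:", symptom_score(patient, HYPO_SYMPTOMS))
    print("hyperthyroidism score:", symptom_score(patient, HYPER_SYMPTOMS))
```

In practice the base predictions passed to `hybrid_predict` would come from trained regressors, and the metrics would be computed on held-out data; the abstract does not state which evaluation split was used.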