Modeling the number of new cases of childhood type 1 diabetes using Poisson regression and machine learning methods; a case study in Saudi Arabia.
Journal:
PloS one
PMID:
40279367
Abstract
Diabetes mellitus is one of the most prevalent chronic conditions affecting pediatric populations, and the escalating global incidence of childhood type 1 diabetes (T1D) is a matter of increasing concern. An effective model that leverages Key Performance Indicators (KPIs) to understand the incidence of T1D in children would help medical practitioners devise targeted monitoring strategies. This study models the monthly number of new T1D cases and the associated KPIs among children aged 0 to 14 in Saudi Arabia. De-identified data (n=377) were collected from diagnoses made between 2010 and 2020 at pediatric diabetes centers in three cities across Saudi Arabia. Poisson regression (PR) and several machine learning (ML) techniques, namely random forest (RF), support vector machine (SVM), and K-nearest neighbor (KNN), were used to model the monthly number of new T1D cases from the local data. Model performance was assessed using both the number of KPIs and metrics such as the coefficient of determination (R²), root mean squared error (RMSE), and mean absolute error (MAE). Among the Poisson and ML models, two reduced models emerged as the best: one considering birth weight over 3.5 kg, maternal age over 25 years at the child's birth, family history of T1D, and nutrition history (specifically, early introduction to cow milk), and one considering birth weight over 3.5 kg, maternal age over 25 years at the child's birth, and nutrition history (early introduction to cow milk). They achieved R² values of 0.89 and 0.88, RMSE values of 0.82 and 0.95, and MAE values of 0.62 and 0.67, respectively. Additionally, models with fewer KPIs, such as the model that considers only maternal age over 25 years and early introduction to cow milk, achieved consistently high R² values, ranging from 0.80 to 0.83 across all models.
Notably, this model showed smaller RMSE (0.92) and MAE (0.67) values when fitted with KNN. Simplified models facilitate the efficient creation and monitoring of KPI profiles. The findings can assist healthcare providers in collecting and monitoring influential KPIs, enabling the development of targeted strategies to potentially reduce, or reverse, the increasing incidence rate of childhood T1D in Saudi Arabia.
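As an illustration of the three evaluation metrics named in the abstract, the following minimal sketch computes MAE, RMSE, and the coefficient of determination for a set of monthly case counts. The observed and predicted values are made-up illustrative numbers, not the study's data, and the function names are hypothetical. The sketch assumes the conventional definition of the coefficient of determination (one minus the ratio of residual to total sum of squares); the study may have used a different variant.

```python
import math

def mae(observed, predicted):
    """Mean absolute error between observed and predicted counts."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted counts."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical monthly new-case counts and model predictions (illustrative only).
observed = [2, 4, 3, 5, 1, 6]
predicted = [2.5, 3.5, 3.0, 4.0, 1.5, 5.5]

print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

Lower MAE and RMSE and a higher coefficient of determination indicate a closer fit, which is how the candidate models are compared in the abstract.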