Agreement between three state-of-the-art deep learning bone age estimation models and chronological age in a large contemporary pediatric cohort.
Journal:
Pediatric radiology
Published Date:
Jun 3, 2026
Abstract
BACKGROUND: Radiographic bone age estimation is routinely performed in children to evaluate short stature, early or late puberty, and endocrine disorders, as well as for surgical planning.
OBJECTIVE: To evaluate agreement between three state-of-the-art deep learning bone age estimation models and chronological age in a large contemporary pediatric cohort.
MATERIALS AND METHODS: We retrospectively identified a large contemporary cohort of children (n=7,189; 3,669 females and 3,520 males) aged 24-216 months (mean 143.6±46.2 months) with consecutive, radiologically normal hand radiographs (including the wrist) obtained for evaluation of trauma between November 1, 2010, and October 31, 2020. Bone age was estimated according to Greulich and Pyle (GP) Atlas standards using three state-of-the-art deep learning models (the Stanford University, Cincinnati Children's Hospital Medical Center (CCHMC), and MedImageInsight models), each producing continuous bone age estimates in months. Mean and proportional bias relative to chronological age were assessed.
RESULTS: All three models systematically overestimated chronological age, although the magnitude of bias varied by model and sex. The CCHMC model demonstrated the smallest overall mean bias (+3.20 months), followed by the MedImageInsight (+4.60 months) and Stanford (+7.03 months) models. Mean differences compared with chronological age were statistically significant for all models (P<0.0001). Evidence of proportional bias was observed in most models and subgroups. For all three models, the difference between predicted bone age and chronological age was greater for Black than for White children and for Hispanic than for non-Hispanic children.
CONCLUSION: GP Atlas-based bone age models systematically overestimate chronological age in a large contemporary cohort of children undergoing hand radiography for trauma, with biases related to age, sex, race, and ethnicity.
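As an illustration of the two bias measures named in the methods, the following is a minimal sketch rather than the authors' analysis. It assumes that mean bias is the average of (predicted bone age minus chronological age) and that proportional bias is the slope of an ordinary least-squares regression of that difference on chronological age; the abstract does not specify the exact procedure. The function names and the sample values are hypothetical and are not study data.

```python
from statistics import mean


def mean_bias(predicted, chronological):
    """Average of (predicted - chronological), in months.

    A positive value means the model overestimates chronological age.
    """
    diffs = [p - c for p, c in zip(predicted, chronological)]
    return mean(diffs)


def proportional_bias(predicted, chronological):
    """OLS fit of the difference on chronological age.

    Returns (slope, intercept). A non-zero slope suggests that the
    size of the bias changes with age (assumed definition, see lead-in).
    """
    diffs = [p - c for p, c in zip(predicted, chronological)]
    x_bar = mean(chronological)
    d_bar = mean(diffs)
    sxx = sum((x - x_bar) ** 2 for x in chronological)
    sxy = sum((x - x_bar) * (d - d_bar) for x, d in zip(chronological, diffs))
    slope = sxy / sxx
    intercept = d_bar - slope * x_bar
    return slope, intercept


# Made-up illustrative ages in months (NOT data from the study).
chronological = [24.0, 72.0, 120.0, 168.0, 216.0]
predicted = [30.0, 77.0, 124.0, 171.0, 218.0]

bias = mean_bias(predicted, chronological)
slope, intercept = proportional_bias(predicted, chronological)
print(f"mean bias: {bias:+.2f} months")
print(f"proportional bias slope: {slope:+.4f} months per month of age")
```

In this toy input the overestimate shrinks from 6 months at age 24 months to 2 months at age 216 months, so the mean bias is positive while the slope is negative. A full analysis would also test whether the mean difference and the slope differ significantly from zero, which this sketch omits.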