Comparative study of the performance of two versions of the same AI tool for pediatric skeletal maturation assessment.

Journal: European Radiology
Published Date:

Abstract

OBJECTIVES: To compare BoneXpert version 3 with version 2 in estimating bone age and bone health index (BHI), focusing on mean absolute error (MAE) and root mean square error (RMSE) in healthy children.

MATERIALS AND METHODS: This retrospective study included 449 healthy children: 231 females aged 2.11-15.88 years (mean 8.84) and 218 males aged 3.09-15.94 years (mean 9.55). Bone age was assessed using both versions. Chronological age was recorded, and correlations between estimated and chronological ages were calculated (R²). Accuracy was evaluated using MAE and RMSE, with analyses stratified by sex and age group. Bone age standard deviation scores (SDS), the BHI, and their variability were compared between versions.

RESULTS: Both versions showed strong correlations with chronological age (R² = 0.93 for females, 0.91 for males in both). MAE was 0.89 years (95% CI: 0.07) for version 2 and 0.88 years (95% CI: 0.07) for version 3 (p > 0.05). RMSE increased with age and was higher in males. Overall RMSE was 1.15 (95% CI: 0.04) in version 2 and 1.12 (95% CI: 0.04) in version 3. Bone age SDS was higher with version 3 (mean 0.56) than with version 2 (mean 0.19) and more variable (SDS 1.51 vs. 1.29). Version 3 also provided SDS in the youngest age group. No significant differences were observed in the BHI or its SDS.

CONCLUSION: Both BoneXpert versions are effective for bone age assessment in healthy children, with similar accuracy. Version 3 produces higher and more variable bone age SDS values and extends SDS reporting to younger ages.

KEY POINTS:

Question: The reliability of the new version of the AI-supported program for bone age estimation in children needs to be evaluated.

Findings: In a sample of 449 healthy children, version 3 yields higher and more variable bone age SDS values and expands SDS availability to younger ages.

Clinical relevance: This comparative study draws the reader's attention to the fact that the modifications applied to the AI-supported program for bone age estimation are not always available to the users, and that its reliability and accuracy need to be validated.
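The accuracy metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names and the four sample ages are hypothetical and chosen only to make the arithmetic easy to follow. The abstract does not state how R² was computed, so the sketch assumes the squared Pearson correlation between estimated and chronological age.

```python
import math


def mae(estimated, chronological):
    """Mean absolute error: average size of the error, in years."""
    errors = [abs(e - c) for e, c in zip(estimated, chronological)]
    return sum(errors) / len(errors)


def rmse(estimated, chronological):
    """Root mean square error: penalises large errors more than MAE does."""
    squared = [(e - c) ** 2 for e, c in zip(estimated, chronological)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(estimated, chronological):
    """Squared Pearson correlation (an assumption; see the lead-in)."""
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_c = sum(chronological) / n
    cov = sum((e - mean_e) * (c - mean_c)
              for e, c in zip(estimated, chronological))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_c = sum((c - mean_c) ** 2 for c in chronological)
    return cov ** 2 / (var_e * var_c)


# Hypothetical values in years, NOT data from the study.
chronological = [4.0, 8.0, 12.0, 15.0]
bone_age = [5.0, 7.0, 12.0, 17.0]

print(f"MAE:  {mae(bone_age, chronological):.2f} years")   # MAE:  1.00 years
print(f"RMSE: {rmse(bone_age, chronological):.2f} years")  # RMSE: 1.22 years
print(f"R^2:  {r_squared(bone_age, chronological):.2f}")
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few children have large errors. This matches the reported pattern of an overall RMSE (1.15 and 1.12) above the MAE (0.89 and 0.88).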

Authors

  • Grammatina Boitsios
    Department of Radiology, Hôpital Delta, Centre Hospitalier Interrégional Edith Cavell (CHIREC), Brussels, Belgium.
  • Thomas Saliba
    Department of Diagnostic and Interventional Radiology, Lausanne University Hospital, University of Lausanne, Rue du Bugnon 46, 1011 Lausanne, Switzerland.
  • Paolo Simoni
    Department of Radiology, Centre Hospitalier du Luxembourg (CHL), Luxembourg, Luxembourg.
