Accuracy of Artificial Intelligence in Making Diagnoses and Treatment Decisions in Pediatric Dentistry.
Journal:
Pediatric Dentistry
PMID:
40296263
Abstract
To assess the diagnostic and treatment decision-making accuracy of ChatGPT for various dental problems in pediatric patients compared to specialized pediatric dentists. This study included 12 cases, each with an average of three dental problems, for a total of 36 dental problems. Successive prompts were given to ChatGPT (GPT-4), beginning with a comprehensive case presentation, followed by clinical and radiographic descriptions alongside clinical and radiographic images. Questions regarding diagnosis and treatment were then put to the model. Accuracy was scored based on the degree of alignment between the ChatGPT outputs and the decisions of a pediatric dentistry committee, which served as the control group on the basis of its members' advanced training and clinical experience. ChatGPT's diagnostic accuracy was 72.2 percent, with a kappa statistic of 0.69 (95 percent confidence interval [95% CI] equals 0.6 to 0.8). In detecting dental caries, ChatGPT achieved a sensitivity of 92.3 percent and a specificity of 100 percent, with positive and negative predictive values of 100 percent and 83.3 percent, respectively. ChatGPT's treatment decision accuracy was 47.2 percent, with a kappa value of 0.43 (95% CI equals 0.4 to 0.6). The difference between ChatGPT's accuracy in diagnosis and in treatment decisions was statistically significant (P=0.01). ChatGPT achieved high diagnostic accuracy but showed limited capability in making treatment decisions for pediatric dental problems. ChatGPT may serve as a secondary aid in diagnosis; however, it cannot be regarded as a reliable tool for therapeutic decision-making.
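For readers who want to see how the reported summary statistics relate to one another, the sketch below computes sensitivity, specificity, predictive values, and Cohen's kappa from a 2x2 agreement table. The counts are hypothetical: the abstract does not report the underlying table, so these values were chosen only because they reproduce the stated caries-detection percentages. The kappa computation illustrates the formula only; the paper's kappa of 0.69 was calculated over all 36 diagnoses, not this caries-only table.

```python
# Hypothetical 2x2 table for caries detection (committee = reference standard).
# These counts are assumptions chosen to reproduce the abstract's percentages:
# sensitivity 92.3%, specificity 100%, PPV 100%, NPV 83.3%.
tp, fn = 12, 1   # caries present per the committee; ChatGPT detected / missed
fp, tn = 0, 5    # caries absent per the committee; ChatGPT flagged / agreed

sensitivity = tp / (tp + fn)   # 12/13 ~= 0.923
specificity = tn / (tn + fp)   # 5/5    = 1.000
ppv = tp / (tp + fp)           # 12/12  = 1.000
npv = tn / (tn + fn)           # 5/6   ~= 0.833

# Cohen's kappa: chance-corrected agreement between ChatGPT and the committee.
n = tp + fp + fn + tn
p_observed = (tp + tn) / n
p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
kappa = (p_observed - p_expected) / (1 - p_expected)

print(f"sensitivity={sensitivity:.3f}  specificity={specificity:.3f}")
print(f"ppv={ppv:.3f}  npv={npv:.3f}  kappa={kappa:.3f}")
```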