Assessment of various artificial intelligence applications' performance in responding to multiple-choice endodontics questions.

Journal: BMC oral health
Published Date:

Abstract

BACKGROUND: Artificial intelligence (AI) has emerged as a transformative technology in healthcare, including endodontics. This study aims to evaluate and compare the performance of six AI chatbots (ScholarGPT, Scholar AI, ChatGPT-4o, Gemini 2.0, DeepSeek-R1, and ChatGPT-5) in answering multiple-choice questions related to endodontics.

METHODS: The accuracy of the six AI chatbots was evaluated on 122 multiple-choice endodontics questions taken from sittings of the Turkish Dentistry Specialization Exam (DUS) held between 2012 and 2021. The questions were divided into two categories, 'knowledge-based' and 'case-based', and each response was scored as 'correct' or 'incorrect'. The chatbots' accuracy on knowledge-based and case-based questions was recorded and compared. The relationship between the categorical variables was evaluated using descriptive statistics, Cochran's Q test, and exploratory pairwise comparisons with McNemar's test. The significance level was set at p < 0.05, with an adjusted significance level of p = 0.00416 for multiple comparisons.

RESULTS: There were no statistically significant differences in the overall accuracy of the six AI chatbots in answering DUS endodontic questions (p > 0.05). Likewise, no significant differences were found between the models for either case-based or knowledge-based questions (p > 0.05).

CONCLUSION: The evaluated AI chatbots demonstrated comparable accuracy in answering multiple-choice endodontic questions. While these findings suggest potential utility in dental education, the use of AI in clinical practice should be considered supportive rather than definitive. Ongoing evaluation and improvement of chatbot accuracy and reliability therefore remain important.
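
The statistical approach named in the methods (Cochran's Q test across all chatbots, followed by pairwise McNemar tests judged against an adjusted significance level) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the toy 0/1 matrix, and the choice of the exact (binomial) form of McNemar's test are assumptions made for illustration, and the abstract does not say which software or McNemar variant was used. Only the adjusted threshold of 0.00416 comes from the abstract.

```python
import math
from itertools import combinations

ADJUSTED_ALPHA = 0.00416  # adjusted significance level reported in the abstract


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        return math.exp(-y) * sum(y ** j / math.gamma(j + 1) for j in range(df // 2))
    # Odd df: erfc term plus a finite sum of half-integer powers.
    extra = sum(y ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * extra


def cochran_q(table):
    """Cochran's Q for a table with one row per question and one 0/1 entry per chatbot."""
    k = len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    denom = k * n - sum(r * r for r in row_totals)
    if denom == 0:
        # Every question was answered identically by all chatbots.
        return 0.0, 1.0
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, chi2_sf(q, k - 1)


def mcnemar_exact(a, b):
    """Two-sided exact McNemar p-value for two paired 0/1 response lists."""
    only_a = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    only_b = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n = only_a + only_b
    if n == 0:
        return 1.0
    tail = sum(math.comb(n, i) for i in range(min(only_a, only_b) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data: 4 questions (rows) x 3 chatbots (columns), 1 = correct.
toy = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 0, 0]]

q_stat, q_p = cochran_q(toy)
print(f"Cochran's Q = {q_stat:.3f}, p = {q_p:.4f}")

# Exploratory pairwise comparisons against the adjusted threshold.
for i, j in combinations(range(len(toy[0])), 2):
    p_ij = mcnemar_exact([row[i] for row in toy], [row[j] for row in toy])
    print(f"chatbot {i} vs {j}: p = {p_ij:.4f}, significant = {p_ij < ADJUSTED_ALPHA}")
```

With real data, each row would hold one exam question and each column one chatbot's correct/incorrect score, so the study's design would correspond to a 122 x 6 matrix. The same functions could be applied separately to the knowledge-based and case-based subsets.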

Authors
