Comparative analysis of generative pre-trained transformers for text- and image-based cephalometric prompts using a novel Artificial Intelligence Based Diagnosis and Treatment Planning Index (AIDTI).
Journal:
BMC Medical Informatics and Decision Making
Published Date:
Jun 1, 2026
Abstract
INTRODUCTION: This study compared the responses of generative pre-trained transformers (GPTs) to identical lateral cephalograms (LCs) presented as text-based and image-based prompts, using the newly developed Artificial Intelligence Based Diagnosis and Treatment Planning Index (AIDTI).
METHODS: A total of 90 LCs were included, 30 each from cases with skeletal Class I, II, and III malocclusions. Each LC was presented to GPT-4o, GPT-o3 pro, GPT-5, and GPT-5 pro in two formats: a text-based prompt (numerical data from cephalometric analysis measurements) and an image-based prompt (direct image upload). The responses were evaluated with the AIDTI, which is scored on a 0-10 scale and comprises five criteria: diagnostic accuracy, capability for differential diagnosis, clinical appropriateness of the proposed treatment plan, disclosure of risks and complications associated with the treatment plan, and ability to offer alternative treatment options.
RESULTS: Text-based prompts yielded higher performance, with the highest score achieved by GPT-5 pro (9.62 ± 1.13). By contrast, all GPTs performed notably worse on image-based prompts, where the highest score was 4.16 ± 4.12 for GPT-o3 pro. Additionally, all models tended to randomly categorize malocclusions as Class II, suggesting a systematic bias in their predictions.
CONCLUSION: The AIDTI provides a structured, multidimensional framework for the concise, balanced, and clinically meaningful interpretation of GPT performance. Because GPTs are unable to analyze LCs directly, their use as independent or reliable supportive tools in orthodontics remains limited at this stage. GPTs may contribute more reliably to orthodontic practice only when orthodontists prompt the model with text-based cephalometric measurement data.
CLINICAL TRIAL NUMBER: Not applicable.
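The abstract describes the AIDTI as a 0-10 score built from five criteria and reports results as mean ± standard deviation. The sketch below illustrates one way such a composite could be computed. It is not the authors' scoring procedure: the abstract does not state how points are distributed, so the equal weighting of 0-2 points per criterion, the identifier names, and the helper functions `aidti_total` and `summarize` are all assumptions made for illustration.

```python
from statistics import mean, stdev

# Criterion names paraphrase the five criteria listed in the abstract.
CRITERIA = (
    "diagnostic_accuracy",
    "differential_diagnosis",
    "treatment_plan_appropriateness",
    "risk_disclosure",
    "alternative_options",
)

# ASSUMPTION: equal weighting of 0-2 points per criterion, chosen only so
# that five criteria sum to the stated 0-10 range. Not from the source.
MAX_PER_CRITERION = 2


def aidti_total(scores):
    """Sum the five criterion scores into a single 0-10 total."""
    missing = set(CRITERIA) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    for name in CRITERIA:
        if not 0 <= scores[name] <= MAX_PER_CRITERION:
            raise ValueError(f"{name} must be between 0 and {MAX_PER_CRITERION}")
    return sum(scores[name] for name in CRITERIA)


def summarize(totals):
    """Return (mean, sample standard deviation) for a list of totals."""
    return mean(totals), stdev(totals)
```

Under these assumptions, each model response would be rated once per criterion, totalled with `aidti_total`, and the totals for one model and prompt format passed to `summarize` to obtain a mean ± SD figure of the kind reported in the results.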