AI at the Helm: Evaluating Claude 3.5 Sonnet and ChatGPT-4.0 in Tympanoplasty Management.
Journal:
Otology & neurotology : official publication of the American Otological Society, American Neurotology Society [and] European Academy of Otology and Neurotology
Published Date:
Jan 5, 2026
Abstract
BACKGROUND: Artificial intelligence (AI) is increasingly being integrated into health care, offering new possibilities for postoperative management. Large language models (LLMs) such as ChatGPT-4 and Claude 3.5 Sonnet have demonstrated potential in patient education and clinical support. This study evaluates their effectiveness in providing postoperative guidance following tympanoplasty, focusing on accuracy, clarity, and relevance.
METHODS: Fifteen frequently asked postoperative tympanoplasty questions were compiled from 50 patients and validated by 15 otolaryngologists. ChatGPT-4 and Claude 3.5 Sonnet generated responses to these questions under standardized conditions. The expert panel assessed the AI-generated responses using a 5-point Likert scale for accuracy, response time, clarity, and relevance. Statistical analysis compared the models' performance, including Cohen kappa for inter-rater reliability, effect size calculations, and P-value analysis.
RESULTS: Claude 3.5 Sonnet consistently outperformed ChatGPT-4 across all evaluated parameters. It demonstrated superior accuracy, faster response times, improved clarity, and higher relevance in patient education (P < 0.001). Statistical analysis confirmed significant differences, with Claude achieving stronger inter-rater reliability and response consistency.
CONCLUSION: Claude 3.5 Sonnet demonstrated a notable advantage over ChatGPT-4 in providing structured and clinically accurate postoperative tympanoplasty guidance. These findings suggest that AI-driven conversational agents can enhance patient education and support postoperative care. However, further research is necessary to refine AI-based tools and evaluate their broader applicability in clinical practice.
LEVEL OF EVIDENCE: Level III.