Artificial intelligence in pediatric ophthalmology: a comparative study of ChatGPT-4.0 and DeepSeek-R1 performance.
Journal:
Strabismus
Published Date:
Jul 29, 2025
Abstract
Purpose: This study aims to evaluate and compare the accuracy and performance of two large language models (LLMs), ChatGPT-4.0 and DeepSeek-R1, in answering pediatric ophthalmology-related questions.
Methods: A total of 44 multiple-choice questions covering various subspecialties of pediatric ophthalmology were selected. Both LLMs were tasked with answering these questions, and their responses were compared in terms of accuracy.
Results: ChatGPT-4.0 answered 82% of the questions correctly, while DeepSeek-R1 achieved a higher accuracy of 93% (p = 0.06). For strabismus questions, ChatGPT-4.0 answered 70% correctly, while DeepSeek-R1 answered 82% correctly (p = 0.50). In the other subspecialties, ChatGPT-4.0 answered 89% correctly and DeepSeek-R1 achieved 100% accuracy (p = 0.25).
Conclusions: DeepSeek-R1 outperformed ChatGPT-4.0 in overall accuracy on pediatric ophthalmology questions, although none of the reported differences reached statistical significance. These findings suggest the need for further optimization of LLMs to enhance their performance and reliability in clinical settings, especially in pediatric ophthalmology.
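The abstract reports p-values for paired comparisons of the two models on the same 44-question set but does not state which statistical test produced them. For this paired design, an exact McNemar test (a binomial test on the discordant questions, i.e. those answered correctly by only one model) is one common choice. The Python sketch below uses hypothetical discordant counts, chosen purely to illustrate the calculation and not taken from the study, to show how such a p-value can be computed.

```python
# Hypothetical illustration of a paired accuracy comparison between two models
# answering the same set of multiple-choice questions. The counts below are NOT
# the study's data; they are assumptions made only to demonstrate the mechanics
# of an exact McNemar-style test (a binomial test on the discordant pairs).
from scipy.stats import binomtest

n_questions = 44              # both models answered the same 44 questions (from the abstract)
only_deepseek_correct = 5     # hypothetical: questions only DeepSeek-R1 answered correctly
only_chatgpt_correct = 0      # hypothetical: questions only ChatGPT-4.0 answered correctly

discordant = only_deepseek_correct + only_chatgpt_correct

# Under the null hypothesis of equal accuracy, each discordant question is
# equally likely to favor either model, so the smaller discordant count
# follows Binomial(discordant, 0.5).
result = binomtest(min(only_deepseek_correct, only_chatgpt_correct),
                   discordant, p=0.5, alternative="two-sided")
print(f"Discordant questions: {discordant}, exact two-sided p = {result.pvalue:.3f}")
```

With these made-up counts the exact two-sided p-value comes out around 0.06, in the same range as the abstract's overall comparison, but this shows only how such a test works in principle, not the study's actual analysis.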