DeepSeek Versus GPT: Evaluation of Large Language Model Chatbots' Responses on Orofacial Clefts.

Journal: The Journal of craniofacial surgery

Published Date: Apr 17, 2025

Abstract

Advancements in natural language processing (NLP) have led to the emergence of large language models (LLMs) as potential tools for patient consultations. This study investigates the ability of two reasoning-capable models, DeepSeek-R1 and GPT o1-preview, to provide diagnostic and treatment recommendations for orofacial clefts. A cross-sectional comparative study was conducted using 20 questions based on Google Trends and expert experience, with both models providing responses to these queries. Readability was assessed using the Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), sentence count, and percentage of complex words. No statistically significant differences were found between the 2 models' responses in FKGL (P = 0.064) or FRES (P = 0.56). Physicians rated accuracy, clarity, relevance, and trustworthiness on a 4-point Likert scale, with DeepSeek-R1 achieving significantly higher ratings overall (P = 0.041). However, GPT o1-preview exhibited notable empathy in certain clinical scenarios. Both models displayed complementary strengths, indicating potential for clinical consultation applications. Future research should focus on integrating these strengths within medical-specific LLMs to generate more reliable, empathetic, and personalized treatment recommendations.
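For readers unfamiliar with the readability metrics named in the abstract, the following is a minimal Python sketch of how FRES and FKGL are computed from word, sentence, and syllable counts using the standard published formulas. It is not the authors' tooling, which the abstract does not describe. The function names, the regex-based sentence and word splitting, the vowel-group syllable heuristic, and the three-syllable threshold for "complex" words are all assumptions made for illustration.

```python
import re


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop one for a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Compute the metrics listed in the abstract for one chatbot response."""
    # Naive splitting: sentences end at ., !, or ?; words are runs of letters/apostrophes.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = [count_syllables(w) for w in words]
    n_words, n_sent, n_syll = len(words), len(sentences), sum(syllables)
    return {
        "sentences": n_sent,
        "words": n_words,
        "fres": flesch_reading_ease(n_words, n_sent, n_syll),
        "fkgl": flesch_kincaid_grade(n_words, n_sent, n_syll),
        # Assumed definition: a "complex" word has three or more syllables.
        "complex_word_pct": 100 * sum(1 for s in syllables if s >= 3) / n_words,
    }


if __name__ == "__main__":
    sample = "A cleft lip is usually repaired in the first year of life. Surgery is planned by a specialist team."
    print(readability(sample))
```

Because the syllable counter is only a heuristic, scores from this sketch will differ somewhat from those of dedicated readability tools; the two formula functions are exact given the counts passed to them.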

Authors

Hongru Zhou

Department of Cleft Lip and Palate, Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing.
Zhiyan Wang

Department of Cleft Lip and Palate, Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing.
Rongsheng Wang

Macao Polytechnic University, Faculty of Applied Sciences, Macau, China.
Leheng Jiang

Department of Cleft Lip and Palate, Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing.
Congxiao Zhu

Department of Cleft Lip and Palate, Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing.
Haoyue Guo

Department of Cleft Lip and Palate, Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing.
Tao Song

Department of Cleft Lip and Palate, Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing.
Ningbei Yin

Department of Cleft Lip and Palate, Plastic Surgery Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing.

External Resources

PubMed ID: 40245329
