Comparative analysis of ChatGPT 3.5 and ChatGPT 4 obstetric and gynecological knowledge.

Journal: Scientific Reports
Published Date:

Abstract

Generative Pre-trained Transformer (GPT) is one of the most ubiquitous large language models (LLMs), employing artificial intelligence (AI) to generate human-like language. Although the use of ChatGPT has been evaluated in various medical specialties, sufficient evidence in the field of obstetrics and gynecology is still lacking. The aim of our study was to analyze the knowledge of the two latest generations of ChatGPT (ChatGPT-3.5 and ChatGPT-4) in the area of obstetrics and gynecology, and thereby to assess their potential applicability in clinical practice. We submitted 352 single-best-answer questions from the Polish Specialty Certificate Examinations in Obstetrics and Gynecology to ChatGPT-3.5 and ChatGPT-4, in both Polish and English. The models' accuracy was evaluated, and performance was analyzed with respect to question difficulty and language. Statistical analyses were conducted using the Mann-Whitney U test and the chi-square test. The results indicate that both LLMs demonstrate satisfactory knowledge in the analyzed specialty. Nonetheless, ChatGPT-4 significantly outperformed its predecessor in answer accuracy. For both models, answer correctness was associated with the question difficulty index. In addition, based on our analysis, ChatGPT should be queried in English for optimal performance.
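The abstract's methods can be illustrated with a minimal sketch of the chi-square comparison it describes. This is not the authors' code, and the correct-answer counts below are assumed placeholders, not the study's actual results; only the 352-question total comes from the abstract.

```python
# Illustrative sketch: chi-square test of independence comparing the answer
# accuracy of two models on the same question set. Counts are hypothetical.

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table [[a, b], [c, d]]
    (closed form, no Yates continuity correction)."""
    n = a + b + c + d
    num = n * (a * d - b * c) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den

TOTAL = 352                       # questions per model (from the abstract)
correct_35, correct_4 = 230, 290  # hypothetical correct-answer counts

stat = chi_square_2x2(correct_35, TOTAL - correct_35,
                      correct_4, TOTAL - correct_4)

# Critical value for df = 1 at alpha = 0.05 is 3.841.
print(f"chi2 = {stat:.2f}, significant = {stat > 3.841}")
```

In practice such a comparison would be run with a statistics package (e.g. `scipy.stats.chi2_contingency`), which also returns a p-value and supports continuity correction for 2x2 tables.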

Authors

  • Franciszek Ługowski
    1st Department of Obstetrics and Gynecology, Medical University of Warsaw, Warsaw, Poland. franciszeklugowski@gmail.com.
  • Julia Babińska
    1st Department of Obstetrics and Gynecology, Medical University of Warsaw, Warsaw, Poland.
  • Artur Ludwin
    1st Department of Obstetrics and Gynecology, Medical University of Warsaw, Warsaw, Poland.
  • Paweł Jan Stanirowski
    1st Department of Obstetrics and Gynecology, Medical University of Warsaw, Warsaw, Poland.