Comparison of the Performance of GPT-3.5 and GPT-4 With That of Medical Students on the Written German Medical Licensing Examination: Observational Study.

Journal: JMIR Medical Education

Published Date: Feb 8, 2024

Abstract

BACKGROUND: The potential of artificial intelligence (AI)-based large language models, such as ChatGPT, has gained significant attention in the medical field. This enthusiasm is driven not only by recent breakthroughs and improved accessibility, but also by the prospect of democratizing medical knowledge and promoting equitable health care. However, the performance of ChatGPT is substantially influenced by the input language, and given the growing public trust in this AI tool compared to that in traditional sources of information, investigating its medical accuracy across different languages is of particular importance.

Authors

Annika Meyer
Institute for Clinical Chemistry, University Hospital Cologne, Cologne, Germany.

Janik Riese
Department of General Surgery, Visceral, Thoracic and Vascular Surgery, University Hospital Greifswald, Greifswald, Germany.

Thomas Streichert
Institute for Clinical Chemistry, University Hospital Cologne, Cologne, Germany.

Keywords

Artificial Intelligence; Education, Medical; Educational Measurement; Humans; Language; Students, Medical

External Resources

PubMed ID: 38329802
