From open-ended to multiple-choice: evaluating diagnostic performance and consistency of ChatGPT, Google Gemini and Claude AI.
Journal:
Wiadomosci lekarskie (Warsaw, Poland : 1960)
PMID:
39661873
Abstract
OBJECTIVE: To determine the diagnostic performance and response repeatability of freely available large language models (LLMs) in diagnosing diseases from clinical case descriptions.