Comparative performance of large language models on cardiovascular certification simulation exam.
Journal:
American Heart Journal
Published Date:
Jan 12, 2026
Abstract
Artificial intelligence (AI) is becoming increasingly prevalent in medical practice and has demonstrated sufficient clinical acumen to pass several licensing examinations. We tested the ability of three popular large language models, ChatGPT-4.0 (OpenAI), Gemini (Google), and Bing AI (Microsoft), to pass a cardiovascular medicine board-style exam. Of these AI platforms, only ChatGPT-4.0 achieved a score similar to that of human participants.