Performance of Large Language Models in Supporting Medical Diagnosis and Treatment

Journal: arXiv

Published Date: Apr 14, 2025

Abstract

The integration of Large Language Models (LLMs) into healthcare holds significant potential to enhance diagnostic accuracy and support medical treatment planning. These AI-driven systems can analyze vast datasets, assisting clinicians in identifying diseases, recommending treatments, and predicting patient outcomes. This study evaluates the performance of a range of contemporary LLMs, including both open-source and closed-source models, on the 2024 Portuguese National Exam for medical specialty access (PNA), a standardized medical knowledge assessment. Our results highlight considerable variation in accuracy and cost-effectiveness, with several models demonstrating performance exceeding human benchmarks for medical students on this specific task. We identify leading models based on a combined score of accuracy and cost, discuss the implications of reasoning methodologies like Chain-of-Thought, and underscore the potential for LLMs to function as valuable complementary tools aiding medical professionals in complex clinical decision-making.

Authors

Diogo Sousa
Guilherme Barbosa
Catarina Rocha
Dulce Oliveira

External Resources

arXiv: http://arxiv.org/abs/2504.10405v1
