DeepSeek-R1 and GPT-4 are comparable in a complex diagnostic challenge: a historical control study.
Journal:
International journal of surgery (London, England)
Published Date:
Apr 3, 2025
Abstract
BACKGROUND: Large language models (LLMs) have demonstrated potential in medical diagnostics, but their accuracy in complex cases remains under investigation. DeepSeek-R1, an open-source model with advanced reasoning capabilities, has gained global attention. This study compares the diagnostic performance of DeepSeek-R1 with that of GPT-4 in complex clinical cases.