Evaluation of ChatGPT-generated medical responses: A systematic review and meta-analysis.
Journal:
Journal of Biomedical Informatics
Published Date:
Mar 8, 2024
Abstract
OBJECTIVE: Large language models (LLMs) such as ChatGPT are increasingly explored in medical domains. However, the absence of standard guidelines for performance evaluation has led to methodological inconsistencies. This study aims to summarize the available evidence on evaluating ChatGPT's performance in answering medical questions and provide direction for future research.