Evaluating the performance of artificial intelligence in summarizing pre-coded text to support evidence synthesis: a comparison between chatbots and humans.

Journal: BMC Medical Research Methodology

Abstract

BACKGROUND: With the rise of large language models, the application of artificial intelligence in research is expanding and may accelerate specific stages of the research process. This study aims to compare the accuracy, completeness, and relevance of chatbot-generated responses with those of human responses in evidence synthesis as part of a scoping review.

Authors

  • Kim Nordmann
    Kempten University of Applied Sciences, Bavarian Research Center for Digital Health and Social Care, Kempten, Germany.
  • Stefanie Sauter
    Kempten University of Applied Sciences, Bavarian Research Center for Digital Health and Social Care, Kempten, Germany.
  • Mirjam Stein
    Kempten University of Applied Sciences, Bavarian Research Center for Digital Health and Social Care, Kempten, Germany.
  • Johanna Aigner
    Kempten University of Applied Sciences, Bavarian Research Center for Digital Health and Social Care, Kempten, Germany.
  • Marie-Christin Redlich
    Kempten University of Applied Sciences, Bavarian Research Center for Digital Health and Social Care, Kempten, Germany.
  • Michael Schaller
    Institute of Medical Informatics, Private University for Health Sciences, Medical Informatics and Technology, Hall in Tirol, Austria.
  • Florian Fischer
    Institute of Forensic Medicine, LMU Munich, Nußbaumstraße 26, 80336, Munich, Germany.