Evaluating Artificial Intelligence-Driven Responses to Acute Liver Failure Queries: A Comparative Analysis Across Accuracy, Clarity, and Relevance.

Journal: The American Journal of Gastroenterology
Published Date:

Abstract

INTRODUCTION: Recent advancements in artificial intelligence (AI), particularly through the deployment of large language models (LLMs), have profoundly impacted healthcare. This study assesses 5 LLMs (ChatGPT 3.5, ChatGPT 4, BARD, CLAUDE, and COPILOT) on the accuracy, clarity, and relevance of their responses to queries concerning acute liver failure (ALF). We subsequently compare these results with those of ChatGPT 4 enhanced with retrieval-augmented generation (RAG) technology.

Authors

  • Sheza Malik
    Internal Medicine, Rochester General Hospital, Rochester, New York, USA.
  • Lewis J Frey
    Ralph H. Johnson Veterans Affairs Medical Center, Charleston, South Carolina, USA.
  • Jason Gutman
    Gastroenterology & Hepatology, Rochester General Hospital, Rochester, New York, USA.
  • Asim Mushtaq
    Gastroenterology & Hepatology, Rochester General Hospital, Rochester, New York, USA.
  • Fatima Warraich
    Gastroenterology & Hepatology, Rochester General Hospital, Rochester, New York, USA.
  • Kamran Qureshi
    Gastroenterology & Hepatology, Saint Louis University, St. Louis, Missouri, USA.
