Evaluating accuracy and reproducibility of large language model performance on critical care assessments in pharmacy education.

Journal: Frontiers in Artificial Intelligence

Abstract

BACKGROUND: Large language models (LLMs) have demonstrated impressive performance on medical licensing and diagnosis-related exams. However, comparative evaluations aimed at optimizing LLM performance in the domain of comprehensive medication management (CMM) are lacking. The purpose of this evaluation was to test performance optimization strategies across various LLMs and to assess their performance on critical care pharmacotherapy questions used in the assessment of Doctor of Pharmacy students.

Authors

  • Huibo Yang
    Department of Computer Science, University of Virginia, Charlottesville, VA, United States.
  • Mengxuan Hu
    School of Data Science, University of Virginia, Charlottesville, VA, United States.
  • Amoreena Most
    University of Georgia College of Pharmacy, Augusta, GA, United States.
  • W Anthony Hawkins
    Department of Clinical and Administrative Pharmacy, University of Georgia College of Pharmacy, Albany, GA, United States.
  • Brian Murray
    University of Colorado Skaggs School of Pharmacy and Pharmaceutical Sciences, Aurora, CO, United States.
  • Susan E Smith
    Department of Clinical and Administrative Pharmacy, University of Georgia College of Pharmacy, Athens, GA, United States.
  • Sheng Li
    School of Data Science, University of Virginia, Charlottesville, VA, United States.
  • Andrea Sikora
    Department of Clinical and Administrative Pharmacy, University of Georgia College of Pharmacy, Augusta, GA, United States.
