Large Language Models' Ability to Assess Main Concepts in Story Retelling: A Proof-of-Concept Comparison of Human Versus Machine Ratings.

Journal: American Journal of Speech-Language Pathology

Abstract

PURPOSE: Despite an abundance of manual, labor-intensive discourse analysis methods, there remains a dearth of clinically convenient, psychometrically robust instruments for measuring change in real-world communication in aphasia. The Brief Assessment of Transactional Success (BATS) aims to address this gap by developing automated methods for analyzing story retelling discourse. This study investigated the automation of main concept (MC) analysis of stories by comparing scores from three large language models (LLMs) to those of human raters.
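
To make the human-versus-machine comparison concrete, below is a minimal Python sketch (not the authors' pipeline) of how LLM-generated MC codes might be checked against human codes using Cohen's kappa, a standard chance-corrected agreement statistic for nominal ratings. The four-level MC coding scheme shown (accurate/complete, accurate/incomplete, inaccurate, absent) and all data values are illustrative assumptions, not results from the study.

    # Minimal sketch: agreement between human and LLM main concept (MC) codes.
    # Codes are assumed: AC = accurate/complete, AI = accurate/incomplete,
    # IN = inaccurate, AB = absent. Data below are hypothetical.
    from sklearn.metrics import cohen_kappa_score

    # Hypothetical codes for ten main concepts from one story retelling.
    human_codes = ["AC", "AC", "AI", "AB", "AC", "IN", "AC", "AI", "AB", "AC"]
    llm_codes   = ["AC", "AI", "AI", "AB", "AC", "IN", "AC", "AC", "AB", "AC"]

    # Unweighted kappa treats the four codes as nominal categories.
    kappa = cohen_kappa_score(human_codes, llm_codes)
    print(f"Human-LLM agreement (Cohen's kappa): {kappa:.2f}")

The study itself may report different or additional reliability metrics (e.g., intraclass correlations on summed scores); the sketch only illustrates the general form of a rater-agreement check.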

Authors

  • Jacquie Kurland
    Department of Speech, Language, and Hearing Sciences, University of Massachusetts Amherst.
  • Vishnupriya Varadharaju
    Manning College of Information & Computer Sciences, University of Massachusetts Amherst.
  • Anna Liu
    Department of Mathematics and Statistics, University of Massachusetts Amherst.
  • Polly Stokes
    Department of Speech, Language, and Hearing Sciences, University of Massachusetts Amherst.
  • Ankita Gupta
    Manning College of Information & Computer Sciences, University of Massachusetts Amherst.
  • Marisa Hudspeth
    Manning College of Information & Computer Sciences, University of Massachusetts Amherst.
  • Brendan O'Connor
    Manning College of Information & Computer Sciences, University of Massachusetts Amherst.
