The Best of All Worlds: A Hybrid Approach to Cohort Identification with Rules, Small and Large Language Models.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Balancing operational feasibility with the performance of natural language processing (NLP) systems is a significant challenge. This study presents a hybrid strategy that integrates manually curated rules, a small language model (SLM), and a large language model (LLM) for cohort identification tasks. The approach demonstrates superior performance in both computational efficiency and NLP validity, as shown in two separate tasks using a large number of clinical notes from the US Department of Veterans Affairs (VA) healthcare system.

Authors

  • Qiwei Gan
    VA Salt Lake City Health Care System, 500, Foothill Boulevard, Salt Lake City 84148, USA; Division of Epidemiology, University of Utah, 295 Chipeta Way, Salt Lake City 84132, USA.
  • Jianlin Shi
    University of Utah, Salt Lake City, UT, USA.
  • Annie Bowles
    VA Salt Lake City Health Care System, Salt Lake City, UT, USA.
  • Elizabeth Hanchrow
    VA Salt Lake City Health Care System, Salt Lake City, UT, USA.
  • John Stanley
    VA Salt Lake City Health Care System, Salt Lake City, UT, USA.
  • Mengke Hu
    Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah, United States.
  • Scott L DuVall
    VA Salt Lake City Health Care System.
  • Patrick R Alba
    VA Salt Lake City Health Care System.