The Best of All Worlds: A Hybrid Approach to Cohort Identification with Rules, Small and Large Language Models.
Journal:
Studies in health technology and informatics
Published Date:
Aug 7, 2025
Abstract
Balancing operational feasibility with the performance of natural language processing (NLP) systems is a significant challenge. This study presents a hybrid strategy that integrates manually curated rules, a small language model (SLM), and a large language model (LLM) for cohort identification tasks. The approach demonstrates superior performance in terms of both computational efficiency and NLP validity, as shown here in two separate tasks using a large number of clinical notes from the US Department of Veterans Affairs (VA) healthcare system.