Using Large Language Models to Determine Reasons for Missed Colon Cancer Screening Follow-Up

Journal: medRxiv
Published Date:

Abstract

Identifying reasons for missed preventive care, such as follow-up colonoscopy after an abnormal stool-based colon cancer screening test, is critical for quality improvement initiatives. However, manual chart review to extract this information from unstructured clinical notes is time-consuming and costly. The objective of this study was to determine whether a large language model (LLM) can accurately extract reasons for a lack of follow-up colonoscopy after an abnormal outpatient fecal immunochemical test (FIT) or fecal occult blood test (FOBT). We conducted a cross-sectional study at the University of California, San Francisco (UCSF), of adult patients aged 45 years or older with an abnormal outpatient FIT/FOBT between 2012 and 2024 who did not undergo a colonoscopy within 90 days of the abnormal test. We investigated whether an LLM can determine if reasons for a lack of follow-up colonoscopy are documented in the clinical notes and whether it can accurately classify those reasons into clinically meaningful categories. LLM performance was evaluated by calculating accuracy against a 10% subsample manually classified by a physician reviewer. Of a total of 2164 patients with abnormal FIT/FOBTs performed at UCSF during the study period, 355 (16.4%) underwent a colonoscopy within 90 days of the abnormal test. Among those who did not receive a colonoscopy within 90 days, 846 patients were eligible for the main analysis. Based on LLM categorization of patient note content, 270 (31.9%) patients had no reference to colonoscopy or colorectal cancer screening in their notes, 379 (44.8%) patients had mentions of colonoscopy or colorectal cancer screening without an explicit reason for not having a colonoscopy, and 197 (23.3%) patients had notes detailing explicit reasons for not having a colonoscopy. Overall LLM classification accuracy was 89.3%. The most common reasons for not having a colonoscopy were Refused/not interested (n = 96; 35.2%), Comorbidities (n = 51; 18.7%), and Patient Unavailable (n = 46; 16.8%). This study suggests that an LLM can accurately identify and categorize reasons for the absence of follow-up colonoscopy after an abnormal FIT/FOBT. Our results suggest that LLMs have the potential to automate chart review for quality improvement initiatives.
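
The evaluation summarized above compares LLM-assigned categories with a physician-labeled subsample and reports each category as a share of the 846 eligible patients. The following is a minimal sketch of that arithmetic, not the authors' code. The function names (`accuracy`, `percent`), the category keys, and the four toy labels are hypothetical and only for illustration; the counts are the ones reported in the abstract.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of items whose predicted label equals the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    if not reference:
        raise ValueError("no labels to compare")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def percent(count: int, total: int) -> float:
    """Share of `total` represented by `count`, as a percentage to one decimal."""
    return round(100 * count / total, 1)


# Counts reported in the abstract (846 patients eligible for the main analysis).
ELIGIBLE = 846
category_counts = {
    "no_mention": 270,              # no reference to colonoscopy/CRC screening
    "mention_without_reason": 379,  # mentioned, but no explicit reason given
    "explicit_reason": 197,         # explicit reason documented
}
category_percent = {k: percent(v, ELIGIBLE) for k, v in category_counts.items()}

# Hypothetical toy labels (NOT study data) to show the accuracy calculation.
llm_labels = ["explicit_reason", "no_mention", "no_mention", "explicit_reason"]
physician_labels = ["explicit_reason", "no_mention", "mention_without_reason", "explicit_reason"]
toy_accuracy = accuracy(llm_labels, physician_labels)

if __name__ == "__main__":
    print(category_percent)
    print(f"toy accuracy: {toy_accuracy:.2f}")
```

The three category counts sum to 846, and the computed shares match the reported 31.9%, 44.8%, and 23.3%. The percentages for individual reasons are not recomputed here because the abstract does not state their denominator.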

Authors

  • Christopher Y.K. Williams; Urmimala Sarkar; Julia Adler-Milstein; Lisa Rotenstein