Comparative analysis of natural language processing methodologies for classifying computed tomography enterography reports in Crohn's disease patients.

Journal: NPJ digital medicine
Published Date:

Abstract

Imaging is crucial for assessing disease extent, activity, and outcomes in inflammatory bowel disease (IBD). Artificial intelligence (AI) image interpretation requires automated exploitation of studies at scale as an initial step. Here we evaluate natural language processing for classifying Crohn's disease (CD) in computed tomography enterography (CTE) reports. From our population-representative IBD registry, CTE reports from a sample of CD patients (male: 44.6%; median age: 50 years, IQR 37-60) and controls (n = 981 each) were extracted and split into training (n = 1568), development (n = 196), and testing (n = 198) datasets, with reports of around 200 words each and balanced numbers of labels in every dataset. Predictive classification was evaluated with CNN, Bi-LSTM, BERT-110M, LLaMA-3.3-70B-Instruct, and DeepSeek-R1-Distill-LLaMA-70B. Our custom IBDBERT, fine-tuned on expert IBD knowledge (i.e., ACG, AGA, and ECCO guidelines), outperformed rule- and rationale extraction-based classifiers in predictive performance (accuracy 88.6% with pre-tuning learning rate 0.00001, AUC 0.945). However, LLaMA, but not DeepSeek, achieved overall superior results (accuracy 91.2% vs. 88.9%, F1 0.907 vs. 0.874).
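
The abstract reports the dataset sizes and that labels were balanced, but not how the split was produced. As an illustration only, a label-balanced split of 981 CD and 981 control reports into subsets of 1568, 196, and 198 could be sketched as follows. The function name `balanced_split`, the random seed, and the shuffling procedure are assumptions made for this sketch, not the authors' method.

```python
import random


def balanced_split(labels, sizes, seed=0):
    """Split report indices into disjoint subsets with equal counts per label.

    labels: one class label per report (hypothetical input format).
    sizes:  total size of each subset; each must be divisible by the
            number of classes so every subset can be balanced.
    """
    rng = random.Random(seed)

    # Group report indices by class, then shuffle within each class.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)

    n_classes = len(by_class)
    splits, offset = [], 0
    for size in sizes:
        if size % n_classes:
            raise ValueError("subset size must be divisible by class count")
        per_class = size // n_classes
        subset = []
        for indices in by_class.values():
            chunk = indices[offset:offset + per_class]
            if len(chunk) < per_class:
                raise ValueError("not enough reports in one class")
            subset.extend(chunk)
        rng.shuffle(subset)  # mix the classes within the subset
        splits.append(subset)
        offset += per_class
    return splits


# 981 CD reports (label 1) and 981 control reports (label 0), as in the abstract.
labels = [1] * 981 + [0] * 981
train, dev, test = balanced_split(labels, sizes=(1568, 196, 198))
```

With these sizes, each class contributes 784 reports to training, 98 to development, and 99 to testing, which uses all 981 reports per class.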

Authors

  • Jiayi Dai
    College of Humanities and Social Science, North Carolina State University, NC 27695, USA.
  • Mi-Young Kim
    College of Natural and Applied Sciences, University of Alberta, Edmonton, AB, Canada.
  • Reed T Sutton
    Division of Gastroenterology, University of Alberta, 130 University Campus, Edmonton, AB, T6G 2X8, Canada.
  • J Ross Mitchell
    Mayo Clinic, Scottsdale, Dept. of Research.
  • Randolph Goebel
    Department of Computing Science, University of Alberta, Edmonton, AB, Canada.
  • Daniel C Baumgart
    Division of Gastroenterology, University of Alberta, 130 University Campus, Edmonton, AB, T6G 2X8, Canada. baumgart@ualberta.ca.
