Delirium Identification from Nursing Reports Using Large Language Models.

Journal: Studies in health technology and informatics
Published Date:

Abstract

This study investigates large language models for detecting delirium in nursing reports, comparing keyword matching, prompting, and fine-tuning. Using a manually labelled dataset from the University Hospital Freiburg, Germany, we tested Llama3 and Phi3 models. Both prompting and fine-tuning were effective; fine-tuned Phi3 (3.8B) achieved the highest accuracy (90.24%) and AUROC (96.07%), significantly outperforming the other methods.
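
As a rough illustration of the simplest approach compared above, the following sketch scores reports with a keyword-matching baseline and computes the two reported metrics, accuracy and AUROC. It is not the study's implementation: the keyword list, the example reports, their labels, and the decision threshold are all invented for illustration, and the abstract does not state which keywords or threshold the authors used.

```python
# Hypothetical keyword list (assumption; the study's actual keywords are not given).
KEYWORDS = ["delirium", "confused", "disoriented"]


def keyword_score(report: str) -> int:
    """Count how many keywords occur in a report (case-insensitive substring match)."""
    text = report.lower()
    return sum(1 for kw in KEYWORDS if kw in text)


def accuracy(labels, preds):
    """Fraction of predictions that equal the label."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)


def auroc(labels, scores):
    """AUROC as the probability that a random positive report scores higher
    than a random negative one, with ties counted as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy reports and labels (1 = delirium, 0 = no delirium), not study data.
reports = [
    "Patient confused and disoriented at night, pulled at IV line.",
    "Patient slept well, mobilised with assistance.",
    "Restless overnight, did not recognise staff.",
    "Family reports patient was confused about the visiting hours.",
]
labels = [1, 0, 1, 0]

scores = [keyword_score(r) for r in reports]
preds = [1 if s >= 1 else 0 for s in scores]  # threshold of 1 is an assumption

print("scores:  ", scores)
print("accuracy:", accuracy(labels, preds))
print("AUROC:   ", auroc(labels, scores))
```

The toy data is chosen so that the baseline misses a report that describes delirium without naming it and flags one that uses a keyword in an unrelated sense; these are the kinds of errors a prompted or fine-tuned model has a chance to avoid.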

Authors

  • Lisa Graf
    Neurorobotics Lab, Department of Computer Science - University of Freiburg, Germany.
  • Alexander Ritzi
    Center of Implementing Nursing Care Innovations Freiburg, Medical Center - University of Freiburg, Germany.
  • Lili M Schoeler
    Center of Implementing Nursing Care Innovations Freiburg, Medical Center - University of Freiburg, Germany.