Using Large Language Models to Automate Data Extraction From Surgical Pathology Reports: Retrospective Cohort Study.

Journal: JMIR formative research
PMID:

Abstract

BACKGROUND: Popularized by ChatGPT, large language models (LLMs) are poised to transform the scalability of downstream clinical natural language processing (NLP) tasks such as medical question answering (MQA) and automated data extraction from clinical narrative reports. However, the use of LLMs in the health care setting is limited by cost, computing power, and patient privacy concerns. Specifically, as interest in LLM-based clinical applications grows, regulatory safeguards must be established to prevent patient data from being exposed in the public domain. Deploying open-source LLMs behind institutional firewalls may help protect private patient data. In this study, we evaluated the extraction performance of a locally deployed LLM for automated MQA from surgical pathology reports.

Authors

  • Denise Lee
    Department of Surgery, Icahn School of Medicine at Mount Sinai, 10 Union Square East, Suite 2L, New York, NY, 10003, United States, 1 212 241 2891.
  • Akhil Vaid
    Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY 10029 USA.
  • Kartikeya M Menon
    Department of Surgery, Icahn School of Medicine at Mount Sinai, 10 Union Square East, Suite 2L, New York, NY, 10003, United States, 1 212 241 2891.
  • Robert Freeman
    Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, 1 Gustave L Levy Pl, New York, NY 10029, USA.
  • David S Matteson
    Department of Statistics and Data Science, Cornell University, Ithaca, NY, USA.
  • Michael L Marin
    Department of Surgery, Icahn School of Medicine at Mount Sinai, 10 Union Square East, Suite 2L, New York, NY, 10003, United States, 1 212 241 2891.
  • Girish N Nadkarni
    Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA.