Comprehensive testing of large language models for extraction of structured data in pathology.

Journal: Communications Medicine

Abstract

BACKGROUND: Pathology departments generate large volumes of unstructured data as free-text diagnostic reports. Converting these reports into structured formats for analytics or artificial intelligence projects requires substantial manual effort by specialized personnel. While recent studies show promise in using advanced language models for structuring pathology data, they primarily rely on proprietary models, raising cost and privacy concerns. Additionally, important aspects such as prompt engineering and model quantization for deployment on consumer-grade hardware remain unaddressed.

Authors

  • Bastian Grothey
    Institute of Pathology, University Hospital Cologne, Cologne, Germany. bastian.grothey@uk-koeln.de.
  • Jan Odenkirchen
    Medical Faculty, University of Cologne, Cologne, Germany.
  • Adnan Brkic
    Institute of Pathology, University Hospital Cologne, Cologne, Germany.
  • Birgid Schömig-Markiefka
    Institute of Pathology, University Hospital Cologne, Cologne, Germany.
  • Alexander Quaas
    Institute of Pathology, University Hospital Cologne, Cologne, Germany.
  • Reinhard Büttner
    Institute of Pathology, University Hospital Cologne, Cologne, Germany.
  • Yuri Tolkach
    Institute of Pathology, University Hospital Cologne, Cologne, Germany. yuri.tolkach@gmail.com.
