Comprehensive testing of large language models for extraction of structured data in pathology.

Journal: Communications Medicine

Published Date: Mar 31, 2025

Abstract

BACKGROUND: Pathology departments generate large volumes of unstructured data as free-text diagnostic reports. Converting these reports into structured formats for analytics or artificial intelligence projects requires substantial manual effort by specialized personnel. While recent studies show promise in using advanced language models for structuring pathology data, they primarily rely on proprietary models, raising cost and privacy concerns. Additionally, important aspects such as prompt engineering and model quantization for deployment on consumer-grade hardware remain unaddressed.

Authors

Bastian Grothey
Institute of Pathology, University Hospital Cologne, Cologne, Germany. bastian.grothey@uk-koeln.de

Jan Odenkirchen
Medical Faculty, University of Cologne, Cologne, Germany.

Adnan Brkic
Institute of Pathology, University Hospital Cologne, Cologne, Germany.

Birgid Schömig-Markiefka
Institute of Pathology, University Hospital Cologne, Cologne, Germany.

Alexander Quaas
Institute of Pathology, University Hospital Cologne, Cologne, Germany.

Reinhard Büttner
Institute of Pathology, University Hospital Cologne, Cologne, Germany.

Yuri Tolkach
Institute of Pathology, University Hospital Cologne, Cologne, Germany. yuri.tolkach@gmail.com

External Resources

PubMed ID: 40164789
