Highly accurate classification of chest radiographic reports using a deep learning natural language model pre-trained on 3.8 million text reports.

Journal: Bioinformatics (Oxford, England)

Published Date: Jan 29, 2021

Abstract

MOTIVATION: The development of deep, bidirectional transformers such as Bidirectional Encoder Representations from Transformers (BERT) has led to improved performance on several Natural Language Processing (NLP) benchmarks. In radiology in particular, large amounts of free-text data are generated in the daily clinical workflow. These report texts could be of particular use for generating labels for machine learning, especially for image classification. However, because report texts are mostly unstructured, advanced NLP methods are needed to classify them accurately. While neural networks can be used for this purpose, they must first be trained on large amounts of manually labelled data to achieve good results. In contrast, BERT models can be pre-trained on unlabelled data and then require fine-tuning on only a small amount of manually labelled data to achieve even better results.

Authors

Keno K Bressem

School of Medicine and Health, Institute for Cardiovascular Radiology and Nuclear Medicine, German Heart Center Munich, TUM University Hospital, Technical University of Munich, Munich, Germany.

Lisa C Adams

School of Medicine and Health, Department of Diagnostic and Interventional Radiology, Klinikum rechts der Isar, TUM University Hospital, Technical University of Munich, Munich, Germany.

Robert A Gaudin

Department of Oral- and Maxillofacial Surgery, Charité, Berlin 12203, Germany.

Daniel Tröltzsch

Department of Oral- and Maxillofacial Surgery, Charité, Berlin 12203, Germany.

Bernd Hamm

Department of Diagnostic and Interventional Radiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany.

Marcus R Makowski

School of Medicine and Health, Department of Diagnostic and Interventional Radiology, Klinikum rechts der Isar, TUM University Hospital, Technical University of Munich, Munich, Germany.

Chan-Yong Schüle

Department of Radiology, Charité, Berlin 12203, Germany.

Janis L Vahldiek

Charité Universitätsmedizin Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203, Berlin, Germany.

Stefan M Niehues

Charité Universitätsmedizin Berlin, Campus Benjamin Franklin, Hindenburgdamm 30, 12203, Berlin, Germany.

Keywords

Deep Learning; Humans; Information Storage and Retrieval; Machine Learning; Natural Language Processing; Neural Networks, Computer

External Resources

PubMed ID: 32702106
