Limitations of Transformers on Clinical Text Classification.

Journal: IEEE Journal of Biomedical and Health Informatics

Published Date: Sep 3, 2021

Abstract

Bidirectional Encoder Representations from Transformers (BERT) and BERT-based approaches are the current state of the art in many natural language processing (NLP) tasks; however, their application to document classification on long clinical texts is limited. In this work, we introduce four methods to scale BERT, which by default can only handle input sequences up to approximately 400 words long, to perform document classification on clinical texts several thousand words long. We compare these methods against two much simpler architectures - a word-level convolutional neural network and a hierarchical self-attention network - and show that BERT often cannot beat these simpler baselines when classifying MIMIC-III discharge summaries and SEER cancer pathology reports. In our analysis, we show that two key components of BERT - pretraining and WordPiece tokenization - may actually be inhibiting BERT's performance on clinical text classification tasks where the input document is several thousand words long and where correctly identifying labels may depend more on identifying a few key words or phrases than on understanding the contextual meaning of sequences of text.
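The abstract does not describe the four scaling methods in detail, so the following is only a generic illustration of the underlying length problem, not the authors' approach. It is a minimal sketch, assuming a standard BERT limit of 512 WordPiece tokens (510 after reserving the two special tokens), that splits a long token sequence into overlapping windows and averages hypothetical per-window class scores into one document-level prediction. The function names, the stride value, and the averaging rule are all illustrative assumptions.

```python
from typing import List, Sequence


def split_into_chunks(tokens: Sequence[str],
                      max_len: int = 510,
                      stride: int = 255) -> List[List[str]]:
    """Split a token sequence into overlapping windows of at most max_len tokens.

    max_len=510 assumes a 512-token model limit minus [CLS] and [SEP];
    stride=255 is an arbitrary illustrative overlap, not a value from the paper.
    """
    if max_len <= 0 or stride <= 0:
        raise ValueError("max_len and stride must be positive")
    chunks = []
    start = 0
    while True:
        chunks.append(list(tokens[start:start + max_len]))
        # Stop once the current window reaches the end of the document.
        if start + max_len >= len(tokens):
            break
        start += stride
    return chunks


def average_scores(chunk_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-chunk class scores into one document-level score vector.

    Averaging is one simple aggregation choice (an assumption here);
    max-pooling or a learned aggregator are common alternatives.
    """
    n = len(chunk_scores)
    if n == 0:
        raise ValueError("need at least one chunk")
    return [sum(column) / n for column in zip(*chunk_scores)]


if __name__ == "__main__":
    # A stand-in for a tokenized clinical document of 1,200 tokens.
    document = [f"tok{i}" for i in range(1200)]
    windows = split_into_chunks(document)
    print(len(windows), [len(w) for w in windows])
```

In practice each window would be passed to the classifier separately and the resulting score vectors handed to `average_scores`. The abstract's finding that labels often hinge on a few key words or phrases suggests that the choice of aggregation rule could matter considerably on documents of this length.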

Authors

Shang Gao

Department of Orthopedics, Orthopedic Center of Chinese PLA, Southwest Hospital, Third Military Medical University, Chongqing, 400038, P.R.China.
Mohammed Alawad

Computational Sciences and Engineering Division, Health Data Sciences Institute, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA.
M Todd Young
John Gounley

Advanced Computing for Health Sciences, Oak Ridge National Laboratory, Oak Ridge, TN 37830, United States.
Noah Schaefferkoetter

Oak Ridge National Lab, Oak Ridge, TN, USA.
Hong Jun Yoon

Computational Sciences and Engineering Division, Health Data Sciences Institute, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA.
Xiao-Cheng Wu

Department of Epidemiology, Louisiana State University New Orleans School of Public Health, New Orleans, LA 70112, United States.
Eric B Durbin

University of Kentucky, Lexington, KY.
Jennifer Doherty

Utah Cancer Registry, University of Utah School of Medicine, Salt Lake City, UT 84132, United States of America. Electronic address: Jen.Doherty@hci.utah.edu.
Antoinette Stroup

New Jersey State Cancer Registry, Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, 08901, United States of America. Electronic address: nan.stroup@rutgers.edu.
Linda Coyle

Information Management Services Inc, Calverton, Maryland, USA.
Georgia Tourassi

Computational Sciences and Engineering Division, Health Data Sciences Institute, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA.

Keywords

Humans; Natural Language Processing; Neural Networks, Computer

External Resources

PubMed ID: 33635801
