Multiple features for clinical relation extraction: A machine learning approach.

Journal: Journal of Biomedical Informatics

Published Date: Feb 3, 2020

Abstract

Relation extraction aims to discover relational facts about entity mentions in plain text. In this work, we focus on clinical relation extraction: given a medical record with mentions of drugs and their attributes, we identify relations between these entities. We propose a machine learning model with a novel set of knowledge-based and BioSentVec embedding features. We systematically investigate the impact of these features in combination with standard distance- and word-based features, conducting experiments on two benchmark datasets of clinical texts from the MADE 2018 and n2c2 2018 shared tasks. For comparison with the feature-based model, we evaluate state-of-the-art models and three BERT-based models, including BioBERT and Clinical BERT. Our results demonstrate that distance and word features provide significant benefits to the classifier. Knowledge-based features improve classification results only for particular types of relations. Among the explored features, the sentence embedding feature provides the largest improvement on the MADE corpus. The classifier obtains state-of-the-art performance in clinical relation extraction with an F-measure of 92.6%, improving the F-measure by 3.5% on the MADE corpus.
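To make the idea of distance- and word-based features concrete, the following is a minimal sketch of how such features might be computed for one candidate pair of entities. It is not the feature set used in the paper: the `Entity` class, the `pair_features` function, the feature names, and the example sentence are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical entity mention, located by token offsets."""
    text: str
    label: str  # e.g. "Drug", "Dose", "Frequency"
    start: int  # index of the first token
    end: int    # index one past the last token


def pair_features(tokens, e1, e2):
    """Illustrative distance- and word-based features for one entity pair."""
    # Order the two mentions by their position in the text.
    first, second = (e1, e2) if e1.start <= e2.start else (e2, e1)
    between = tokens[first.end:second.start]
    feats = {
        # Distance feature: number of tokens separating the mentions.
        "token_distance": max(0, second.start - first.end),
        # Entity type pair, in the order the arguments were given.
        "type_pair": f"{e1.label}-{e2.label}",
        # Whether the mention that comes first in the text is a drug.
        "drug_first": int(first.label == "Drug"),
    }
    # Word features: bag of lowercased tokens between the mentions.
    for tok in between:
        feats[f"between={tok.lower()}"] = 1
    return feats


# Made-up example sentence, not taken from MADE or n2c2.
tokens = "Patient was started on lisinopril 10 mg daily".split()
drug = Entity("lisinopril", "Drug", 4, 5)
dose = Entity("10 mg", "Dose", 5, 7)
freq = Entity("daily", "Frequency", 7, 8)

features = pair_features(tokens, drug, freq)
```

A feature dictionary of this kind could be vectorized and passed to any standard classifier. Knowledge-based and sentence embedding features, as described in the abstract, would be appended as additional entries or dense vectors.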

Authors

Ilseyar Alimova

Kazan Federal University, 18 Kremlyovskaya Street, Kazan 420008, Russian Federation. Electronic address: alimovailseyar@gmail.com.

Elena Tutubalina

Kazan (Volga Region) Federal University, Kazan, Russia.

Keywords

Knowledge Bases; Language; Machine Learning; Natural Language Processing

External Resources

PubMed ID: 32028051
