CLIN-X: pre-trained language models and a study on cross-task transfer for concept extraction in the clinical domain.
Journal:
Bioinformatics (Oxford, England)
Published Date:
Jun 13, 2022
Abstract
MOTIVATION: The field of natural language processing (NLP) has recently seen a large shift toward using pre-trained language models for solving almost any task. Despite showing great improvements on benchmark datasets for various tasks, these models often perform sub-optimally in non-standard domains such as the clinical domain, where a large gap between pre-training documents and target documents is observed. In this article, we aim to close this gap with domain-specific training of the language model, and we investigate its effect on a diverse set of downstream tasks and settings.