Improving broad-coverage medical entity linking with semantic type prediction and large-scale datasets.
Journal:
Journal of biomedical informatics
Published Date:
Aug 12, 2021
Abstract
OBJECTIVES: Biomedical natural language processing tools are increasingly being applied for broad-coverage information extraction-extracting medical information of all types in a scientific document or a clinical note. In such broad-coverage settings, linking mentions of medical concepts to standardized vocabularies requires choosing the best candidate concepts from large inventories covering dozens of types. This study presents a novel semantic type prediction module for biomedical NLP pipelines and two automatically-constructed, large-scale datasets with broad coverage of semantic types.