Boosting ICD multi-label classification of health records with contextual embeddings and label-granularity.
Journal:
Computer methods and programs in biomedicine
Published Date:
Dec 10, 2019
Abstract
BACKGROUND AND OBJECTIVE: This work deals with clinical text mining, a field of Natural Language Processing applied to biomedical informatics. The aim is to classify Electronic Health Records with respect to the International Classification of Diseases, which is the foundation for the identification of international health statistics, and the standard for reporting diseases and health conditions. Within the framework of data mining, the goal is the multi-label classification, as each health record has assigned multiple International Classification of Diseases codes. We investigate five Deep Learning architectures with a dataset obtained from the Basque Country Health System, and six different perspectives derived from shifts in the input and the output.