A disease inference method based on symptom extraction and bidirectional Long Short Term Memory networks.

Journal: Methods (San Diego, Calif.)

Published Date: Jul 10, 2019

Abstract

Automatic disease inference is widely applied across medical fields and improves the efficiency of medical treatment. Many efforts have been made to predict patients' future health conditions from their full clinical texts, clinical measurements, or medical codes. Symptoms reflect the onset of diseases and can provide credible information for disease diagnosis. In this study, we propose a new disease inference method that extracts symptoms and integrates two symptom representation approaches. Electronic Medical Records (EMR) form a comprehensive clinical knowledge database containing massive amounts of data about diseases, symptoms, and their relationships. To reduce the uncertainty and irregularity of symptom descriptions in EMR, we extract symptoms with MetaMap, an existing natural language processing tool designed for biomedical texts. To take advantage of the complex relationship between symptoms and diseases and so enhance the accuracy of disease inference, we present two symptom representation models: a term frequency-inverse document frequency (TF-IDF) model, which represents the relationship between symptoms and diseases, and Word2Vec, which expresses the semantic relationship between symptoms. Based on these two symptom representations, we employ bidirectional Long Short Term Memory networks (BiLSTMs) to model symptom sequences in EMR. Our proposed model shows a significant improvement in terms of AUC (0.895) and F1 (0.572) for 50 diseases in the MIMIC-III dataset. The results illustrate that the model combining the two symptom representations performs better than a model using only one of them.
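The TF-IDF representation mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the authors' exact formulation, so this example assumes the textbook variant: each disease is treated as a "document" made of its extracted symptoms, term frequency is the symptom's share of that disease's symptom mentions, and inverse document frequency is log(N / df). The function name `symptom_tfidf` and the toy data are hypothetical, not taken from the paper or from MIMIC-III.

```python
import math
from collections import Counter


def symptom_tfidf(disease_to_symptoms):
    """Return {disease: {symptom: TF-IDF weight}} (textbook variant, assumed)."""
    n_diseases = len(disease_to_symptoms)

    # Document frequency: the number of diseases in which each symptom occurs.
    doc_freq = Counter()
    for symptoms in disease_to_symptoms.values():
        doc_freq.update(set(symptoms))

    weights = {}
    for disease, symptoms in disease_to_symptoms.items():
        counts = Counter(symptoms)
        total = sum(counts.values())
        # TF = share of this disease's mentions; IDF = log(N / df).
        weights[disease] = {
            s: (c / total) * math.log(n_diseases / doc_freq[s])
            for s, c in counts.items()
        }
    return weights


# Hypothetical toy data for illustration only.
toy = {
    "pneumonia": ["cough", "fever", "cough", "dyspnea"],
    "influenza": ["fever", "cough", "myalgia"],
    "heart failure": ["dyspnea", "edema"],
}
weights = symptom_tfidf(toy)
```

Under this weighting, a symptom that appears in only one disease (such as "myalgia" in the toy data) receives a higher weight than one shared across diseases, and a symptom present in every disease receives a weight of zero. Such weights could then be combined with Word2Vec embeddings before being fed to a sequence model, though how the paper combines them is not specified in the abstract.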

Authors

Donglin Guo

School of Computer Science and Engineering, Central South University, Changsha, China.
Guihua Duan

School of Computer Science and Engineering, Central South University, Changsha, China.
Ying Yu

School of Chemistry and Environment, Guangzhou Key Laboratory of Analytical Chemistry for Biomedicine, South China Normal University, Guangzhou 510006, PR China. Electronic address: yuyhs@scnu.edu.cn.
Yaohang Li
Fang-Xiang Wu
Min Li

Hubei Provincial Institute for Food Supervision and Test, Hubei Provincial Engineering and Technology Research Center for Food Quality and Safety Test, Wuhan 430075, China.

Keywords

Algorithms; Electronic Health Records; Humans; Natural Language Processing; Neural Networks, Computer; Semantics

External Resources

PubMed ID (PMID): 31301375
