A disease inference method based on symptom extraction and bidirectional Long Short Term Memory networks.

Journal: Methods (San Diego, Calif.)
Published Date:

Abstract

The wide application of automatic disease inference in many medical fields improves the efficiency of medical treatment. Many efforts have been made to predict patients' future health conditions from their full clinical texts, clinical measurements, or medical codes. Symptoms reflect the onset of diseases and can provide credible information for disease diagnosis. In this study, we propose a new disease inference method that extracts symptoms and integrates two symptom representation approaches. To reduce the uncertainty and irregularity of symptom descriptions in Electronic Medical Records (EMR), a comprehensive clinical knowledge database containing a massive amount of data about diseases, symptoms, and their relationships, we extract symptoms with the existing natural language processing tool MetaMap, which is designed for biomedical texts. To take advantage of the complex relationship between symptoms and diseases and so enhance the accuracy of disease inference, we present two symptom representation models: a term frequency-inverse document frequency (TF-IDF) model that represents the relationship between symptoms and diseases, and Word2Vec, which expresses the semantic relationship between symptoms. Based on these two symptom representations, we employ bidirectional Long Short Term Memory networks (BiLSTMs) to model symptom sequences in EMR. Our proposed model shows a significant improvement in terms of AUC (0.895) and F1 (0.572) for 50 diseases in the MIMIC-III dataset. The results show that the model combining the two symptom representations performs better than a model using only one of them.
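
The TF-IDF symptom representation mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact weighting formula, so this assumes the textbook form tf × log(N / df), with each disease treated as a "document" whose terms are the symptoms extracted from records carrying that diagnosis. The function name symptom_tfidf and the toy data are hypothetical and are not taken from the paper or from MIMIC-III.

```python
import math
from collections import Counter


def symptom_tfidf(disease_to_symptoms):
    """Return {disease: {symptom: TF-IDF weight}}.

    disease_to_symptoms maps a disease label to the list of symptom
    mentions pooled from all records with that diagnosis.
    Assumed weighting: tf * log(N / df), the textbook TF-IDF form.
    """
    n_diseases = len(disease_to_symptoms)

    # Document frequency: number of diseases in which each symptom appears.
    df = Counter()
    for symptoms in disease_to_symptoms.values():
        df.update(set(symptoms))

    weights = {}
    for disease, symptoms in disease_to_symptoms.items():
        counts = Counter(symptoms)
        total = len(symptoms)
        weights[disease] = {
            s: (c / total) * math.log(n_diseases / df[s])
            for s, c in counts.items()
        }
    return weights


# Toy data invented for illustration only.
toy = {
    "pneumonia": ["cough", "fever", "cough", "dyspnea"],
    "influenza": ["fever", "cough", "myalgia"],
    "heart failure": ["dyspnea", "edema", "edema"],
}
tfidf = symptom_tfidf(toy)
```

In this toy example, "edema" occurs only under "heart failure" and therefore receives a higher weight there than "dyspnea", which is shared with "pneumonia". Weights of this kind capture how specific a symptom is to a disease, which is the role the abstract assigns to TF-IDF alongside the Word2Vec embeddings.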

Authors

  • Donglin Guo
    School of Computer Science and Engineering, Central South University, Changsha, China.
  • Guihua Duan
    School of Computer Science and Engineering, Central South University, Changsha, China.
  • Ying Yu
    School of Chemistry and Environment, Guangzhou Key Laboratory of Analytical Chemistry for Biomedicine, South China Normal University, Guangzhou 510006, PR China. Electronic address: yuyhs@scnu.edu.cn.
  • Yaohang Li
  • Fang-Xiang Wu
  • Min Li
    Hubei Provincial Institute for Food Supervision and Test, Hubei Provincial Engineering and Technology Research Center for Food Quality and Safety Test, Wuhan 430075, China.