Entity-enhanced BERT for medical specialty prediction based on clinical questionnaire data.

Journal: PloS one
PMID:

Abstract

A medical specialty prediction system for remote diagnosis can reduce the unexpected costs incurred by first-visit patients who visit the wrong hospital department for their symptoms. To develop such systems, several researchers have explored clinical predictive models using real medical text data. Medical text data include large amounts of information regarding patients, which increases the sequence length. Hence, a few studies have attempted to extract entities from the text as concise features and to provide domain-specific knowledge for clinical text classification. However, existing approaches still do not inject these entities into the model effectively. Thus, we propose Entity-enhanced BERT (E-BERT), which utilizes the structural attributes of BERT for medical specialty prediction. E-BERT has an entity embedding layer and entity-aware attention to inject domain-specific knowledge and focus on relationships between medical-related entities within the sequences. Experimental results on clinical questionnaire data demonstrate the superiority of E-BERT over the other benchmark models, regardless of the input sequence length. Moreover, the visualization results for the effects of entity-aware attention show that E-BERT effectively incorporates domain-specific knowledge and other information, enabling the capture of contextual information in the text. Finally, the robustness and applicability of the proposed method are explored by applying it to other pre-trained language models. This effective medical specialty predictive model can provide practical information to first-visit patients, streamlining the diagnostic process and improving the quality of medical consultations.
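
The abstract names two components, an entity embedding layer and entity-aware attention, but gives no equations for either. The sketch below is one plausible reading, not the authors' implementation: it assumes the entity embedding is added to each token embedding according to a per-token entity flag, and that entity-aware attention adds a fixed bias to the attention score of entity-entity token pairs. The function names, the `bias` parameter, and the use of the input as query, key, and value are all illustrative assumptions.

```python
import math


def add_entity_embeddings(token_vecs, entity_flags, entity_table):
    """Add an entity-type embedding to every token embedding.

    entity_flags[i] is a hypothetical per-token label (0 = not an entity,
    1 = medical entity); entity_table maps each label to a vector.
    """
    return [
        [t + e for t, e in zip(vec, entity_table[flag])]
        for vec, flag in zip(token_vecs, entity_flags)
    ]


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    top = max(row)
    exps = [math.exp(s - top) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def entity_aware_attention(x, entity_flags, bias=1.0):
    """Single-head self-attention with an additive entity-entity bias.

    Assumption: the score between two tokens is raised by `bias` when both
    are flagged as entities. For brevity, x serves as query, key, and value
    (no learned projections).
    """
    dim = len(x[0])
    weights = []
    for i, query in enumerate(x):
        scores = []
        for j, key in enumerate(x):
            score = sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            if entity_flags[i] and entity_flags[j]:
                score += bias  # favour entity-to-entity relationships
            scores.append(score)
        weights.append(softmax(scores))
    # Weighted sum of value vectors for each query position.
    output = [
        [sum(w[j] * x[j][c] for j in range(len(x))) for c in range(dim)]
        for w in weights
    ]
    return output, weights


# Toy input: "I have chest pain", with "chest" and "pain" flagged as entities.
flags = [0, 0, 1, 1]
tokens = [[0.1, 0.2], [0.0, -0.3], [0.5, 0.4], [-0.2, 0.6]]
table = {0: [0.0, 0.0], 1: [0.3, -0.1]}

hidden = add_entity_embeddings(tokens, flags, table)
context, attn = entity_aware_attention(hidden, flags, bias=1.0)
```

With a positive `bias`, each entity token places more attention weight on the other entity tokens than it would with `bias=0.0`, while rows for non-entity tokens are unchanged. That is the qualitative behavior the abstract attributes to entity-aware attention.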

Authors

  • Soyeon Lee
    School of Industrial and Management Engineering, Korea University, Seongbuk-gu, Seoul, Republic of Korea.
  • Ye Ji Han
    School of Industrial and Management Engineering, Korea University, Seongbuk-gu, Seoul, Republic of Korea.
  • Hyun Joon Park
    School of Industrial and Management Engineering, Korea University, Seongbuk-gu, Seoul, Republic of Korea.
  • Byung Hoon Lee
    School of Industrial and Management Engineering, Korea University, Seongbuk-gu, Seoul, Republic of Korea.
  • DaHee Son
    People's Health Co., Ltd., 403, BT-IT Convergence Center, Seongbuk-gu, Seoul, Republic of Korea.
  • Soyeon Kim
    Department of Pediatrics, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States.
  • HyeonJong Yang
    People's Health Co., Ltd., 403, BT-IT Convergence Center, Seongbuk-gu, Seoul, Republic of Korea.
  • TaeJun Han
    People's Health Co., Ltd., 403, BT-IT Convergence Center, Seongbuk-gu, Seoul, Republic of Korea.
  • Eunsun Kim
    Department of Data Science, Sejong University, Seoul, Korea.
  • Sung Won Han
    Department of Industrial and Management Engineering, Korea University, Seoul 02841, the Republic of Korea.