An Empirical Method of Automatic Pattern Extraction for Clinical Text Classification.

Journal: Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference

Abstract

Clinical text classification is an essential and extensively studied problem in medical text processing. Existing research primarily employs machine learning and pattern-based approaches to address it. In general, pattern-based approaches perform better than other methods. However, they commonly require human intervention for pattern identification, which diminishes their benefits and restricts their applicability. In this study, we present a novel pattern extraction algorithm that automatically identifies and extracts patterns from clinical textual resources. The algorithm identifies the candidate concepts in the clinical text, finds the context of each concept by discovering its context window, and finally transforms each context window into a pattern. We evaluate the proposed algorithm on Hypertension, Rhinosinusitis, and Asthma guidelines. Seventy percent of the Hypertension guideline was used for pattern extraction, while the remaining 30% and the other two guidelines were used for evaluation. The algorithm extracts 21 patterns that classify the sentences of the Hypertension, Rhinosinusitis, and Asthma guidelines into recommendation and non-recommendation sentences with 84.53%, 80.03%, and 84.62% accuracy, respectively. These initial results demonstrate the benefits and applicability of the algorithm for clinical text classification.
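
The three steps named in the abstract (candidate concepts, context windows, patterns) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how concepts are identified, how wide a context window is, or what form a pattern takes. The cue-term set `CANDIDATE_CONCEPTS`, the fixed window `size`, the regular-expression pattern form, and all function names below are assumptions made purely for illustration.

```python
import re

# Hypothetical cue terms standing in for "candidate concepts".
# The abstract does not specify how concepts are actually identified.
CANDIDATE_CONCEPTS = {"recommended", "should", "consider"}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def context_windows(tokens, size=1):
    """Yield (left, concept, right) around each candidate concept.

    A fixed window of `size` tokens per side is an assumption.
    """
    for i, tok in enumerate(tokens):
        if tok in CANDIDATE_CONCEPTS:
            yield tokens[max(0, i - size):i], tok, tokens[i + 1:i + 1 + size]


def window_to_pattern(left, concept, right):
    """Turn a window into a regex: keep the concept literal and
    generalize each neighbouring token to a word wildcard (assumed form)."""
    parts = [r"\w+"] * len(left) + [re.escape(concept)] + [r"\w+"] * len(right)
    return r"\b" + r"\W+".join(parts) + r"\b"


def extract_patterns(sentences, size=1):
    """Collect the distinct patterns found in the training sentences."""
    patterns = set()
    for sentence in sentences:
        for left, concept, right in context_windows(tokenize(sentence), size):
            patterns.add(window_to_pattern(left, concept, right))
    return sorted(patterns)


def classify(sentence, patterns):
    """Label a sentence as a recommendation if any pattern matches it."""
    text = sentence.lower()
    if any(re.search(p, text) for p in patterns):
        return "recommendation"
    return "non-recommendation"


# Invented example sentences, not taken from the evaluated guidelines.
training = [
    "Beta blockers are recommended for patients with hypertension.",
    "Clinicians should measure blood pressure at every visit.",
]
patterns = extract_patterns(training)
print(patterns)
print(classify("Antibiotics are recommended for acute bacterial rhinosinusitis.", patterns))
print(classify("Asthma is a chronic inflammatory disease.", patterns))
```

In this sketch, patterns learned from one guideline are applied unchanged to sentences from another, which mirrors the evaluation setup described above in spirit only; the reported accuracies come from the authors' algorithm, not from this code.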

Authors

  • Musarrat Hussain
    Department of Computer Science and Engineering, Kyung Hee University, Yongin, Korea.
  • Jamil Hussain
    Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea. jamil@oslab.khu.ac.kr.
  • Taqdir Ali
    Department of Computer Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu Yongin-si, Gyeonggi-do 446-701, Korea. taqdir.ali@oslab.khu.ac.kr.
  • Sungyoung Lee
    Department of Computer Science and Engineering, Kyung Hee University, Yongin, Korea.