Clinical Application of Detecting COVID-19 Risks: A Natural Language Processing Approach.

Journal: Viruses
Published Date:

Abstract

Detecting COVID-19 risk factors for clinical application is a challenging task. Existing named entity recognition models are usually trained on a limited set of named entities. Besides clinical factors, non-clinical factors such as social determinants of health (SDoH) are also important for studying infectious disease. In this paper, we propose a generalizable machine learning approach that improves on previous efforts by recognizing a large number of clinical risk factors and SDoH. The novelty of the proposed method lies in its combination of several deep neural network components, including the BiLSTM-CNN-CRF method and a transformer-based embedding layer. Experimental results on a cohort of COVID-19 data prepared from PubMed articles show that the proposed approach outperforms the other methods compared, with a performance gain of about 1-5% in macro- and micro-average F1 scores. Clinical practitioners and researchers can use this approach to obtain accurate information regarding clinical risks and SDoH factors, and use this pipeline as a tool to help end the pandemic or to prepare for future pandemics.
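
The abstract reports gains in both macro- and micro-average F1. As a minimal sketch of how the two averages differ, the example below computes both from per-entity-type counts of true positives, false positives, and false negatives. The entity labels and counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the function names are assumptions rather than part of the authors' pipeline.

```python
def f1_from_counts(tp, fp, fn):
    """F1 score from true positive, false positive, and false negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(counts):
    """counts maps each entity type to a (tp, fp, fn) tuple."""
    # Macro average: compute F1 per entity type, then take the unweighted mean.
    per_type = {label: f1_from_counts(*c) for label, c in counts.items()}
    macro = sum(per_type.values()) / len(per_type)

    # Micro average: pool the counts over all entity types, then compute one F1.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1_from_counts(tp, fp, fn)
    return macro, micro


# Hypothetical counts: one frequent clinical entity type, one rare SDoH type.
counts = {
    "SYMPTOM": (90, 10, 10),
    "EMPLOYMENT": (5, 5, 15),
}
macro, micro = macro_micro_f1(counts)
print(f"macro-F1 = {macro:.3f}, micro-F1 = {micro:.3f}")
```

Because the macro average weights every entity type equally, a rare type that is recognized poorly pulls it down more than it pulls down the micro average, which is dominated by frequent types. Reporting both gives a fuller picture when entity types are imbalanced.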

Authors

  • Syed Raza Bashir
    Department of Computer Science, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada.
  • Shaina Raza
    Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada.
  • Veysel Kocaman
    Data Science, John Snow Labs Inc., Lewes, DE 19958, USA.
  • Urooj Qamar
    Institute of Business & Information Technology, University of the Punjab, Lahore 54590, Pakistan.