Word2Vec inversion and traditional text classifiers for phenotyping lupus.

Journal: BMC medical informatics and decision making
Published Date:

Abstract

BACKGROUND: Identifying patients who meet specific clinical criteria through manual chart review of doctors' notes is a daunting task, given the sheer volume of text notes in electronic health records (EHRs). The task can be automated with text classifiers that combine Natural Language Processing (NLP) techniques with pattern-recognition machine learning (ML) algorithms. This research evaluates how well traditional classifiers identify patients with Systemic Lupus Erythematosus (SLE) and compares them with a newer Bayesian word vector method.

Authors

  • Clayton A Turner
    Department of Computer Science, College of Charleston, 66 George Street, Charleston, SC 29424, USA. caturner3@g.cofc.edu.
  • Alexander D Jacobs
    Department of Computer Science, College of Charleston, 66 George Street, Charleston, SC 29424, USA.
  • Cassios K Marques
    Department of Computer Science, College of Charleston, 66 George Street, Charleston, SC 29424, USA.
  • James C Oates
    Department of Public Health Sciences, Medical University of South Carolina, 135 Cannon Street, Charleston, SC 29425, USA.
  • Diane L Kamen
    Department of Medicine, Medical University of South Carolina, Charleston, SC. Electronic address: ruizda@musc.edu.
  • Paul E Anderson
    Department of Computer Science, College of Charleston, Charleston, SC 29424, USA.
  • Jihad S Obeid
    Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC 29425, USA.