Comparison of text processing methods in social media-based signal detection.

Journal: Pharmacoepidemiology and Drug Safety
Published Date:

Abstract

PURPOSE: Adverse event (AE) identification in social media (SM) can be performed using various types of natural language processing (NLP) and machine learning (ML). These methods can be categorized by their level of complexity and precision. Co-occurrence-based ML methods are rather basic: they identify the simultaneous appearance of a drug and a clinical event in a single post. In contrast, statistical learning methods involve more complex NLP and identify drugs, events, and the associations between them. We aimed to compare the ability of co-occurrence and NLP to identify AEs and signals of disproportionate reporting (SDR) in patient-generated SM. We also examined the performance of lift as a disproportionality measure in SM-based signal detection (SD).
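The co-occurrence approach and the lift measure mentioned above can be illustrated with a minimal sketch. It assumes the standard definition of lift, the observed proportion of posts mentioning both a drug and an event divided by the proportion expected if the two were independent. The function name `cooccurrence_lift`, the whitespace tokenization, and the toy posts and term lists are all hypothetical; the abstract does not describe the authors' actual pipeline or any signal threshold.

```python
from collections import Counter
from itertools import product


def cooccurrence_lift(posts, drugs, events):
    """Return lift for every drug-event pair that co-occurs in at least one post.

    lift(d, e) = P(d, e) / (P(d) * P(e)) = n_de * N / (n_d * n_e),
    where counts are numbers of posts (hypothetical helper, not the paper's code).
    """
    drugs, events = set(drugs), set(events)
    n_posts = len(posts)
    drug_counts, event_counts, pair_counts = Counter(), Counter(), Counter()

    for post in posts:
        # Naive tokenization (assumption): lowercase, split on whitespace.
        tokens = set(post.lower().split())
        found_drugs = tokens & drugs
        found_events = tokens & events
        drug_counts.update(found_drugs)
        event_counts.update(found_events)
        # Co-occurrence: every drug and event named in the same post.
        pair_counts.update(product(found_drugs, found_events))

    return {
        (d, e): n_de * n_posts / (drug_counts[d] * event_counts[e])
        for (d, e), n_de in pair_counts.items()
    }


# Toy data, invented for illustration only.
posts = [
    "took druga and got nausea",
    "druga gave me nausea again",
    "drugb no problems",
    "nausea after lunch",
]
result = cooccurrence_lift(posts, drugs={"druga", "drugb"}, events={"nausea"})
```

A lift above 1 means the pair appears together more often than independence would predict. Note that this sketch counts any joint mention as a pair, which is exactly the limitation the abstract attributes to co-occurrence: it cannot tell whether the post actually links the event to the drug.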

Authors

  • Natalie Gavrielov-Yusim
    R&D, Data2Life, Tel Aviv, Israel.
  • Marie-Laure Kürzinger
    Epidemiology and Benefit Risk Evaluation, Sanofi, Chilly-Mazarin, France.
  • Chihiro Nishikawa
    Epidemiology and Benefit Risk Evaluation, Sanofi, Chilly-Mazarin, France.
  • Chunshen Pan
    Epidemiology and Benefit Risk Evaluation, Sanofi, Bridgewater, NJ, USA.
  • Julie Pouget
    Information Technology and Solutions, R&D CMO - SC Real World Evidence, Sanofi, Lyon, France.
  • Limor Bh Epstein
    R&D, Data2Life, Tel Aviv, Israel.
  • Yan Golant
    R&D, Data2Life, Tel Aviv, Israel.
  • Stephanie Tcherny-Lessenot
    Epidemiology and Benefit Risk Evaluation, Sanofi, Chilly-Mazarin, France.
  • Stephen Lin
  • Bernard Hamelin
    Medical Evidence Generation, Sanofi, Paris, France.
  • Juhaeri Juhaeri
    Epidemiology and Benefit Risk Evaluation, Sanofi, Bridgewater, NJ, USA.