Classification-by-Analogy: Using Vector Representations of Implicit Relationships to Identify Plausibly Causal Drug/Side-effect Relationships.

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
Published Date:

Abstract

An important aspect of post-marketing drug surveillance involves identifying potential side-effects using adverse drug event (ADE) reporting systems and/or Electronic Health Records. These data are noisy, so identified drug/ADE associations must be reviewed manually, a labor-intensive process that scales poorly with the large number of possibly dangerous associations and the rapid growth of the biomedical literature. Recent work has employed Literature Based Discovery methods that exploit implicit relationships between biomedical entities within the literature to estimate the plausibility of drug/ADE connections. We extend this work by evaluating machine learning classifiers applied to high-dimensional vector representations of relationships extracted from the literature as a means to identify substantiated drug/ADE connections. Using a curated reference standard, we show that applying classifiers to such representations improves performance (approximately +37% AUC) over previous approaches. These trained systems reproduce outcomes of the manual literature review process used to create the reference standard, but further research is required to establish their generalizability.
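
The AUC figure cited above measures how well a scoring method ranks substantiated drug/ADE pairs above unsubstantiated ones. As a minimal illustration (not the authors' code), the sketch below computes AUC from classifier scores using the pairwise (Mann-Whitney) formulation. The function name `auc` and the toy scores and labels are hypothetical and chosen only for this example.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example receives
    the higher score; tied pairs count as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: classifier scores for six drug/ADE pairs,
# where 1 = substantiated in a reference standard and 0 = not.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of substantiated from unsubstantiated pairs, which is the scale on which an improvement over previous approaches would be read.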

Authors

  • Justin Mower
    Baylor College of Medicine, Houston, Texas; University of Texas Health Science Center at Houston, Houston, Texas.
  • Devika Subramanian
    Rice University, Houston, Texas.
  • Ning Shang
    Columbia University, New York, New York.
  • Trevor Cohen
    University of Washington, Seattle, WA.