Multiple sampling schemes and deep learning improve active learning performance in drug-drug interaction information retrieval analysis from the literature.

Journal: Journal of biomedical semantics
Published Date:

Abstract

BACKGROUND: Drug-drug interaction (DDI) information retrieval (IR) from the PubMed literature is an important natural language processing (NLP) task. This paper studies active learning (AL) in DDI IR analysis for the first time. DDI IR analysis of PubMed abstracts faces the challenge of a relatively small number of positive DDI samples among an overwhelmingly large number of negative samples. Random negative sampling and positive sampling are designed to improve the efficiency of AL analysis, and the paper demonstrates the consistency of random negative sampling and positive sampling.
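
As a rough illustration of how the two sampling schemes named in the abstract might be combined in one AL round, the Python sketch below selects a batch of unlabeled abstracts for annotation. It is not the authors' implementation. The function name `select_batch`, its parameters, and the specific forms of the two schemes (taking the top-scored items as "positive sampling" and drawing uniformly from the rest as "random negative sampling") are assumptions made for illustration only.

```python
import random


def select_batch(scores, n_positive, n_random, seed=0):
    """Pick unlabeled items to send for annotation in one AL round.

    scores: dict mapping item id -> predicted probability of being DDI-positive
            (hypothetical output of any classifier).
    """
    rng = random.Random(seed)

    # Positive sampling (assumed form): take the items the current model
    # ranks as most likely to be positive.
    ranked = sorted(scores, key=scores.get, reverse=True)
    positives = ranked[:n_positive]

    # Random negative sampling (assumed form): draw uniformly from the
    # remaining pool, which is mostly negative when positives are rare.
    remainder = ranked[n_positive:]
    randoms = rng.sample(remainder, min(n_random, len(remainder)))

    return positives + randoms


# Toy pool of five unlabeled abstracts with made-up scores.
scores = {"a": 0.91, "b": 0.05, "c": 0.40, "d": 0.02, "e": 0.77}
batch = select_batch(scores, n_positive=2, n_random=2)
```

In this toy pool the first two selected items are the highest-scored ones, and the other two are drawn at random from the rest. In a full AL loop the annotated batch would be added to the training set and the classifier retrained before the next round.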

Authors

  • Weixin Xie
    Institute of Intelligent System and Bioinformatics, College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China.
  • Kunjie Fan
    Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43202, USA.
  • Shijun Zhang
    Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, 43210, USA.
  • Lang Li
    Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, 43210, USA.