Functional evaluation of out-of-the-box text-mining tools for data-mining tasks.

Journal: Journal of the American Medical Informatics Association : JAMIA
Published Date:

Abstract

OBJECTIVE: The trade-off between the speed and simplicity of dictionary-based term recognition and the richer linguistic information provided by more advanced natural language processing (NLP) is an area of active discussion in clinical informatics. In this paper, we quantify this trade-off by comparing text-processing systems that strike different balances between speed and linguistic understanding. We tested both types of systems on three clinical research tasks: phase IV safety profiling of a drug, learning adverse drug-drug interactions, and learning used-to-treat relationships between drugs and indications.

Authors

  • Kenneth Jung
    Program in Biomedical Informatics, Stanford University, Stanford, California, USA.
  • Paea LePendu
    Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA.
  • Srinivasan Iyer
    Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA.
  • Anna Bauer-Mehren
    Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA.
  • Bethany Percha
    Program in Biomedical Informatics, Stanford University, Stanford, California, USA.
  • Nigam H Shah
    Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA.