Use of Natural Language Processing for Precise Retrieval of Key Elements of Health IT Evaluation Studies.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Precise information about health IT evaluation studies is important for evidence-based decisions in medical informatics. In an earlier feasibility study, we used a faceted search based on ontological modeling of key study elements to retrieve precisely described health IT evaluation studies. However, manually extracting the key elements for the ontology modeling was time- and resource-intensive. We therefore aimed to apply natural language processing to replace manual data extraction with automatic data extraction. Four methods (Named Entity Recognition, Bag-of-Words, Term Frequency-Inverse Document Frequency, and Latent Dirichlet Allocation Topic Modeling) were applied to 24 health IT evaluation studies. We evaluated which of these methods was best suited for extracting the key elements of each study, using the results of manual extraction as the gold standard. The results indicate that Named Entity Recognition is promising but needs to be adapted to the existing study context. After this adaptation, key elements of studies could be collected in a more feasible, time- and resource-saving way.
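
To illustrate two of the four methods named above, the following is a minimal sketch of Bag-of-Words counting and Term Frequency-Inverse Document Frequency weighting using only the Python standard library. It is not the pipeline used in the study: the function names (tokenize, bag_of_words, tf_idf, top_terms), the simple tokenizer, the unsmoothed IDF formula, and the three sample sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (simplified)."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Bag-of-Words: count how often each token occurs in one document."""
    return Counter(tokenize(text))


def tf_idf(docs):
    """Return one {term: score} dict per document.

    score = (term count / document length) * log(N / document frequency).
    This is the plain, unsmoothed variant; libraries often use other variants.
    """
    bags = [bag_of_words(d) for d in docs]
    n_docs = len(bags)
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())  # each document counts a term once
    scores = []
    for bag in bags:
        length = sum(bag.values())
        scores.append(
            {t: (c / length) * math.log(n_docs / doc_freq[t]) for t, c in bag.items()}
        )
    return scores


def top_terms(score, k=3):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical one-line study descriptions, not taken from the 24 studies.
docs = [
    "evaluation of a computerized physician order entry system in a hospital",
    "survey of nurses on the usability of an electronic health record in a hospital",
    "randomized trial of a telemonitoring system for heart failure patients",
]

scores = tf_idf(docs)
for i, score in enumerate(scores):
    print(i, top_terms(score))
```

In this sketch, terms that occur in every document (such as "of") receive a weight of zero, while terms specific to one document rank highest. That property makes the weighting a candidate for surfacing study-specific key elements, subject to comparison with a manually created gold standard as described above.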

Authors

  • Verena Dornauer
    Institute of Medical Informatics, UMIT TIROL - Private University for Health Sciences and Health Technology, Eduard Wallnöfer Zentrum 1, Hall in Tirol, 6060 Austria.
  • Franziska Jahn
    IMISE, University of Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany.
  • Konrad Hoeffner
    Institute of Medical Informatics, Statistics and Epidemiology, University of Leipzig, Germany.
  • Alfred Winter
    IMISE, University of Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany.
  • Elske Ammenwerth
    Institute of Medical Informatics, UMIT TIROL - Private University for Health Sciences and Health Technology, Eduard Wallnöfer Zentrum 1, Hall in Tirol, 6060 Austria.