Secondary use of electronic health records for building cohort studies through top-down information extraction.

Journal: Journal of biomedical informatics

Published Date: Nov 21, 2014

Abstract

Controlled clinical trials are usually supported by an in-front data aggregation system that stores relevant information according to the trial context within a highly structured environment. In contrast to the documentation of clinical trials, daily routine documentation has many characteristics that influence data quality. One such characteristic is the use of non-standardized text, which is an indispensable part of information representation in clinical information systems. Based on a cohort study, we highlight challenges in mining electronic health records, targeting free-text entry fields within semi-structured data sources. Our prototypical information extraction system achieved an F-measure of 0.91 (precision=0.90, recall=0.93) on the training set and an F-measure of 0.90 (precision=0.89, recall=0.92) on the test set. We analyze the obtained results in detail and highlight challenges and future directions for the secondary use of routine data in general.
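The reported F-measures can be checked against the stated precision and recall. The sketch below assumes the abstract means the balanced F1 score (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_measure` and its `beta` parameter are illustrative, not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 (assumed here) is the harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was extracted correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall values as stated in the abstract.
train_f = f_measure(0.90, 0.93)
test_f = f_measure(0.89, 0.92)

print(f"training set F-measure: {train_f:.2f}")
print(f"test set F-measure:     {test_f:.2f}")
```

Under this assumption, both values round to the figures given in the abstract (0.91 and 0.90), so the reported numbers are internally consistent.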

Authors

Markus Kreuzthaler

Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria.

Stefan Schulz

Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria.

Andrea Berghold

Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria.

Keywords

Algorithms; Cohort Studies; Computer Graphics; Data Mining; Databases, Factual; Electronic Health Records; Europe; Humans; Information Storage and Retrieval; Medical Informatics; Natural Language Processing; Reproducibility of Results; Research Design; Software; United States; User-Computer Interface

External Resources

PubMed ID: 25451102
