Do Neural Information Extraction Algorithms Generalize Across Institutions?
Journal:
JCO Clinical Cancer Informatics
Published Date:
Jul 1, 2019
Abstract
PURPOSE: Natural language processing (NLP) techniques have been adopted to reduce the curation costs of electronic health records. However, studies have questioned whether such techniques can be applied to data from previously unseen institutions. We investigated the performance of a common neural NLP algorithm on data from both known hospitals and held-out hospitals (ie, institutions whose data were withheld from the training set and used only for testing). We also explored how diversity in the training data affects the system's generalization ability.