Data-Driven Identification of Clinical Real-World Expressions Linked to ICD.
Journal: Studies in Health Technology and Informatics
Published Date: May 18, 2023
Abstract
A semi-structured clinical problem list containing ∼1.9 million de-identified entries linked to ICD-10 codes was used to identify closely related real-world expressions. A log-likelihood-based co-occurrence analysis generated seed terms, which were integrated into a k-NN search that used SapBERT to generate the embedding representations.
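The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact log-likelihood formulation or the distance measure, so the sketch assumes Dunning's G² statistic on a 2x2 term/code contingency table and cosine similarity for the k-NN search. The function names (`log_likelihood`, `cosine`, `knn`) are hypothetical, and the vectors passed to `knn` stand in for embeddings that would, in practice, be produced by SapBERT.

```python
import math
from typing import Dict, List, Sequence, Tuple


def log_likelihood(k11: int, k12: int, k21: int, k22: int) -> float:
    """Dunning's G^2 for a 2x2 contingency table (assumed formulation).

    k11: entries containing the term and linked to the ICD-10 code
    k12: entries containing the term, linked to another code
    k21: entries linked to the code, without the term
    k22: entries with neither the term nor the code
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k expressions whose vectors are most similar to the query.

    `index` maps each expression to its embedding. The vectors are assumed
    to come from an encoder such as SapBERT; this sketch does not load one.
    """
    scored = [(name, cosine(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In such a setup, term/code pairs with a high G² score would serve as seed terms, and the embedding of each seed term would be the `query` passed to `knn` against an index of all problem-list expressions. A brute-force scan like this one is only practical for small indexes; at the scale of ∼1.9 million entries an approximate nearest-neighbour index would likely be needed.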