Smoking Status Normalization with Cross-Encoders and SNOMED CT.
Journal:
Studies in health technology and informatics
Published Date:
May 15, 2025
Abstract
Accurately documenting smoking status is essential for clinical decision-making and patient care. However, smoking status information is often available only in clinical narratives. Mapping smoking-related terms to standardized terminologies such as SNOMED CT enhances interoperability and consistency across healthcare systems. We employed a bi-encoder together with a cross-encoder re-ranking model to normalize candidate mentions of smoking status in clinical narratives by assigning SNOMED CT codes, yielding standardized representations. Our approach achieved a Recall@1 of 85%, successfully mapping smoking-related narrative expressions to SNOMED CT definitions in German.
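The retrieve-then-re-rank idea and the Recall@k metric mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the models, so the neural bi-encoder is replaced here by a character-trigram cosine similarity and the cross-encoder by a token-overlap score, both purely as stand-ins. The concept list, the codes `C001` to `C003`, the English labels, and all function names are hypothetical and chosen only for illustration; they are not real SNOMED CT identifiers.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy concept list. The codes are placeholders, NOT real SNOMED CT IDs.
CONCEPTS = {
    "C001": "current smoker",
    "C002": "ex smoker",
    "C003": "never smoked tobacco",
}


def char_trigrams(text):
    """Encode a string on its own as a bag of character trigrams."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(mention, k=2):
    """Bi-encoder stand-in: mention and labels are encoded independently,
    then candidates are ranked by vector similarity."""
    query = char_trigrams(mention)
    scored = [(cosine(query, char_trigrams(label)), code)
              for code, label in CONCEPTS.items()]
    return [code for _, code in sorted(scored, reverse=True)[:k]]


def pair_score(mention, label):
    """Cross-encoder stand-in: the (mention, label) pair is scored jointly,
    here with a simple Jaccard overlap of tokens."""
    m, l = set(mention.lower().split()), set(label.lower().split())
    return len(m & l) / len(m | l) if m | l else 0.0


def normalize(mention, k=2):
    """Retrieve k candidates, then re-rank them with the pairwise scorer."""
    candidates = retrieve(mention, k)
    return sorted(candidates,
                  key=lambda code: pair_score(mention, CONCEPTS[code]),
                  reverse=True)


def recall_at_k(predictions, gold, k=1):
    """Fraction of mentions whose gold code appears in the top-k predictions."""
    hits = sum(1 for ranked, g in zip(predictions, gold) if g in ranked[:k])
    return hits / len(gold)
```

With ranked predictions for a set of annotated mentions, `recall_at_k(predictions, gold, k=1)` gives the share of mentions whose top-ranked code matches the gold code, which is the kind of figure the abstract reports as Recall@1. In a realistic setting the two stand-in scorers would be replaced by trained encoder models.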