Comparative Analysis of ChatGPT-4 for Automated Mapping of Local Medical Terminologies to SNOMED CT.

Journal: Studies in Health Technology and Informatics
Published Date:

Abstract

Standardizing medical terminology is critical for healthcare informatics, particularly for improving data interoperability and patient management systems. This study evaluated four distinct GPT-4-based approaches for mapping local medical terminologies to SNOMED CT: baseline, prompt-engineered, fine-tuned, and Retrieval-Augmented Generation (RAG). Using 1,200 diagnostic terms from a Korean hospital, we assessed the models' accuracy and error rates. The RAG model achieved the highest performance with a 96.2% valid SNOMED CT term match rate and a 57.6% overall exact match rate, surpassing the fine-tuned model (69.2% valid term match, 47.2% exact match). Error analysis showed that the RAG model also reduced structural errors to 14%, significantly lower than in the other models. While the fine-tuned and RAG models struggled with specificity, they showed promise for improving automated mapping and Named Entity Recognition (NER) tasks in clinical settings. This study highlights the potential of AI-human collaboration for enhancing autocoding and data standardization in healthcare. Further research is needed to refine specificity and validate these systems for clinical use.
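The abstract does not describe the RAG pipeline itself, so the following is only a minimal sketch of what a retrieval-augmented mapping step could look like, assuming a local list of SNOMED CT preferred terms, lexical retrieval as a stand-in for a vector index, and the OpenAI Python SDK; the candidate vocabulary, placeholder concept IDs, prompt wording, and model name are illustrative assumptions, not the study's actual system.

```python
"""Hedged sketch: RAG-style mapping of a local diagnostic term to a SNOMED CT term."""
import difflib
from openai import OpenAI

# Hypothetical reference vocabulary of SNOMED CT preferred terms.
# The concept IDs are placeholders, not real SCTIDs.
SNOMED_REFERENCE = {
    "Essential hypertension": "SCTID_PLACEHOLDER_1",
    "Type 2 diabetes mellitus": "SCTID_PLACEHOLDER_2",
    "Acute upper respiratory infection": "SCTID_PLACEHOLDER_3",
}


def retrieve_candidates(local_term: str, k: int = 3) -> list[str]:
    """Return the k most lexically similar reference terms (a stand-in for embedding retrieval)."""
    return difflib.get_close_matches(local_term, list(SNOMED_REFERENCE), n=k, cutoff=0.0)


def map_term(local_term: str, client: OpenAI, model: str = "gpt-4") -> str:
    """Ask the language model to choose the best SNOMED CT match among retrieved candidates."""
    candidates = retrieve_candidates(local_term)
    prompt = (
        "Map the local diagnostic term to exactly one of the candidate SNOMED CT terms.\n"
        f"Local term: {local_term}\n"
        "Candidates:\n"
        + "\n".join(f"- {c}" for c in candidates)
        + "\nAnswer with the candidate term only."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()


if __name__ == "__main__":
    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    print(map_term("HTN, essential", client))
```

In a production setting, the constraint that the model must answer with one of the retrieved candidates is what keeps the output a valid SNOMED CT term, which is consistent with the high valid-term match rate reported for the RAG approach.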

Authors

  • Sookyung Huh
    Medical Records Team, Severance Hospital, Seoul, Korea.