Development of a Method for Automatic Matching of Unstructured Medical Data to ICD-10 Codes.
Journal:
Studies in health technology and informatics
Published Date:
May 23, 2024
Abstract
Inconsistent disease coding standards in medicine create hurdles in data exchange and analysis. This paper proposes a machine learning system to address this challenge. The system automatically matches unstructured medical text (doctor notes, complaints) to ICD-10 codes. It leverages a unique architecture featuring a training layer for model development and a knowledge base that captures relationships between symptoms and diseases. Experiments using data from a large medical research center demonstrated the system's effectiveness in disease classification prediction. Logistic regression emerged as the optimal model due to its superior processing speed, achieving an accuracy of 81.07% with acceptable error rates during high-load testing. This approach offers a promising solution to improve healthcare informatics by overcoming coding standard incompatibility and automating code prediction from unstructured medical text.