Development of a Method for Automatic Matching of Unstructured Medical Data to ICD-10 Codes.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Inconsistent disease coding standards in medicine hinder data exchange and analysis. This paper proposes a machine learning system to address this challenge. The system automatically matches unstructured medical text, such as physicians' notes and patient complaints, to ICD-10 codes. Its architecture combines a training layer for model development with a knowledge base that captures relationships between symptoms and diseases. Experiments on data from a large medical research center demonstrated the system's effectiveness in predicting disease classifications. Logistic regression emerged as the optimal model because of its superior processing speed; it achieved an accuracy of 81.07% with acceptable error rates during high-load testing. This approach offers a promising way to improve healthcare informatics by overcoming incompatibilities between coding standards and by automating code prediction from unstructured medical text.
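
To make the idea of predicting ICD-10 codes from free text concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not describe the feature extraction, training procedure, or data, so everything here is an assumption: bag-of-words token counts as features, a multinomial (softmax) logistic regression trained by plain stochastic gradient descent, and a handful of invented toy notes in `TRAIN`. The function names (`tokenize`, `train`, `predict_proba`, `predict`) are hypothetical, and the knowledge base mentioned in the abstract is not modeled.

```python
import math
from collections import Counter

# Toy, invented notes for illustration only (NOT data from the paper).
# Labels are ICD-10 codes: J06.9, I10, K29.7.
TRAIN = [
    ("sore throat runny nose cough", "J06.9"),
    ("cough fever nasal congestion", "J06.9"),
    ("runny nose sneezing sore throat", "J06.9"),
    ("high blood pressure headache dizziness", "I10"),
    ("elevated blood pressure headache", "I10"),
    ("dizziness high blood pressure", "I10"),
    ("stomach pain nausea heartburn", "K29.7"),
    ("heartburn upper abdominal pain nausea", "K29.7"),
    ("nausea stomach pain after meals", "K29.7"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumed, deliberately naive choice)."""
    return text.lower().split()


def softmax(scores):
    """Turn per-class scores into probabilities, shifted by the max for stability."""
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def predict_proba(model, text):
    """Return a probability for each ICD-10 code given a free-text note."""
    weights, bias = model
    feats = Counter(tokenize(text))
    scores = {
        c: bias[c] + sum(weights[c].get(t, 0.0) * n for t, n in feats.items())
        for c in bias
    }
    return softmax(scores)


def train(samples, epochs=200, lr=0.1):
    """Fit softmax regression on token counts with per-sample gradient steps."""
    labels = sorted({label for _, label in samples})
    weights = {c: {} for c in labels}
    bias = {c: 0.0 for c in labels}
    model = (weights, bias)
    for _ in range(epochs):
        for text, gold in samples:
            probs = predict_proba(model, text)
            feats = Counter(tokenize(text))
            for c in labels:
                # Gradient of cross-entropy w.r.t. the class score.
                grad = probs[c] - (1.0 if c == gold else 0.0)
                bias[c] -= lr * grad
                for t, n in feats.items():
                    weights[c][t] = weights[c].get(t, 0.0) - lr * grad * n
    return model


def predict(model, text):
    """Return the most probable ICD-10 code for a note."""
    probs = predict_proba(model, text)
    return max(probs, key=probs.get)


model = train(TRAIN)
print(predict(model, "patient reports cough and sore throat"))
```

Tokens that never appeared in training (here "patient", "reports", "and") contribute nothing to the scores, so the prediction rests only on known symptom words. A production system would more likely rely on an established library and a far larger, clinically validated corpus; this sketch only illustrates the classification step.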

Authors

  • Bogdan Volkov
    ITMO University, Saint-Petersburg, Russia.
  • Georgy Kopanitsa
    Institute Cybernetic Center, Tomsk Polytechnic University, Tomsk, Russia.