Developing and testing a framework for coding general practitioners' free-text diagnoses in electronic medical records - a reliability study for generating training data in natural language processing.
Journal:
BMC primary care
Published Date:
Jul 16, 2024
Abstract
BACKGROUND: Diagnoses entered by general practitioners into electronic medical records have great potential for research and practice, but unfortunately, diagnoses are often in uncoded format, making them of little use. Natural language processing (NLP) could assist in coding free-text diagnoses, but NLP models require local training data to unlock their potential. The aim of this study was to develop a framework of research-relevant diagnostic codes, to test the framework using free-text diagnoses from a Swiss primary care database and to generate training data for NLP modelling.