Machine Learning for Automatic Encoding of French Electronic Medical Records: Is More Data Better?

Journal: Studies in health technology and informatics

Published Date: Jun 16, 2020

Abstract

The encoding of Electronic Medical Records is a complex and time-consuming task. We report on a machine learning model that proposes diagnosis and procedure codes, built from a large, realistic dataset of 245,000 electronic medical records from the University Hospitals of Geneva. Our study focuses in particular on the impact of training data quantity on the model's performance. We show that performance does not increase when encoded instances from previous years are added to the training data. Furthermore, supervised models are shown to be highly perishable: we show a potential drop in performance of around 10% per year. Consequently, great and constant care must be exercised in designing and updating the content of the knowledge bases exploited by machine learning.

Authors

Julien Gobeill
BiTeM Group, Information Science Department, University of Applied Sciences of Western Switzerland (HES-SO, HEG), Switzerland.

Patrick Ruch
BiTeM Group, Information Science Department, University of Applied Sciences of Western Switzerland (HES-SO, HEG), Switzerland.

Rodolphe Meyer
Information Systems Department, University Hospitals of Geneva (HUG), Geneva, Switzerland.

Keywords

Electronic Health Records; Machine Learning

External Resources

PubMed ID: 32570397
