Natural language processing and machine learning to enable automatic extraction and classification of patients' smoking status from electronic medical records.
Journal: Upsala Journal of Medical Sciences
PMID: 32696698
Abstract
BACKGROUND: The electronic medical record (EMR) offers unique possibilities for clinical research, but some important patient attributes are not readily available because much of the record is unstructured. We applied text mining with machine learning to automatically classify unstructured information on smoking status in Swedish EMR data.
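As a rough illustration of classifying free-text notes by smoking status, the sketch below trains a small multinomial Naive Bayes classifier on bag-of-words counts. It is not the study's pipeline: the abstract as shown does not name the classifiers used, so Naive Bayes is chosen here only because "Bayes Theorem" appears among the keywords. The class labels, the English example sentences, and the names `tokenize`, `NaiveBayesTextClassifier`, and `TRAIN` are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep runs of letters (including Swedish å, ä, ö).
    return re.findall(r"[a-zåäö]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # notes per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy notes; real input would be Swedish EMR free text.
TRAIN = [
    ("patient smokes ten cigarettes daily", "current smoker"),
    ("smokes a pack per day", "current smoker"),
    ("quit smoking five years ago", "ex-smoker"),
    ("former smoker stopped in 2010", "ex-smoker"),
    ("has never smoked", "non-smoker"),
    ("non smoker no tobacco use", "non-smoker"),
]

texts, labels = zip(*TRAIN)
model = NaiveBayesTextClassifier().fit(texts, labels)
print(model.predict("smokes cigarettes daily"))
```

A real system would need far more labelled notes, evaluation on held-out data, and handling of negation and misspellings, none of which this toy example attempts.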
Authors
Keywords
Algorithms
Automation
Bayes Theorem
Data Mining
Electronic Health Records
False Positive Reactions
Humans
Machine Learning
Medical Informatics
Natural Language Processing
Observer Variation
Pattern Recognition, Automated
Reproducibility of Results
Research Design
ROC Curve
Smoking
Software
Support Vector Machine
Sweden
Tobacco Use Disorder