Automatic classification of diseases from free-text death certificates for real-time surveillance.
Journal:
BMC medical informatics and decision making
PMID:
26174442
Abstract
BACKGROUND: Death certificates provide an invaluable source for mortality statistics which can be used for surveillance and early warnings of increases in disease activity and to support the development and monitoring of prevention or response strategies. However, their value can be realised only if accurate, quantitative data can be extracted from death certificates, an aim hampered by both the volume and variable nature of certificates written in natural language. This study aims to develop a set of machine learning and rule-based methods to automatically classify death certificates according to four high impact diseases of interest: diabetes, influenza, pneumonia and HIV.