Automatic classification of diseases from free-text death certificates for real-time surveillance.

Journal: BMC Medical Informatics and Decision Making
PMID:

Abstract

BACKGROUND: Death certificates are an invaluable source of mortality statistics, which can be used for surveillance and early warning of increases in disease activity and to support the development and monitoring of prevention or response strategies. However, their value can be realised only if accurate, quantitative data can be extracted from them, an aim hampered by both the volume and the variable nature of certificates written in natural language. This study aims to develop a set of machine learning and rule-based methods to automatically classify death certificates according to four high-impact diseases of interest: diabetes, influenza, pneumonia and HIV.

Authors

  • Bevan Koopman
    Australian e-Health Research Centre, CSIRO, Brisbane, QLD, Australia; Queensland University of Technology, Brisbane, QLD, Australia.
  • Sarvnaz Karimi
    Australian e-Health Research Centre, CSIRO, Royal Brisbane and Women's Hospital, Brisbane, Australia.
  • Anthony Nguyen
    Australian e-Health Research Centre, CSIRO, Brisbane, QLD, Australia.
  • Rhydwyn McGuire
    NSW Ministry of Health, North Sydney, Sydney, Australia.
  • David Muscatello
    NSW Ministry of Health, North Sydney, Sydney, Australia.
  • Madonna Kemp
    Australian e-Health Research Centre, CSIRO, Royal Brisbane and Women's Hospital, Brisbane, Australia.
  • Donna Truran
    Australian e-Health Research Centre, CSIRO, Royal Brisbane and Women's Hospital, Brisbane, Australia.
  • Ming Zhang
    Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, College of Veterinary Medicine, Harbin 150030, China.
  • Sarah Thackway
    NSW Ministry of Health, North Sydney, Sydney, Australia.