Machine learning classification of surgical pathology reports and chunk recognition for information extraction noise reduction.

Journal: Artificial intelligence in medicine

Abstract

BACKGROUND AND AIMS: Machine learning techniques for text mining of cancer-related clinical documents have not been sufficiently explored. This paper presents techniques for pre-processing free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging.

Authors

  • Giulio Napolitano
    Institut für Medizinische Biometrie, Informatik und Epidemiologie (IMBIE), Universität Bonn, Haus 325/11/1.OG/Raum 620, Sigmund-Freud-Straße 25, 53105 Bonn, Germany. Electronic address: g.napolitano@imbie.uni-bonn.de.
  • Adele Marshall
    Queen's University Belfast, School of Mathematics and Physics, University Road, Belfast BT7 1NN, United Kingdom.
  • Peter Hamilton
    Queen's University Belfast, School of Medicine, Dentistry and Biomedical Sciences, 97 Lisburn Road, Belfast BT9 7BL, United Kingdom.
  • Anna T Gavin
    NICR-Centre for Public Health, The Queen's University of Belfast, Mulhouse Building, Grosvenor Road, Belfast BT12 6DP, United Kingdom.