Automated Classification of Pathology Reports.

Journal: Studies in health technology and informatics

Abstract

This work develops an automated classifier of pathology reports that infers the topography and morphology classes of a tumor using codes from the International Classification of Diseases for Oncology (ICD-O). Data from 94,980 patients of the A.C. Camargo Cancer Center were used to train and validate Naive Bayes classifiers, which were evaluated by the F1-score. F1-scores greater than 74% in the topographic group and 61% in the morphologic group are reported. Our work provides a successful baseline for future research on the classification of medical documents written in Portuguese and in other domains.
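
As an illustration of the approach the abstract describes, the following is a minimal sketch of a multinomial Naive Bayes text classifier scored with the F1-score. It is not the authors' implementation: the abstract does not specify the Naive Bayes variant, the tokenization, the smoothing, or how F1 was averaged, so Laplace smoothing, whitespace tokens, and per-class F1 are assumptions made here. The function names are hypothetical, and the four training "reports" are invented toy data (in English for readability), labeled with the ICD-O topography codes C50 (breast) and C61 (prostate gland).

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and words per class (multinomial model)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log posterior for a token list."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        # Log prior: fraction of training documents in this class.
        score = math.log(class_counts[label] / total_docs)
        for tok in tokens:
            if tok not in vocab:
                continue  # assumption: words unseen in training are ignored
            # Laplace (add-one) smoothed word likelihood.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def f1_score(true_labels, predicted_labels, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy reports, not data from the study.
train_docs = [
    "invasive ductal carcinoma breast".split(),
    "breast lobular carcinoma".split(),
    "prostate acinar adenocarcinoma".split(),
    "adenocarcinoma prostate gleason".split(),
]
train_labels = ["C50", "C50", "C61", "C61"]

test_docs = [
    "ductal carcinoma breast".split(),
    "prostate adenocarcinoma gleason".split(),
]
test_labels = ["C50", "C61"]

model = train_naive_bayes(train_docs, train_labels)
predictions = [predict(model, doc) for doc in test_docs]
per_class_f1 = {c: f1_score(test_labels, predictions, c) for c in ("C50", "C61")}
print(predictions, per_class_f1)
```

With thousands of distinct ICD-O codes and real report text, a practical system would also need text normalization and a decision about how to average per-class F1 values; the sketch leaves both out.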

Authors

  • Michel Oleynik
    Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil.
  • Marcelo Finger
    Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil.
  • Diogo F C Patrão
    International Center for Research, A. C. Camargo Cancer Center, São Paulo, Brazil.