Development and Validation of a Natural Language Processing Algorithm for Extracting Clinical and Pathological Features of Breast Cancer From Pathology Reports.

Journal: JCO Clinical Cancer Informatics

Abstract

PURPOSE: Electronic health records (EHRs) are valuable information repositories that offer insights for enhancing clinical research on breast cancer (BC) using real-world data. The objective of this study was to develop a natural language processing (NLP) model specifically designed to extract structured data from BC pathology reports written in natural language.

Authors

  • Elisabetta Munzone
    Division of Medical Senology, European Institute of Oncology IRCCS, Milan, Italy.
  • Antonio Marra
    Division of Early Drug Development for Innovative Therapies, European Institute of Oncology IRCCS, Milan, Italy.
  • Federico Comotto
    Reply S.p.A., Turin, Italy.
  • Lorenzo Guercio
    Reply S.p.A., Turin, Italy.
  • Claudia Anna Sangalli
    Clinical Trial Office, European Institute of Oncology IRCCS, Milan, Italy.
  • Martina Lo Cascio
    Central Management of Information Systems and Technologies, European Institute of Oncology IRCCS, Milan, Italy.
  • Eleonora Pagan
    Department of Statistics and Quantitative Methods, University of Milan-Bicocca, Milan, Italy.
  • Davide Sangalli
    Central Management of Information Systems and Technologies, European Institute of Oncology IRCCS, Milan, Italy.
  • Ilaria Bigoni
    Reply S.p.A., Turin, Italy.
  • Francesca Maria Porta
    Division of Pathology, European Institute of Oncology IRCCS, Milan, Italy.
  • Marianna D'Ercole
    Division of Pathology, European Institute of Oncology IRCCS, Milan, Italy.
  • Fabiana Ritorti
    Reply S.p.A., Turin, Italy.
  • Vincenzo Bagnardi
    Department of Statistics and Quantitative Methods, University of Milan-Bicocca, Milan, Italy.
  • Nicola Fusco
    Biobank for Translational and Digital Medicine Unit, Division of Pathology, IEO, European Institute of Oncology IRCCS, University of Milan, Milan, 20141, Italy.
  • Giuseppe Curigliano
    Division of New Drugs and Early Drug Development for Innovative Therapies, European Institute of Oncology IRCCS, Milan, Italy; Department of Oncology and Haematology (DIPO), University of Milan, Milan, Italy.