Leveraging Rule-Based NLP to Translate Textual Reports as Structured Inputs Automatically Processed by a Clinical Decision Support System.

Journal: Studies in health technology and informatics
Published Date:

Abstract

Using clinical decision support systems (CDSSs) for breast cancer management requires extracting relevant patient data from textual reports, a complex task that machine learning methods perform efficiently but as black boxes. We proposed a rule-based natural language processing (NLP) method to automate the translation of breast cancer patient summaries into structured patient profiles suitable for input into the guideline-based CDSS of the DESIREE project. Our method encompasses named entity recognition (NER), relation extraction, and structured data extraction to systematically organize patient data. The treatment recommendations generated from the extracted profiles aligned strongly with those generated for manually created patient profiles (gold standard), with differences of only 2%. Moreover, the NER pipeline achieved an average F1-score of 0.9 across the main entities (patient, side, and tumor), 0.87 for relation extraction, and 0.75 for contextual information, showing promising results for rule-based NLP.
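
To make the three steps named in the abstract concrete, here is a minimal Python sketch of rule-based extraction and F1 scoring. It is an illustration under stated assumptions, not the authors' pipeline: the regular expressions, the sentence-level co-occurrence rule for relating a size to a side, the negation cue used as a stand-in for "contextual information", and the names `extract_profile` and `f1_score` are all hypothetical.

```python
import re

# Hypothetical rules: the actual DESIREE rule set is not given in the abstract.
SIDE_PATTERN = re.compile(r"\b(left|right)\s+breast\b", re.IGNORECASE)
SIZE_PATTERN = re.compile(r"\b(\d+(?:\.\d+)?)\s*mm\b", re.IGNORECASE)
# Assumed stand-in for contextual information: a simple negation cue.
NEGATION_PATTERN = re.compile(r"\b(?:no|without)\b", re.IGNORECASE)


def extract_profile(summary: str) -> dict:
    """Turn a free-text summary into a small structured profile."""
    profile = {"tumors": []}
    # Work sentence by sentence so that a side and a size are only
    # related when they co-occur in the same sentence (assumed rule).
    for sentence in re.split(r"(?<=[.;])\s+", summary):
        side = SIDE_PATTERN.search(sentence)   # entity: side
        size = SIZE_PATTERN.search(sentence)   # entity: tumor size
        if side and size:
            profile["tumors"].append({
                "side": side.group(1).lower(),
                "size_mm": float(size.group(1)),
                "negated": bool(NEGATION_PATTERN.search(sentence)),
            })
    return profile


def f1_score(predicted: set, gold: set) -> float:
    """Harmonic mean of precision and recall over sets of extracted items."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    text = "A 12 mm tumor was found in the left breast. No lesion in the right breast."
    print(extract_profile(text))
    print(f1_score({("left", 12.0)}, {("left", 12.0), ("right", 8.0)}))
```

Because every extracted field traces back to an explicit pattern, each output of such a sketch can be inspected rule by rule, which is the contrast with black-box methods that the abstract draws.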

Authors

  • Akram Redjdal
    Sorbonne Université, Université Sorbonne Paris Nord, Inserm, UMR S_1142, LIMICS, Paris, France.
  • Natallia Novikava
    Sorbonne Université, Université Sorbonne Paris Nord, INSERM, LIMICS, Paris, France.
  • Emmanuelle Kempf
    Assistance Publique-Hôpitaux de Paris, Henri Mondor-Albert Chenevier University Hospital, Department of Medical Oncology, Créteil, France.
  • Jacques Bouaud
    AP-HP, DRCD, Paris, France.
  • Brigitte Seroussi
    Sorbonne Universités, UPMC Université Paris 06, UMR_S 1142, LIMICS, Paris, France.