Medication event extraction in clinical notes: Contribution of the WisPerMed team to the n2c2 2022 challenge.

Journal: Journal of biomedical informatics
Published Date:

Abstract

In this work, we describe the findings of the 'WisPerMed' team from their participation in Track 1 (Contextualized Medication Event Extraction) of the n2c2 2022 challenge. We tackle two tasks: (i) medication extraction, which involves extracting all mentions of medications from the clinical notes, and (ii) event classification, which involves classifying the medication mentions based on whether a change in the medication has been discussed. Clinical texts often exceed the maximum token length that transformer-based models can handle. To address this, we employ several approaches, including ClinicalBERT with a sliding window and Longformer-based models. In addition, domain adaptation through masked language modeling and preprocessing steps such as sentence splitting are used to improve model performance. Since both tasks were treated as named entity recognition (NER) problems, a sanity check was performed in the second release to eliminate possible weaknesses in the medication detection itself. This check used the medication spans to remove false positive predictions and to assign each missed token the disposition type with the highest softmax probability. The effectiveness of these approaches is evaluated through multiple submissions to the tasks, as well as with post-challenge results, with a focus on the DeBERTa v3 model and its disentangled attention mechanism. Results show that the DeBERTa v3 model performs well in both the NER task and the event classification task.
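The sliding window and the span-based sanity check mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the team's implementation: the function names, the window size and overlap, the "O" label for "no event", and the example disposition labels are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def sliding_windows(n_tokens: int, max_len: int = 512, overlap: int = 128) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of overlapping windows that cover n_tokens.

    max_len and overlap are illustrative defaults, not values from the paper.
    """
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("require max_len > 0 and 0 <= overlap < max_len")
    windows: List[Tuple[int, int]] = []
    start = 0
    while start < n_tokens:
        end = min(start + max_len, n_tokens)
        windows.append((start, end))
        if end == n_tokens:
            break
        start += max_len - overlap  # advance, keeping `overlap` tokens of context
    return windows


def sanity_check(
    in_medication_span: List[bool],
    event_probs: List[Dict[str, float]],
    no_event: str = "O",  # hypothetical label for tokens without an event
) -> List[str]:
    """Align event labels with medication spans (assumed reading of the abstract).

    Tokens outside a medication span get `no_event`, which removes false
    positives. Tokens inside a span get the disposition type with the highest
    softmax probability, so tokens the event model missed are still labeled.
    """
    labels: List[str] = []
    for is_med, probs in zip(in_medication_span, event_probs):
        if not is_med:
            labels.append(no_event)
            continue
        candidates = {k: v for k, v in probs.items() if k != no_event}
        labels.append(max(candidates, key=candidates.get))
    return labels


if __name__ == "__main__":
    print(sliding_windows(10, max_len=4, overlap=2))
    probs = [
        {"O": 0.1, "Disposition": 0.9},                        # false positive
        {"O": 0.6, "Disposition": 0.3, "NoDisposition": 0.1},  # missed token
        {"O": 0.1, "Disposition": 0.2, "NoDisposition": 0.7},
    ]
    print(sanity_check([False, True, True], probs))
```

In this sketch the overlap gives each window some shared context with its neighbor, and the sanity check treats the medication spans as the authority on where an event label may appear.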

Authors

  • Henning Schäfer
    Department of Computer Science, University of Applied Sciences and Arts Dortmund (FH Dortmund), Emil-Figge-Straße 42, Dortmund, 44227, Germany; Institute for Transfusion Medicine, University Hospital Essen, Hufelandstraße 55, Essen, 45147, Germany. Electronic address: henning.schaefer@uk-essen.de.
  • Ahmad Idrissi-Yaghir
    Department of Computer Science, University of Applied Sciences and Arts Dortmund (FH Dortmund), Emil-Figge-Straße 42, Dortmund, 44227, Germany; Institute for Medical Informatics, Biometry and Epidemiology (IMIBE), University Hospital Essen, Hufelandstraße 55, Essen, 45147, Germany.
  • Jeanette Bewersdorff
    Computational Linguistics, CATALPA - Center for Advanced Technology-Assisted Learning and Predictive Analytics, FernUniversität in Hagen, Germany.
  • Sameh Frihat
    University of Duisburg-Essen, Germany.
  • Christoph M Friedrich
    Department of Computer Science, University of Applied Sciences and Arts Dortmund, Dortmund, Germany.
  • Torsten Zesch
    Computational Linguistics, CATALPA - Center for Advanced Technology-Assisted Learning and Predictive Analytics, FernUniversität in Hagen, Germany.