Leveraging Summary Guidance on Medical Report Summarization.

Journal: IEEE journal of biomedical and health informatics
Published Date:

Abstract

This study presents three de-identified large medical text datasets derived from MIMIC-III, named DISCHARGE, ECHO and RADIOLOGY, which contain 50 K, 16 K and 378 K report-summary pairs, respectively. We implement convincing baselines for automated abstractive summarization on these datasets using pre-trained encoder-decoder language models, including BERT2BERT, BERTShare, RoBERTaShare, Pegasus, ProphetNet, T5-large, BART and GSUM. Furthermore, building on the BART model, we use summaries sampled from the training set as prior-knowledge guidance: the encoder produces additional contextual representations of the guidance, and these are used to enhance the decoding representations in the decoder. The experimental results confirm that the proposed method improves both ROUGE scores and BERTScore.
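For readers unfamiliar with the ROUGE metric named above, the following is a minimal sketch of ROUGE-N (n-gram overlap between a generated summary and a reference summary). It is an illustration only, not the evaluation code used in the study: it assumes plain whitespace tokenization and lowercasing, whereas standard ROUGE toolkits typically apply additional normalization such as stemming. The function names `ngrams` and `rouge_n` and the example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall and F1 for two strings.

    Assumption: whitespace tokenization and lowercasing only.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up example strings, not taken from any dataset.
    generated = "no acute cardiopulmonary process"
    reference = "no acute cardiopulmonary process identified"
    print(rouge_n(generated, reference, n=1))
    print(rouge_n(generated, reference, n=2))
```

Precision penalizes content in the generated summary that the reference lacks, and recall penalizes reference content the generated summary omits. The F1 value combines the two. BERTScore, by contrast, compares contextual embeddings rather than exact n-grams and is not covered by this sketch.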

Authors

  • Yunqi Zhu
  • Xuebing Yang
    Institute of Automation, Chinese Academy of Sciences, Beijing, China.
  • Yuanyuan Wu
    Department of Mathematics, Southeast University, Nanjing 210096, China; College of Electric and Information Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China.
  • Wensheng Zhang
    Department of Anesthesiology, West China Hospital, Sichuan University, Chengdu, China.