The DRAGON benchmark for clinical NLP.

Journal: NPJ digital medicine

Abstract

Artificial Intelligence can mitigate the global shortage of medical diagnostic personnel but requires large-scale annotated datasets to train clinical algorithms. Natural Language Processing (NLP), including Large Language Models (LLMs), shows great potential for annotating clinical data to facilitate algorithm development but remains underexplored due to a lack of public benchmarks. This study introduces the DRAGON challenge, a benchmark for clinical NLP with 28 tasks and 28,824 annotated medical reports from five Dutch care centers. It facilitates automated, large-scale, cost-effective data annotation. Foundational LLMs were pretrained using four million clinical reports from a sixth Dutch care center. Evaluations showed the superiority of domain-specific pretraining (DRAGON 2025 test score of 0.770) and mixed-domain pretraining (0.756), compared to general-domain pretraining (0.734, p < 0.005). While strong performance was achieved on 18/28 tasks, performance was subpar on 10/28 tasks, uncovering where innovations are needed. Benchmark, code, and foundational LLMs are publicly available.

Authors

  • Joeran S Bosma
    University Medical Centre Nijmegen, DIAG, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands.
  • Koen Dercksen
    Diagnostic Image Analysis Group, Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Luc Builtjes
    Diagnostic Image Analysis Group, Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Romain André
    Diagnostic Image Analysis Group, Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Christian Roest
    Medical Imaging Center, Departments of Radiology, Nuclear Medicine and Molecular Imaging, University Medical Center Groningen, University of Groningen, Meditech Building, Room 305, Hanzeplein 1, 9700 RB, Groningen, The Netherlands.
  • Stefan J Fransen
    University Medical Centre Groningen, Department of Radiology, Hanzeplein 1, 9713 GZ, Groningen, the Netherlands. Electronic address: S.j.fransen@umcg.nl.
  • Constant R Noordman
    Diagnostic Image Analysis Group, Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Mar Navarro-Padilla
    Diagnostic Image Analysis Group, Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Judith Lefkes
    Computational Pathology Group, Department of Pathology, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Natália Alves
    Diagnostic Imaging Analysis Group, Medical Imaging Department, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA, Nijmegen, the Netherlands.
  • Max J J de Grauw
    Diagnostic Image Analysis Group, Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Leander van Eekelen
    Faculty of Biomedical Engineering, Technical University Eindhoven, Eindhoven, the Netherlands; Computational Pathology Group, Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, the Netherlands.
  • Joey M A Spronck
    Computational Pathology Group, Department of Pathology, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Megan Schuurmans
    Diagnostic Image Analysis Group, Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands.
  • Bram de Wilde
    Department of Radiology, Nuclear Medicine and Anatomy, Radboud University Medical Center, P.O. Box 9101, 6500 HB Nijmegen, the Netherlands.
  • Ward Hendrix
    Department of Radiology, Nuclear Medicine and Anatomy, Radboud University Medical Center, P.O. Box 9101, 6500 HB Nijmegen, the Netherlands.
  • Witali Aswolinskiy
    Computational Pathology Group, Department of Pathology, Radboud University Medical Center, Nijmegen 6525 GA, The Netherlands.
  • Anindo Saha
    Diagnostic Image Analysis Group, Department of Medical Imaging, Radboudumc, Nijmegen, The Netherlands.
  • Jasper J Twilt
    Minimally Invasive Image-Guided Intervention Center, Radboud University Medical Center, Nijmegen, Netherlands.
  • Daan Geijs
    Radboud University Medical Center, Nijmegen, the Netherlands.
  • Jeroen Veltman
    Multi-Modality Medical Imaging, Technical Medical Centre, University of Twente, Enschede, The Netherlands; Department of Radiology, Ziekenhuisgroep Twente, Almelo, The Netherlands.
  • Derya Yakar
    Department of Radiology, Medical Imaging Center, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands. Electronic address: d.yakar@umcg.nl.
  • Maarten de Rooij
    Department of Medical Imaging, Radboud university medical center, P.O. Box 9101, 6500 HB, Nijmegen, The Netherlands.
  • Francesco Ciompi
    Diagnostic Image Analysis Group, Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Nijmegen, The Netherlands. Electronic address: francesco.ciompi@radboudumc.nl.
  • Alessa Hering
    Fraunhofer Institute for Digital Medicine MEVIS, Maria-Goeppert-Str. 3, 23562, Lübeck, Germany. alessa.hering@mevis.fraunhofer.de.
  • Jeroen Geerdink
    Department of Health & Information Technology, Ziekenhuisgroep Twente, Almelo, The Netherlands.
  • Henkjan Huisman
    Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Nijmegen, The Netherlands.
