RealMedQA: A pilot biomedical question answering dataset containing realistic clinical questions.

Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
Published Date:

Abstract

Clinical question answering systems have the potential to provide clinicians with relevant and timely answers to their questions. Despite recent advances, however, adoption of these systems in clinical settings has been slow. One issue is a lack of question answering datasets that reflect the real-world needs of health professionals. In this work, we present RealMedQA, a dataset of realistic clinical questions generated by humans and an LLM. We describe the process for generating and verifying the QA pairs, and we evaluate several QA models on BioASQ and RealMedQA to assess the relative difficulty of matching answers to questions. We show that the LLM is more cost-efficient than humans for generating "ideal" QA pairs. Additionally, RealMedQA has lower lexical similarity between questions and answers than BioASQ, which our results show poses an additional challenge to the top two QA models. We release our code and our dataset publicly to encourage further research.
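The abstract does not say how lexical similarity between questions and answers was measured. As an illustration only, the sketch below assumes a simple Jaccard overlap on lowercase word tokens; the function names and the example QA pair are hypothetical and are not taken from the paper, RealMedQA, or BioASQ.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(question: str, answer: str) -> float:
    """Jaccard overlap of the two token sets: 0.0 = disjoint, 1.0 = identical."""
    q, a = tokens(question), tokens(answer)
    if not q and not a:
        return 0.0  # convention chosen here for two empty texts
    return len(q & a) / len(q | a)


# Hypothetical QA pair, invented for illustration (not from any dataset).
question = "Which drug should be offered first to adults?"
answer = "Offer drug A first to adults."
score = jaccard(question, answer)  # 4 shared tokens out of 10 distinct tokens
```

Under this assumed measure, a dataset whose answers reuse fewer of the question's words would score lower on average, which is the kind of property the abstract describes as making answer matching harder.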

Authors

  • Gregory Kell
    School of Life Course and Population Sciences, King's College London, London, UK.
  • Angus Roberts
    Department of Computer Science, University of Sheffield, Sheffield, UK.
  • Serge Umansky
    Metadvice Ltd., London, Greater London, United Kingdom.
  • Yuti Khare
    Maidstone and Tunbridge Wells NHS Trust, Maidstone, Kent, United Kingdom.
  • Najma Ahmed
    King's College London, London, Greater London, United Kingdom.
  • Nikhil Patel
    King's College London, London, Greater London, United Kingdom.
  • Chloe Simela
    King's College London, London, Greater London, United Kingdom.
  • Jack Coumbe
    King's College London, London, Greater London, United Kingdom.
  • Julian Rozario
    King's College London, London, Greater London, United Kingdom.
  • Ryan-Rhys Griffiths
    University of Cambridge, Cambridgeshire, United Kingdom.
  • Iain J Marshall
    Department of Primary Care and Public Health Sciences, King's College London, UK. Email: iain.marshall@kcl.ac.uk