Expediting data extraction using a large language model (LLM) and scoping review protocol: a methodological study within a complex scoping review

Journal: arXiv

Published Date: Jul 9, 2025

Abstract

The data extraction stages of reviews are resource-intensive, and researchers may seek to expedite data extraction using online large language models (LLMs) and review protocols. Claude 3.5 Sonnet was used to trial two approaches that used a review protocol to prompt data extraction from 10 evidence sources included in a case study scoping review. A protocol-based approach was also used to review extracted data. A limited performance evaluation found high accuracy for the two extraction approaches (83.3% and 100%) when extracting simple, well-defined citation details; accuracy was lower (9.6% and 15.8%) when extracting more complex, subjective data items. Considering all data items, both approaches had precision >90% but low recall (<25%) and F1 scores (<40%). The context of a complex scoping review, open response types and methodological approach likely impacted performance due to missed and misattributed data. LLM feedback considered the baseline extraction accurate and suggested minor amendments: 4 of 15 (26.7%) to citation details and 8 of 38 (21.1%) to key findings data items were considered to potentially add value. However, when the process was repeated with a dataset featuring deliberate errors, only 2 of 39 (5%) errors were detected. Review-protocol-based methods used for expediency require more robust performance evaluation across a range of LLMs and review contexts, with comparison to conventional prompt engineering approaches. We recommend that researchers evaluate and report LLM performance if using them similarly to conduct data extraction or review extracted data. LLM feedback contributed to protocol adaptation and may assist future review protocol drafting.
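The precision, recall and F1 figures reported above follow the standard definitions, which can be sketched as below. This is an illustrative sketch only: the function name and the counts passed to it are hypothetical and are not taken from the study; they were chosen merely to show how precision above 90% can coexist with low recall and a low F1 score when many data items are missed.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall and F1 from counts of extracted data items.

    tp: items extracted by the LLM that match the reference extraction
    fp: items extracted by the LLM that do not match the reference
    fn: reference items the LLM failed to extract
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (NOT from the paper): few wrong extractions, many misses.
precision, recall, f1 = precision_recall_f1(tp=20, fp=2, fn=80)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it is pulled towards the lower of the two, which is why high precision alone does not produce a high F1 score.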

Authors

James Stewart-Evans
Emma Wilson
Tessa Langley
Andrew Prayle
Angela Hands
Karen Exley
Jo Leonardi-Bee

External Resources

View on arXiv (http://arxiv.org/abs/2507.06623v1)
