Evaluating the Performance of Claude 3.7 Sonnet in Data Extraction Automation for Systematic Literature Reviews.

Journal: Value in health regional issues
Published Date:

Abstract

OBJECTIVES: To evaluate the performance of Claude 3.7 Sonnet in automating data extraction for systematic literature reviews (SLRs).

METHODS: An artificial intelligence (AI) extraction model based on the Claude 3.7 Sonnet large language model was developed through a structured process that included targeted training with a master data list and selected full-text articles. The master data list enhanced the model's contextual knowledge and guided data extraction. Seven full-text articles from 4 oncology-focused treatment efficacy and safety SLRs were used for early testing and for iterative refinement through error analysis. Model performance was then evaluated on 20 full-text articles, drawn from the same SLRs but not used for model development, and benchmarked against human extractions. Evaluation metrics were precision, recall, and F1 score. Extraction time was also compared across 3 approaches: AI model only, hybrid (AI model with human verification), and traditional human extraction.

RESULTS: The AI model extracted 117 889 data points across 106 variables, achieving an overall precision of 98.2%, recall of 96.6%, and F1 score of 97.4%. Extraction performance was highest for Study Characteristics (precision: 97.7%, recall: 98.7%) and Participant Characteristics (precision: 97.3%, recall: 98.7%). Outcome data showed 98.7% precision and 96.4% recall. Intervention Characteristics achieved 97.5% precision and 94.6% recall. Extraction using the AI model alone averaged 4.5 minutes per article, compared with 64.5 minutes with the hybrid approach and approximately 240 minutes with traditional human extraction.

CONCLUSIONS: The Claude 3.7 Sonnet-based model demonstrated strong performance, supporting efficient and reliable AI-driven data extraction in oncology SLRs, with potential for broader applicability.
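The overall F1 score and the relative time savings follow arithmetically from the figures reported above. The sketch below is illustrative only: it assumes the standard definition of F1 as the harmonic mean of precision and recall, and the helper names (f1_score, time_saved_fraction) are hypothetical and not taken from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def time_saved_fraction(approach_minutes: float, baseline_minutes: float) -> float:
    """Fraction of the baseline extraction time saved by an alternative approach."""
    return 1 - approach_minutes / baseline_minutes


# Overall precision and recall as reported in the abstract.
overall_f1 = f1_score(0.982, 0.966)
print(f"Overall F1: {overall_f1:.1%}")

# Minutes per article as reported; the human figure is approximate in the source.
minutes_per_article = {"ai_only": 4.5, "hybrid": 64.5, "human": 240.0}
for approach in ("ai_only", "hybrid"):
    saved = time_saved_fraction(
        minutes_per_article[approach], minutes_per_article["human"]
    )
    print(f"{approach}: about {saved:.0%} less time than human extraction")
```

Under these assumptions the computed F1 rounds to the reported 97.4%, and the hybrid approach takes roughly a quarter of the time of traditional human extraction.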

Authors
