Evaluation of Chunking and Embedding Strategies for Local Document Retrieval Using an Open-Source LLM in a Hospital.
Journal:
Studies in health technology and informatics
Published Date:
Sep 3, 2025
Abstract
INTRODUCTION: The integration of Retrieval-Augmented Generation (RAG) into domain-specific systems enables context-aware and traceable information retrieval. This study explores chunking and embedding strategies for a RAG-based question-answering system tailored to administrative documents at University Hospital Halle, focusing on model selection, parameter tuning, and retrieval performance. The insights gained are intended to serve as the foundation for a future RAG-based chatbot that gives hospital staff easier access to the contents of the document pool.