Scaling Responsible Medical Text Retrieval Across Silos: Evaluating Small Language Models with Retrieval-Augmented Generation.

Journal: Studies in health technology and informatics
Published Date:

Abstract

This study evaluates small language models (SLMs) combined with FAISS (Facebook AI Similarity Search) and Dynamic Top-k for diabetes-focused medical text retrieval. Of the models evaluated, all-mpnet-base-v2 performed best, and FAISS maintained sub-2 ms latency even at tenfold scale. Dynamic Top-k improved precision and nDCG (Normalized Discounted Cumulative Gain), showing that lightweight SLM-RAG (Small Language Model-Retrieval Augmented Generation) pipelines can approach larger-model accuracy while remaining highly scalable for cloud-based EMR (Electronic Medical Record) environments.
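
The abstract does not define the Dynamic Top-k rule or give the evaluation code, so the following is only a minimal sketch of how such a cutoff and the two reported metrics could be computed. It assumes a relative-score threshold as the cutoff rule (one plausible reading, not necessarily the study's method). The function names, the `rel_threshold` parameter, the document IDs, the similarity scores, and the graded relevance labels are all hypothetical and chosen for illustration.

```python
import math


def dynamic_top_k(scores, k_max=10, rel_threshold=0.8, k_min=1):
    """Keep ranked hits whose score is at least rel_threshold * best score.

    ASSUMPTION: this relative-score cutoff is one plausible reading of
    "Dynamic Top-k"; the abstract does not say which rule the study used.
    Scores are assumed to be non-negative similarities (higher is better).
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k_max]
    if not ranked:
        return []
    best = ranked[0][1]
    kept = [doc for doc, s in ranked if s >= rel_threshold * best]
    # Fall back to the top k_min hits if the threshold kept too few.
    if len(kept) < k_min:
        kept = [doc for doc, _ in ranked[:k_min]]
    return kept


def precision(retrieved, relevance):
    """Fraction of retrieved documents with a relevance grade above zero."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc in retrieved if relevance.get(doc, 0) > 0)
    return hits / len(retrieved)


def dcg(gains):
    """Discounted cumulative gain with the standard log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranked_ids, relevance, k):
    """nDCG@k: DCG of the ranking divided by the DCG of the ideal ranking."""
    gains = [relevance.get(doc, 0) for doc in ranked_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical similarity scores for one query (illustrative values only).
scores = {"d1": 0.91, "d2": 0.88, "d3": 0.52, "d4": 0.47, "d5": 0.30}
# Hypothetical graded relevance judgments; unlisted documents count as 0.
relevance = {"d1": 3, "d2": 2, "d4": 1}

fixed = [doc for doc, _ in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]]
dynamic = dynamic_top_k(scores, k_max=5, rel_threshold=0.8)

print("fixed top-3:  ", fixed, precision(fixed, relevance))
print("dynamic top-k:", dynamic, precision(dynamic, relevance))
print("nDCG (fixed, k=3):", ndcg_at_k(fixed, relevance, 3))
print("nDCG (dynamic, k=len):", ndcg_at_k(dynamic, relevance, len(dynamic)))
```

In this toy case the threshold drops the low-scoring third hit, so the dynamic list is shorter and contains only relevant documents. In a full pipeline the `scores` dictionary would come from a vector index search over sentence embeddings rather than being hard-coded.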

Authors
