POLYRAG: Integrating Polyviews into Retrieval-Augmented Generation for Medical Applications
Journal:
arXiv
Published Date:
Apr 21, 2025
Abstract
Large language models (LLMs) have become a disruptive force in the industry,
introducing unprecedented capabilities in natural language processing, logical
reasoning and so on. However, the challenges of knowledge updates and
hallucination issues have limited the application of LLMs in medical scenarios,
where retrieval-augmented generation (RAG) can offer significant assistance.
Nevertheless, existing retrieve-then-read approaches generally digest the
retrieved documents, without considering the timeliness, authoritativeness and
commonality of retrieval. We argue that these approaches can be suboptimal,
especially in real-world applications where information from different sources
might conflict with each other and even information from the same source in
different time scale might be different, and totally relying on this would
deteriorate the performance of RAG approaches. We propose PolyRAG that
carefully incorporate judges from different perspectives and finally integrate
the polyviews for retrieval augmented generation in medical applications. Due
to the scarcity of real-world benchmarks for evaluation, to bridge the gap we
propose PolyEVAL, a benchmark consists of queries and documents collected from
real-world medical scenarios (including medical policy, hospital & doctor
inquiry and healthcare) with multiple tagging (e.g., timeliness,
authoritativeness) on them. Extensive experiments and analysis on PolyEVAL have
demonstrated the superiority of PolyRAG.