Prompt-to-policy: Leveraging large language models to guide deep reinforcement learning in public health emergencies.
Journal:
Computer Methods and Programs in Biomedicine
Published Date:
Feb 18, 2026
Abstract
Rapid and effective decision-making is critical in public health emergencies, where resource allocation must balance multiple objectives under uncertain conditions. Traditional optimization methods often struggle with computational complexity and with real-time, heterogeneous data. To address these challenges, we propose a hybrid intelligent agent that combines an enhanced Double Deep Q-Network (D2QN-JDA) with large language models (LLMs). The D2QN-JDA improves learning stability and adaptability through joint state-action inputs, a dynamic exploration rate, and adaptive reward normalization. The LLM component uses Retrieval-Augmented Generation (RAG) to integrate structured and unstructured data for real-time decision support. Experiments based on data from the Hong Kong COVID-19 outbreak show that the D2QN-JDA outperforms dynamic programming, greedy algorithms, genetic algorithms, and Q-learning, achieving reductions in cost. The LLM component also outperforms manual and regex-based methods in both single- and multi-point data extraction, improving accuracy, recall, and F1 score while reducing cost and time. Our framework effectively addresses complex, multi-objective resource allocation in public health crises.
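The three D2QN-JDA ingredients named in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the exponential decay schedule, the running mean/standard-deviation normalizer, the one-hot action encoding, and all function, class, and parameter names are hypothetical and are not taken from the paper. The Q-functions are passed in as plain callables so the sketch runs with NumPy alone.

```python
import math

import numpy as np


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=1000.0):
    """Exploration rate that decays exponentially with the training step.

    Assumed schedule: the abstract only says the rate is 'dynamic'.
    """
    return eps_end + (eps_start - eps_end) * math.exp(-step / decay_steps)


class RunningRewardNormalizer:
    """Standardizes rewards with a running mean/std (Welford's algorithm).

    Assumed form of 'adaptive reward normalization'.
    """

    def __init__(self, eps=1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.eps = eps

    def update(self, reward):
        self.count += 1
        delta = reward - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reward - self.mean)

    def normalize(self, reward):
        if self.count < 2:
            # Not enough samples for a std estimate: only center the reward.
            return reward - self.mean
        std = math.sqrt(self.m2 / (self.count - 1))
        return (reward - self.mean) / (std + self.eps)


def joint_input(state, action, n_actions):
    """Joint state-action input: state vector followed by a one-hot action."""
    one_hot = np.zeros(n_actions)
    one_hot[action] = 1.0
    return np.concatenate([np.asarray(state, dtype=float), one_hot])


def double_q_target(q_online, q_target, reward, next_state, done,
                    n_actions, gamma=0.99):
    """Double DQN target: the online net selects the next action,
    the target net evaluates it."""
    if done:
        return reward
    online_values = [q_online(joint_input(next_state, a, n_actions))
                     for a in range(n_actions)]
    best_action = int(np.argmax(online_values))
    return reward + gamma * q_target(
        joint_input(next_state, best_action, n_actions))
```

In this sketch each Q-function maps one joint state-action vector to a scalar, so evaluating a state means one call per candidate action. Selecting the action with the online function and scoring it with the target function is the standard Double DQN device for reducing overestimation of action values.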
Authors