A fully value distributional deep reinforcement learning framework for multi-agent cooperation.

Journal: Neural Networks: The Official Journal of the International Neural Network Society

PMID: 39693677

Abstract

Distributional reinforcement learning (RL) goes beyond estimating the expected value of future returns by modeling their entire distribution, which offers greater expressiveness and deeper insight into the value function. To leverage this advantage, distributional multi-agent systems based on value-decomposition techniques have recently been proposed. Ideally, a distributional multi-agent system should be fully distributional, meaning that both the individual and global value functions are constructed in distributional form. However, recent studies show that directly applying traditional value-decomposition techniques to this fully distributional form does not guarantee that the necessary individual-global-max (IGM) principle is satisfied. To address this problem, we propose a novel fully value distributional multi-agent framework based on value decomposition and prove that the IGM principle is guaranteed under our framework. Based on this framework, a practical deep reinforcement learning model called Fully Distributional Multi-Agent Cooperation (FDMAC) is proposed, and the effectiveness of FDMAC is verified under different scenarios of the StarCraft Multi-Agent Challenge micromanagement environment. Further experimental results show that our FDMAC model can outperform the best baseline by 10.47% on average in terms of the median test win rate.
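To make the IGM principle concrete, the following is a minimal sketch and not the FDMAC model. It assumes that each agent's return distribution is stored as a short list of equally weighted quantile values, and that the global distribution is formed by a hypothetical additive mixer that sums quantiles position by position. Under those assumptions the mean of the global distribution equals the sum of the individual means, so the greedy joint action coincides with the tuple of each agent's greedy action, which is what IGM requires. All function names and numbers below are illustrative assumptions; the abstract's point is that this consistency is not automatic for general fully distributional decompositions.

```python
from itertools import product

def mean(quantiles):
    """Expected return of a distribution given as equally weighted quantile values."""
    return sum(quantiles) / len(quantiles)

def greedy_action(action_dists):
    """Index of the action whose return distribution has the largest mean."""
    return max(range(len(action_dists)), key=lambda a: mean(action_dists[a]))

def joint_dist(per_agent_dists, joint_action):
    """Hypothetical additive mixer (an assumption, not the paper's mixer):
    sum the chosen actions' quantiles position by position."""
    chosen = [dists[a] for dists, a in zip(per_agent_dists, joint_action)]
    return [sum(qs) for qs in zip(*chosen)]

def igm_holds(per_agent_dists):
    """Compare the tuple of individual greedy actions with the greedy joint action."""
    individual = tuple(greedy_action(d) for d in per_agent_dists)
    action_sets = [range(len(d)) for d in per_agent_dists]
    best_joint = max(
        product(*action_sets),
        key=lambda ja: mean(joint_dist(per_agent_dists, ja)),
    )
    return individual == best_joint, individual, best_joint

# Illustrative data: two agents, three quantiles per action.
example = [
    [[0.0, 1.0, 2.0], [0.5, 2.0, 3.5]],                     # agent 0, two actions
    [[1.0, 1.0, 4.0], [0.0, 0.5, 1.0], [-1.0, 0.0, 1.0]],   # agent 1, three actions
]
holds, individual, best_joint = igm_holds(example)
print(holds, individual, best_joint)
```

The brute-force search over joint actions is only practical for toy problems; it is used here because it makes the comparison between the individual and the global greedy choices explicit.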

Authors

Mingsheng Fu

School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, Sichuan, China. Electronic address: fms@uestc.edu.cn.

Liwei Huang

School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, Sichuan, China. Electronic address: liweihuang@uestc.edu.cn.

Fan Li

Department of Instrument Science and Engineering, School of SEIEE, Shanghai Jiao Tong University, Shanghai 200240, China.

Hong Qu

Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, College of Life Sciences, Peking University, Beijing 100871, China.

Chengzhong Xu

State Key Laboratory of IoTSC, University of Macau, Taipa, 999078, Macao Special Administrative Region of China. Electronic address: czxu@um.edu.mo.

Keywords

Algorithms; Cooperative Behavior; Deep Learning; Humans; Neural Networks, Computer; Reinforcement Machine Learning; Reinforcement, Psychology
