Highly valued subgoal generation for efficient goal-conditioned reinforcement learning.

Journal: Neural networks : the official journal of the International Neural Network Society
PMID:

Abstract

Goal-conditioned reinforcement learning is widely used in robot control, where a robot learns to accomplish specified tasks by maximizing accumulated rewards. However, a useful reward signal is received only when the desired goal is reached, so rewards are sparse and policy learning is inefficient. In this paper, we propose a method that generates highly valued subgoals for efficient goal-conditioned policy learning, supporting the development of smart home robots and autopilots in daily life. The highly valued subgoals are conditioned on the context of the specific task and have a complexity suited to efficient learning of goal-conditioned action values. The context variable captures a latent representation of the particular task, allowing for efficient subgoal generation. In addition, regularizing the goal-conditioned action values with self-adaptive ranges yields subgoals of suitable complexity. Unlike Hindsight Experience Replay, which uniformly samples subgoals from visited trajectories, our method generates subgoals from the task context at a difficulty suited to efficient policy training. Experimental results in robotic environments show that our method performs stably relative to baseline methods.
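
To make the contrast with Hindsight Experience Replay concrete, the following is a minimal illustrative sketch and not the authors' implementation. It compares uniform sampling of subgoals from a visited trajectory with keeping only those visited states whose estimated value falls inside a range. All names (`her_uniform_subgoals`, `value_banded_subgoals`, `adapt_band`), the quantile-based range, and the toy value function are assumptions introduced for illustration; the abstract does not specify how the self-adaptive ranges are computed, and the context variable is omitted here entirely.

```python
import random


def her_uniform_subgoals(trajectory, k, rng):
    """HER-style baseline: draw k subgoals uniformly from visited states."""
    return [rng.choice(trajectory) for _ in range(k)]


def adapt_band(values, lo_q=0.25, hi_q=0.75):
    """Hypothetical self-adaptive range: quantiles of the current value
    estimates (an assumption; the abstract does not give the actual rule)."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[int(lo_q * (n - 1))], ordered[int(hi_q * (n - 1))]


def value_banded_subgoals(trajectory, value_fn, low, high):
    """Keep visited states whose estimated value lies in [low, high],
    i.e. subgoals that are neither trivially easy nor out of reach."""
    return [s for s in trajectory if low <= value_fn(s) <= high]


if __name__ == "__main__":
    # Toy 1-D task: states are positions 0..10 and the final goal is 10.
    trajectory = list(range(11))

    # Stand-in value estimate (assumption): negative distance to the goal.
    def value_fn(state):
        return -abs(10 - state)

    low, high = adapt_band([value_fn(s) for s in trajectory])
    print("uniform:", her_uniform_subgoals(trajectory, 4, random.Random(0)))
    print("banded: ", value_banded_subgoals(trajectory, value_fn, low, high))
```

In this toy setting the banded selection discards states very close to the goal and very far from it, which is one simple way to read "suitable complexity"; a learned goal-conditioned action-value function would replace the hand-written `value_fn` in practice.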

Authors

  • Yao Li
    Center of Robotics and Intelligent Machine, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, No. 266 Fangzhen Road, Beibei District, Chongqing, 400714, China.
  • Yuhui Wang
    School of Accounting, Harbin University of Commerce, Harbin 150028, Heilongjiang, China.
  • XiaoYang Tan
    College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, China. Electronic address: x.tan@nuaa.edu.cn.