Intrinsic plasticity coding improved spiking actor network for reinforcement learning.

Journal: Neural networks : the official journal of the International Neural Network Society
PMID:

Abstract

Deep reinforcement learning (DRL) exploits the powerful representational capabilities of deep neural networks (DNNs) and has achieved significant success. Compared to DNNs, however, spiking neural networks (SNNs), which operate on binary signals, more closely resemble the efficient learning observed in the brain. In SNNs, spiking neurons exhibit complex dynamics and learn according to principles of biological plasticity. As in the brain's efficient computational mechanisms, information encoding plays a critical role in these networks. We propose an intrinsic plasticity coding improved spiking actor network (IP-SAN) for RL to achieve effective decision-making. IP-SAN integrates adaptive population coding at the network level with dynamic spiking neuron coding at the neuron level, improving spatiotemporal state representation and bringing the model closer to biological behavior. Experimental results show that IP-SAN outperforms several state-of-the-art methods on five continuous control tasks.
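
To make the two coding levels named in the abstract more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes Gaussian receptive fields for population coding and a leaky integrate-and-fire neuron whose threshold rises after each spike as a simple stand-in for intrinsic plasticity. The function names (population_encode, lif_adaptive) and all parameter values are hypothetical; the abstract does not specify the actual IP-SAN formulation.

```python
import numpy as np


def population_encode(state, n_neurons=10, lo=-1.0, hi=1.0, sigma=0.3):
    """Map each state dimension to n_neurons Gaussian receptive-field activations.

    Assumption: evenly spaced, fixed centers; an adaptive scheme would learn them.
    """
    state = np.asarray(state, dtype=float)
    centers = np.linspace(lo, hi, n_neurons)           # shape (n_neurons,)
    diff = state[:, None] - centers[None, :]           # shape (state_dim, n_neurons)
    return np.exp(-0.5 * (diff / sigma) ** 2)          # activations in (0, 1]


def lif_adaptive(drive, steps=20, decay=0.8, v_th0=0.5, th_inc=0.1, th_decay=0.9):
    """Simulate leaky integrate-and-fire neurons under a constant input drive.

    Assumption: the firing threshold grows by th_inc after each spike and
    decays back toward v_th0, a generic adaptive-threshold rule.
    """
    drive = np.asarray(drive, dtype=float).ravel()
    v = np.zeros_like(drive)                           # membrane potentials
    th_adapt = np.zeros_like(drive)                    # adaptive threshold offsets
    spikes = np.zeros((steps, drive.size), dtype=np.int8)
    for t in range(steps):
        v = decay * v + drive                          # leaky integration
        fired = v >= v_th0 + th_adapt                  # compare to current threshold
        spikes[t] = fired
        v = np.where(fired, 0.0, v)                    # hard reset after a spike
        th_adapt = th_decay * th_adapt + th_inc * fired
    return spikes


# Example: encode a 2-dimensional state with 5 neurons per dimension,
# then drive 10 spiking neurons with the resulting activations.
acts = population_encode([0.0, 0.5], n_neurons=5)
spike_train = lif_adaptive(acts)
```

In this sketch, neurons whose receptive-field center lies close to the state value receive a strong drive and fire often, while the rising threshold limits their rate over time; neurons far from the state value stay silent.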

Authors

  • Xingyue Liang
    School of Artificial Intelligence, Anhui University, Hefei, 230601, Anhui, China; Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Hefei, 230601, Anhui, China; Anhui Provincial Engineering Research Center for Unmanned Systems and Intelligent Technology, Hefei, 230601, Anhui, China. Electronic address: liangxy@stu.ahu.edu.cn.
  • Qiaoyun Wu
    School of Artificial Intelligence, Anhui University, Hefei, 230601, Anhui, China; Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Hefei, 230601, Anhui, China; Anhui Provincial Engineering Research Center for Unmanned Systems and Intelligent Technology, Hefei, 230601, Anhui, China. Electronic address: wuqiaoyun@ahu.edu.cn.
  • Wenzhang Liu
    School of Artificial Intelligence, Anhui University, Hefei, 230601, Anhui, China; Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Hefei, 230601, Anhui, China; Anhui Provincial Engineering Research Center for Unmanned Systems and Intelligent Technology, Hefei, 230601, Anhui, China. Electronic address: wzliu@ahu.edu.cn.
  • Yun Zhou
    MOE Key Lab of Environmental and Occupational Health, School of Public Health, Tongji Medical College, Huazhong University of Science & Technology, Wuhan 430030, China.
  • Chunyu Tan
    School of Artificial Intelligence, Anhui University, Hefei, 230601, Anhui, China; Engineering Research Center of Autonomous Unmanned System Technology, Ministry of Education, Hefei, 230601, Anhui, China; Anhui Provincial Engineering Research Center for Unmanned Systems and Intelligent Technology, Hefei, 230601, Anhui, China. Electronic address: cytan@ahu.edu.cn.
  • Hongfu Yin
    School of Artificial Intelligence, Anhui University, Hefei, 230601, Anhui, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, 230601, Anhui, China. Electronic address: yinhf@stu.ahu.edu.cn.
  • Changyin Sun
    School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China. Electronic address: cys@ustb.edu.cn.