Feature Aggregation With Reinforcement Learning for Video-Based Person Re-Identification.

Journal: IEEE Transactions on Neural Networks and Learning Systems
Published Date:

Abstract

Video-based person re-identification (re-id) matches two tracks of persons captured by different cameras. Features are extracted from the images of a sequence and then aggregated into a track feature. Unlike existing works that aggregate frame features by simply averaging them or by using temporal models such as recurrent neural networks, we propose an intelligent feature aggregation method based on reinforcement learning. Specifically, we train an agent to determine which frames in the sequence should be discarded during aggregation, which can be treated as a decision-making process. In this way, the proposed method avoids introducing noisy information from the sequence and retains the valuable frames when generating a track feature. Experimental results on benchmark data sets show that our method clearly improves re-id accuracy when built on state-of-the-art models.
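
The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the agent, its state, or its reward, so the function `placeholder_policy` below is a hypothetical stand-in that keeps a frame when its cosine similarity to the plain average is at least an assumed `threshold`. Only the overall shape (a per-frame keep/discard decision followed by averaging the kept frames) follows the text; all names and values are illustrative.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def placeholder_policy(frame_features, threshold=0.5):
    """Hypothetical stand-in for the trained agent (an assumption).

    Returns one keep/discard decision per frame: a frame is kept when its
    cosine similarity to the plain average of all frames is >= threshold.
    """
    center = mean_vector(frame_features)
    return [cosine(f, center) >= threshold for f in frame_features]


def aggregate_track(frame_features, keep):
    """Average only the frames marked as kept to form the track feature.

    Falls back to averaging every frame if the decisions discard all of them.
    """
    kept = [f for f, k in zip(frame_features, keep) if k]
    if not kept:
        kept = frame_features
    return mean_vector(kept)


# Toy 2-D frame features; the last frame plays the role of a noisy outlier.
frames = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1], [-1.0, 0.0]]
keep = placeholder_policy(frames)
track_feature = aggregate_track(frames, keep)
```

In the method the abstract describes, the keep/discard decisions would come from a policy trained with reinforcement learning rather than from a fixed similarity threshold.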

Authors

  • Wei Zhang
    The First Affiliated Hospital of Nanchang University, Nanchang, China.
  • Xuanyu He
  • Weizhi Lu
  • Hong Qiao
    State Key Lab of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Chinese Academy of Sciences Center for Excellence in Brain Science and Intelligence Technology, Shanghai, China; University of Chinese Academy of Sciences, Beijing, China.
  • Yibin Li
    Institute of Food Science and Technology, Fujian Academy of Agricultural Sciences, Fuzhou 350003, China.