Multilevel Attention Networks and Policy Reinforcement Learning for Image Caption Generation.

Journal: Big data
Published Date:

Abstract

The analysis of large-scale multimodal data has become very popular recently. Image captioning, whose goal is to describe the content of image with natural language automatically, is an essential and challenging task in artificial intelligence. Commonly, most existing image caption methods utilize the mixture of Convolutional Neural Network and Recurrent Neural Network framework. These methods either pay attention to global representation at the image level or only focus on the specific concepts, such as regions and objects. To make the most of characteristics about a given image, in this study, we present a novel model named Multilevel Attention Networks and Policy Reinforcement Learning for image caption generation. Specifically, our model is composed of a multilevel attention network module and a policy reinforcement learning module. In the multilevel attention network, the object-attention network aims to capture global and local details about objects, whereas the region-attention network obtains global and local features about regions. After that, a policy reinforcement learning algorithm is adopted to overcome the exposure bias problem in the training phase and solve the loss-evaluation mismatching problem at the caption generation stage. With the attention network and policy algorithm, our model can automatically generate accurate and natural sentences for any particular image. We carry out extensive experiments on the MSCOCO and Flickr30k data sets, demonstrating that our model is superior to other competitive methods.

Authors

  • Zhibo Zhou
    School of Computer Science and Engineering, Beihang University, Beijing, China.
  • Xiaoming Zhang
    Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
  • Zhoujun Li
  • Feiran Huang
  • Jie Xu
    Second Affiliated Hospital Zhejiang University School of Medicine, Hangzhou, Zhejiang Province 310000, China.