A Multi-Group Multi-Stream attribute Attention network for fine-grained zero-shot learning.

Journal: Neural networks : the official journal of the International Neural Network Society
PMID:

Abstract

Fine-grained visual categorization in the zero-shot setting is a challenging problem in computer vision. It requires algorithms to accurately identify fine-grained categories that do not appear during training and that are visually very similar to each other. Existing methods usually address this problem by using attribute information as intermediate knowledge, which captures the fine-grained characteristics of categories and can be transferred from seen to unseen categories. However, learning attribute visual features is not trivial, for two reasons: (i) the visual information about attributes of different types may interfere with each other's visual feature learning, and (ii) the visual characteristics of the same attribute may vary across categories. To address these issues, we propose a Multi-Group Multi-Stream attribute Attention network (MGMSA), which not only separates the feature learning of attributes of different types, but also isolates the learning of attribute visual features for categories with large differences in attribute appearance. This avoids interference between uncorrelated attributes and helps to learn category-specific, attribute-related visual features, which in turn helps to distinguish fine-grained categories with subtle visual differences. Extensive experiments on benchmark datasets show that MGMSA achieves state-of-the-art performance on attribute prediction and fine-grained zero-shot learning.
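
The idea of separating feature learning by attribute type can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no architectural details, so the function names, the dot-product attention, the sigmoid scoring, and the toy attribute groups ("color", "shape") below are all assumptions made for illustration. Each attribute group gets its own attention query over image-region features, so attributes in one group are scored from a pooled feature that the other groups do not influence. The multi-stream part (separate streams for categories whose attribute appearance differs) is not sketched, because the abstract does not say how categories are assigned to streams.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def group_attention_pool(regions, query):
    """Attend over region features with one group's query (assumed dot-product attention).

    Returns the attention weights and the attention-weighted sum of region features.
    """
    weights = softmax([dot(r, query) for r in regions])
    dim = len(regions[0])
    pooled = [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]
    return weights, pooled


def predict_attributes(regions, groups):
    """Score every attribute from the pooled feature of its own group only.

    groups: {group_name: {"query": [...], "attr_weights": {attr_name: [...]}}}
    (a hypothetical layout chosen for this sketch).
    """
    scores = {}
    for group in groups.values():
        _, pooled = group_attention_pool(regions, group["query"])
        for attr, w in group["attr_weights"].items():
            scores[attr] = 1.0 / (1.0 + math.exp(-dot(pooled, w)))  # sigmoid score
    return scores


# Toy data: 6 image regions with 4-dimensional features, random parameters.
random.seed(0)
DIM = 4


def rand_vec():
    return [random.uniform(-1.0, 1.0) for _ in range(DIM)]


regions = [rand_vec() for _ in range(6)]
groups = {
    "color": {
        "query": rand_vec(),
        "attr_weights": {"red_wing": rand_vec(), "blue_wing": rand_vec()},
    },
    "shape": {
        "query": rand_vec(),
        "attr_weights": {"hooked_bill": rand_vec()},
    },
}

weights, _ = group_attention_pool(regions, groups["color"]["query"])
scores = predict_attributes(regions, groups)
```

In this sketch, changing the "shape" query or its attribute weights leaves the "color" scores untouched, which is the kind of isolation between uncorrelated attribute types that the abstract describes.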

Authors

  • Lingyun Song
    School of Computer Science, Northwestern Polytechnical University, Xi'an, 710129, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, 710129, China. Electronic address: lysong@nwpu.edu.cn.
  • Xuequn Shang
  • Ruizhi Zhou
    Department of Radiology, the Affiliated Hospital of Qingdao University, Qingdao, Shandong, China.
  • Jun Liu
    Department of Radiology, Second Xiangya Hospital, Changsha, Hunan, China.
  • Jie Ma
    Respiratory Department, Beijing Hospital of Integrated Traditional Chinese and Western Medicine, Beijing, China.
  • Zhanhuai Li
    School of Computer Science, Northwestern Polytechnical University, Xi'an, 710129, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, 710129, China. Electronic address: lizhh@mail.nwpu.edu.cn.
  • Mingxuan Sun