Object and spatial discrimination makes weakly supervised local feature better.

Journal: Neural Networks: The Official Journal of the International Neural Network Society
PMID:

Abstract

Local feature extraction plays a crucial role in numerous critical visual tasks. However, there remains room for improvement in both descriptors and keypoints, particularly regarding the discriminative power of descriptors and the localization precision of keypoints. To address these challenges, this study introduces a novel local feature extraction pipeline named OSDFeat (Object and Spatial Discrimination Feature). OSDFeat employs a decoupling strategy, training the descriptor and detection networks independently. Inspired by semantic correspondence, we propose an Object and Spatial Discrimination ResUNet (OSD-ResUNet). OSD-ResUNet captures cues in the feature map that differentiate object appearance and spatial context, thus enhancing descriptor performance. To further improve the discriminative capability of the descriptors, we propose a Discrimination Information Retained Normalization module (DIRN). DIRN complementarily integrates spatial-wise and channel-wise normalization, yielding descriptors that are more distinguishable and informative. In the detection network, we propose a Cross Saliency Pooling module (CSP). CSP employs a cross-shaped kernel to aggregate long-range context along both the vertical and horizontal dimensions. By enhancing the saliency of keypoints, CSP enables the detection network to exploit descriptor information effectively and to localize keypoints more precisely. Compared with the previous best local feature extraction methods, OSDFeat achieves a Mean Matching Accuracy of 79.4% on the local feature matching task, a 1.9% improvement and a state-of-the-art result. OSDFeat also achieves competitive results in Visual Localization and 3D Reconstruction. These results indicate that object and spatial discrimination can improve the accuracy and robustness of local features, even in challenging environments. The code is available at https://github.com/pandaandyy/OSDFeat.
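The abstract specifies the design intent of DIRN and CSP but not their exact formulations. The sketch below is a minimal PyTorch illustration, assuming that DIRN's channel-wise branch is a per-location L2 normalization, that its spatial-wise branch standardizes each channel over the spatial plane, and that CSP's cross-shaped kernel can be approximated by strip pooling over full rows and columns followed by a sigmoid gate. The class names, the 1x1 fusion convolution, and the gating are illustrative assumptions, not the authors' implementation (see the repository above for that).

import torch
import torch.nn as nn
import torch.nn.functional as F

class DIRNSketch(nn.Module):
    """Hypothetical stand-in for DIRN: complementarily fuses channel-wise
    and spatial-wise normalization of a descriptor feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)  # assumed fusion

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel-wise: unit-length descriptor at every spatial location.
        chan = F.normalize(x, p=2, dim=1)
        # Spatial-wise: standardize each channel over its H x W plane.
        mu = x.mean(dim=(2, 3), keepdim=True)
        sigma = x.std(dim=(2, 3), keepdim=True) + 1e-6
        spat = (x - mu) / sigma
        # Integrate the two normalized views.
        return self.fuse(torch.cat([chan, spat], dim=1))

class CSPSketch(nn.Module):
    """Hypothetical stand-in for Cross Saliency Pooling: strip pooling
    over full rows and columns approximates a cross-shaped kernel."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv_v = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv_h = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.gate = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        # Vertical strip: average across the width, one value per row.
        vert = self.conv_v(F.adaptive_avg_pool2d(x, (h, 1)))    # (N, C, H, 1)
        # Horizontal strip: average across the height, one value per column.
        horz = self.conv_h(F.adaptive_avg_pool2d(x, (1, w)))    # (N, C, 1, W)
        # Broadcasting the sum gives each position the context of its full
        # row and column, i.e. a cross-shaped long-range receptive field.
        cross = F.relu(vert + horz)                             # (N, C, H, W)
        # Re-weight the input by the resulting saliency map.
        return x * torch.sigmoid(self.gate(cross))

x = torch.randn(1, 128, 60, 80)      # a dense feature map from the backbone
print(DIRNSketch(128)(x).shape)      # torch.Size([1, 128, 60, 80])
print(CSPSketch(128)(x).shape)       # torch.Size([1, 128, 60, 80])

Summing the row and column strips is the usual low-cost way to emulate a cross-shaped kernel: each output position aggregates its entire row and column without the quadratic cost of full spatial attention.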

Authors

  • Yifan Yin
    School of Computer and Electronic Information, Guangxi University, Nanning, China. Electronic address: yifanyin_panda@163.com.
  • Mengxiao Yin
    School of Computer and Electronic Information, Guangxi University, Nanning, China; Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, China. Electronic address: ymx@gxu.edu.cn.
  • Yunhui Xiong
    School of Mathematics, South China University of Technology, Guangzhou, China. Electronic address: yhxiong@scut.edu.cn.
  • Pengfei Lai
    School of Computer and Electronic Information, Guangxi University, Nanning, China. Electronic address: henryLaiVIP@163.com.
  • Kan Chang
    School of Computer and Electronic Information, Guangxi University, Nanning, Guangxi 530004, China; Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, Guangxi 530004, China. Electronic address: changkan0@gmail.com.
  • Feng Yang