Addi-Reg: A Better Generalization-Optimization Tradeoff Regularization Method for Convolutional Neural Networks.

Journal: IEEE Transactions on Cybernetics
Published Date:

Abstract

In convolutional neural networks (CNNs), injecting noise into intermediate features is an active research topic for improving generalization. Existing methods usually regularize CNNs by producing multiplicative noise (regularization weights), an approach called multiplicative regularization (Multi-Reg). However, Multi-Reg methods usually focus on improving generalization and fail to jointly consider optimization, which leads to unstable learning and slow convergence. Moreover, Multi-Reg methods are not flexible enough, since the regularization weights are generated from a fixed, manually designed distribution. In addition, most popular methods are not universal, because they are designed only for residual networks. In this article, we explore, for the first time, both experimentally and theoretically, the nature of generating noise in the intermediate features of popular CNNs. We demonstrate that injecting noise in the feature space can be transformed into generating noise in the input space, and that these methods regularize the networks in a Mini-batch in Mini-batch (MiM) sampling manner. Based on these observations, we further find that generating multiplicative noise can easily degrade optimization because of its high dependence on the intermediate feature. Based on these studies, we propose a novel additional regularization (Addi-Reg) method, which adaptively produces additional noise with low dependence on the intermediate features in CNNs by employing a series of mechanisms. In particular, these mechanisms stabilize the learning process during training, and Addi-Reg can learn the noise distribution specifically for each layer in a CNN. Extensive experiments demonstrate that the proposed Addi-Reg method is more flexible and universal than state-of-the-art Multi-Reg methods, while achieving better generalization performance with faster convergence.
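
The abstract's contrast between multiplicative noise and noise that depends less on the intermediate feature can be illustrated with a minimal sketch. This is not the Addi-Reg method itself, whose mechanisms are not described here. It only assumes a dropout-style mask as the multiplicative case and fixed-variance Gaussian noise as the additive case; the function names, `keep_prob`, and `sigma` are hypothetical choices made for illustration.

```python
import random


def multiplicative_noise(features, keep_prob, rng):
    """Dropout-style noise: each feature is multiplied by a random weight
    (0 or 1/keep_prob), so the perturbation is proportional to the feature."""
    out = []
    for x in features:
        mask = 1.0 if rng.random() < keep_prob else 0.0
        out.append(x * mask / keep_prob)
    return out


def additive_noise(features, sigma, rng):
    """Gaussian noise added to each feature; the noise is drawn
    independently of the feature value (an illustrative assumption)."""
    return [x + rng.gauss(0.0, sigma) for x in features]


def mean_abs_perturbation(original, noisy):
    """Average absolute change that the noise introduces."""
    return sum(abs(n - o) for o, n in zip(original, noisy)) / len(original)


rng = random.Random(0)
small = [0.1] * 1000   # low-magnitude intermediate features
large = [10.0] * 1000  # high-magnitude intermediate features

mult_small = mean_abs_perturbation(small, multiplicative_noise(small, 0.5, rng))
mult_large = mean_abs_perturbation(large, multiplicative_noise(large, 0.5, rng))
add_small = mean_abs_perturbation(small, additive_noise(small, 1.0, rng))
add_large = mean_abs_perturbation(large, additive_noise(large, 1.0, rng))

# Multiplicative perturbations grow with the feature magnitude,
# while additive perturbations stay at roughly the same scale.
print("multiplicative ratio (large/small):", mult_large / mult_small)
print("additive ratio (large/small):", add_large / add_small)
```

In this toy setting the multiplicative perturbation scales with the feature magnitude, whereas the additive perturbation does not, which is one way to read the abstract's "high dependence on the intermediate feature".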

Authors

  • Yao Lu
    Department of Laboratory Medicine, The First Affiliated Hospital of Ningbo University, Ningbo First Hospital, Ningbo, China.
  • Zheng Zhang
    Key Laboratory of Sustainable and Development of Marine Fisheries, Ministry of Agriculture and Rural Affairs, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, PR China.
  • Guangming Lu
    Department of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China.
  • Yicong Zhou
  • Jinxing Li
    Department of NanoEngineering, University of California San Diego, La Jolla, California 92093, United States.
  • David Zhang
    Department of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China.