RePaIR: Repaired pruning at initialization resilience.

Journal: Neural Networks: The Official Journal of the International Neural Network Society
Published Date:

Abstract

Over the past decade, neural network models have grown steadily in both width and depth, driving increasing interest in neural network pruning. Unstructured pruning provides fine-grained sparsity and achieves better inference acceleration with specific hardware support. Unstructured Pruning at Initialization (PaI) streamlines the iterative pruning pipeline, but sparse weights increase the risk of underfitting during training. More importantly, almost all PaI algorithms focus only on finding the best pruning mask, without considering whether the retained weights are suitable for training. Introducing Lipschitz constants during model initialization can reduce the risk of both underfitting and overfitting. Accordingly, we first analyze the impact of Lipschitz initialization on model training and propose the Repaired Initialization (ReI) algorithm for common modules with BatchNorm. We then apply the same idea to repair the weights of unstructured pruned models, yielding the Repaired Pruning at Initialization Resilience (RePaIR) algorithm. Extensive experiments demonstrate that our proposed ReI and RePaIR improve the training robustness of unpruned and pruned models, respectively, achieving up to 1.7% accuracy gain with the same sparse pruning mask on TinyImageNet. Furthermore, we provide an improved SynFlow algorithm called Repair SynFlow (ReSynFlow), which employs Lipschitz scaling to overcome the problem of score computation in deeper models. ReSynFlow effectively raises the maximum compression rate, is well suited to deeper models, and improves accuracy by up to 1.3% over SynFlow on TinyImageNet.
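For context, the SynFlow saliency that ReSynFlow builds on can be sketched as follows. This is a minimal NumPy illustration of the original SynFlow score (the magnitude of weight times gradient of a forward pass of an all-ones input through the absolute-value network) for a stack of fully connected layers; it is not the authors' code, and the Lipschitz-scaling repair introduced by ReSynFlow is not reproduced here.

```python
import numpy as np

def synflow_scores(weights):
    """SynFlow saliency |theta * dR/dtheta| for a stack of linear layers,
    where R = 1^T |W_L| ... |W_1| 1 (forward pass of an all-ones input
    through the network with absolute-value weights)."""
    A = [np.abs(W) for W in weights]            # absolute-value network
    acts = [np.ones(A[0].shape[1])]             # a_0 = all-ones input
    for M in A:                                 # forward: a_l = |W_l| a_{l-1}
        acts.append(M @ acts[-1])
    g = np.ones(A[-1].shape[0])                 # dR/da_L = all-ones
    scores = [None] * len(A)
    for l in range(len(A) - 1, -1, -1):
        # dR/dW_l = sign(W_l) * outer(g, a_{l-1}); multiplying by W_l and
        # taking |.| leaves |W_l| * outer(g, a_{l-1}) (g and a are nonnegative)
        scores[l] = A[l] * np.outer(g, acts[l])
        g = A[l].T @ g                          # backprop through |W_l|
    return scores
```

A characteristic property visible in this sketch is layer-wise conservation: the scores in every layer sum to the same value R, which is why deep models push R (and the scores) toward numerical overflow or underflow, the score-computation problem that ReSynFlow's Lipschitz scaling is designed to address.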

Authors

  • Haocheng Zhao
    Institute of Deep Perception Technology, JITRI, 214000, Wuxi, China; Department of Electrical Engineering and Electronics, University of Liverpool, L69 3BX, Liverpool, United Kingdom; School of Advanced Technology, Xi'an Jiaotong-Liverpool University, 215123, Suzhou, China; XJTLU-JITRI Academy of Technology, Xi'an Jiaotong-Liverpool University, 215123, Suzhou, China. Electronic address: Haocheng.Zhao19@student.xjtlu.edu.cn.
  • Runwei Guan
    Institute of Deep Perception Technology, JITRI, 214000, Wuxi, China; Department of Electrical Engineering and Electronics, University of Liverpool, L69 3BX, Liverpool, United Kingdom; School of Advanced Technology, Xi'an Jiaotong-Liverpool University, 215123, Suzhou, China; XJTLU-JITRI Academy of Technology, Xi'an Jiaotong-Liverpool University, 215123, Suzhou, China. Electronic address: Runwei.Guan21@student.xjtlu.edu.cn.
  • Ka Lok Man
    School of Advanced Technology, Xi'an Jiaotong-Liverpool University, 215123, Suzhou, China. Electronic address: Ka.Man@xjtlu.edu.cn.
  • Limin Yu
    Department of Pathology, Beaumont Hospital, Royal Oak, Michigan.
  • Yutao Yue
    Institute of Deep Perception Technology, JITRI, 214000, Wuxi, China; XJTLU-JITRI Academy of Technology, Xi'an Jiaotong-Liverpool University, 215123, Suzhou, China; Thrust of Artificial Intelligence and Thrust of Intelligent Transportation, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, 511400, China. Electronic address: yueyutao@hkust-gz.edu.cn.