PrivCore: Multiplication-activation co-reduction for efficient private inference.

Journal: Neural networks : the official journal of the International Neural Network Society
PMID:

Abstract

Combining deep neural networks (DNNs) with secure two-party computation (2PC) enables private inference (PI) over encrypted client-side data and server-side models with both privacy and accuracy guarantees, but at the cost of orders-of-magnitude communication and latency penalties. Prior work on designing PI-friendly network architectures is confined to mitigating the overhead of non-linear operations (e.g., ReLU), assuming that linear computations are free. Recent work has shown that linear convolutions can no longer be ignored and are responsible for the majority of communication in PI protocols. In this work, we present PrivCore, a framework that jointly optimizes the alternating linear and non-linear DNN operators through a careful co-design of sparse Winograd convolution and fine-grained activation reduction, making ciphertext computation more efficient without compromising inference accuracy. Specifically, because spatial pruning is incompatible with Winograd convolution, we propose a two-tiered Winograd-aware structured pruning method that removes spatial filters and Winograd vectors in a coarse-to-fine manner to reduce multiplications; both tiers are optimized for Winograd convolution and follow a structured pattern. PrivCore further develops a novel sensitivity-based differentiable activation approximation that automates the selection of ineffectual ReLUs and of polynomial options. PrivCore also supports the dynamic determination of coefficient-adaptive polynomial replacements to mitigate accuracy degradation. Extensive experiments on various models and datasets consistently validate the effectiveness of PrivCore, which achieves a 2.2× communication reduction with 1.8% higher accuracy compared with SENet (ICLR 2023) on CIFAR-100, and a 2.0× total communication reduction at iso-accuracy compared with CoPriv (NeurIPS 2023) on ImageNet.
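
To illustrate why Winograd convolution reduces multiplications, the following is a minimal sketch of the standard 1D Winograd F(2,3) algorithm, which produces two outputs of a 3-tap filter with 4 element-wise multiplications instead of 6. It is not taken from PrivCore; the function names are hypothetical, and the nonzero count at the end is only an assumed illustration of how zeros in the Winograd domain could translate into skipped multiplications.

```python
# Sketch of textbook 1D Winograd F(2,3); not PrivCore's implementation.
# All function names here are hypothetical and chosen for illustration.

def direct_conv(d, g):
    """Sliding-window form: 2 outputs from 4 inputs, 6 multiplications."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def transform_filter(g):
    """Map a 3-tap filter into the Winograd domain (done once per filter)."""
    g0, g1, g2 = g
    return [g0, (g0 + g1 + g2) / 2, (g0 - g1 + g2) / 2, g2]


def winograd_f23(d, u):
    """Compute the same 2 outputs with 4 element-wise multiplications."""
    d0, d1, d2, d3 = d
    # Input transform uses additions and subtractions only.
    v = [d0 - d2, d1 + d2, d2 - d1, d1 - d3]
    # The only multiplications: one per Winograd-domain element.
    m = [vi * ui for vi, ui in zip(v, u)]
    # Output transform, again additions and subtractions only.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


def nonzero_multiplications(u):
    """Assumed illustration: a zero Winograd-domain weight needs no multiply."""
    return sum(1 for ui in u if ui != 0)


d = [1.0, 2.0, 3.0, 4.0]
g = [1.0, 0.0, -1.0]
u = transform_filter(g)
print(direct_conv(d, g), winograd_f23(d, u), nonzero_multiplications(u))
```

Because the multiplications happen on the transformed filter rather than on the spatial filter, zeros introduced in the spatial domain do not generally survive the transform, which is one way to read the incompatibility between spatial pruning and Winograd convolution noted in the abstract.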

Authors

  • Zhi Pang
    Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan, 430072, China.
  • Lina Wang
    Department of Biochemistry and Molecular Biology, Shandong University School of Medicine, Jinan, P. R. China; Central Laboratory, The Second Hospital of Shandong University, Jinan, P. R. China.
  • Fangchao Yu
  • Kai Zhao
    Department of Gastroenterology, Tongji Hospital of Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
  • Bo Zeng
    National Research Base of Intelligent Manufacturing Service, Chongqing Technology and Business University, Chongqing 400067, China.
  • Shuwang Xu
    Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, China. Electronic address: shuwangxu@whu.edu.cn.