ADFQ-ViT: Activation-Distribution-Friendly post-training Quantization for Vision Transformers.

Journal: Neural Networks: The Official Journal of the International Neural Network Society

Abstract

Vision Transformers (ViTs) have exhibited exceptional performance across diverse computer vision tasks, but their substantial parameter counts incur significantly increased memory and computational demands, impeding efficient inference on resource-constrained devices. Quantization has emerged as a promising solution to mitigate these challenges, yet existing methods still suffer from significant accuracy loss at low bit-widths. We attribute this issue to the distinctive distributions of post-LayerNorm and post-GELU activations within ViTs, which render conventional hardware-friendly quantizers ineffective, particularly in low-bit scenarios. To address this issue, we propose a novel framework called Activation-Distribution-Friendly post-training Quantization for Vision Transformers, ADFQ-ViT. Concretely, we introduce the Per-Patch Outlier-aware Quantizer to tackle irregular outliers in post-LayerNorm activations. This quantizer refines the granularity of the uniform quantizer to a per-patch level while retaining a minimal subset of values exceeding a threshold at full precision. To handle the non-uniform distribution of post-GELU activations across the positive and negative regions, we design the Shift-Log2 Quantizer, which shifts all elements into the positive region and then applies log2 quantization. Moreover, we present Attention-score enhanced Module-wise Optimization, which adjusts the parameters of each quantizer by reconstructing errors to further mitigate quantization error. Extensive experiments demonstrate that ADFQ-ViT provides significant improvements over various baselines on image classification, object detection, and instance segmentation tasks at 4-bit precision. Specifically, when quantizing the ViT-B model to 4-bit, we achieve a 5.17% improvement in Top-1 accuracy on the ImageNet dataset. Our code is available at: https://github.com/llwx593/adfq-vit.git.
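The two quantizers described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation (see the linked repository for that): the function names, the fixed outlier threshold `thresh`, and the rounding details are illustrative assumptions; only the high-level ideas — per-patch uniform quantization with full-precision outlier retention, and shift-then-log2 quantization — come from the abstract.

```python
import numpy as np

def per_patch_outlier_quantize(x, bits=4, thresh=1.0):
    """Sketch of a Per-Patch Outlier-aware Quantizer.

    Uniform quantization at per-patch granularity (one scale per row of
    x, shaped (patches, channels)); values whose magnitude exceeds
    `thresh` are retained at full precision. `thresh` and the scale rule
    are illustrative, not the paper's exact formulation.
    """
    qmax = 2 ** (bits - 1) - 1
    outlier = np.abs(x) > thresh
    out = np.empty_like(x)
    for p in range(x.shape[0]):
        inlier = x[p][~outlier[p]]
        # per-patch scale from the inlier range only
        scale = np.abs(inlier).max() / qmax if inlier.size else 1.0
        scale = max(scale, 1e-8)
        q = np.clip(np.round(x[p] / scale), -qmax - 1, qmax)
        out[p] = q * scale
    out[outlier] = x[outlier]  # keep outliers at full precision
    return out

def shift_log2_quantize(x, bits=4):
    """Sketch of a Shift-Log2 Quantizer.

    Shifts all elements into the positive region, then quantizes the
    exponent to a power of two (log2 quantization), and shifts back.
    """
    shift = x.min()                      # shift so the minimum maps to zero
    x_pos = x - shift
    s = max(x_pos.max(), 1e-8)           # scale so values fall in [0, 1]
    levels = 2 ** bits - 1
    # quantize the (negative) log2 of the normalized value
    exp = np.round(-np.log2(np.clip(x_pos / s, 1e-8, 1.0)))
    exp = np.clip(exp, 0, levels)
    return s * 2.0 ** (-exp) + shift     # dequantize and shift back
```

A quick usage example: on a GELU-like activation vector, `shift_log2_quantize` allocates fine levels near zero (where post-GELU values cluster) and coarse levels for large positives, while `per_patch_outlier_quantize` leaves a large outlier such as `5.0` untouched.

```python
acts = np.array([-0.15, -0.05, 0.0, 0.3, 1.2, 2.7])
deq = shift_log2_quantize(acts, bits=4)

patches = np.array([[0.1, -0.2, 5.0], [0.05, 0.3, -0.1]])
pq = per_patch_outlier_quantize(patches, bits=4, thresh=1.0)
```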

Authors

  • Yanfeng Jiang
  • Ning Sun
  • Xueshuo Xie
    Haihe Lab of ITAI, Tianjin, China.
  • Fei Yang
  • Tao Li