Pushing the Limit of Post-Training Quantization
Journal:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Published Date:
Jul 1, 2025
Abstract
Recently, post-training quantization (PTQ) has become the de facto way to produce efficient low-precision neural networks without lengthy retraining. Despite its low cost, existing PTQ methods fail under extremely low-bit settings. In this work, we delve into extremely low-bit quantization and construct a unified theoretical analysis that explains why low-bit quantization fails. According to this theoretical study, we argue that existing methods break down in low-bit schemes because of significant weight perturbation and the neglect of activation quantization. To address these two challenges, we propose Brecq and QDrop, respectively, and build the Q-Limit framework upon them. The Q-Limit framework is further extended to support mixed-precision quantization. To the best of our knowledge, this is the first work to push the limit of PTQ down to INT2. We conduct extensive experiments on a variety of handcrafted and searched neural architectures, covering visual recognition, detection, and language processing tasks. Without bells and whistles, our PTQ framework attains low-bit ResNet and MobileNetV2 models comparable to quantization-aware training (QAT), establishing a new state of the art for PTQ.
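To illustrate why extremely low-bit PTQ is hard, the following is a minimal sketch of symmetric per-tensor uniform quantization; it shows how the weight perturbation the abstract refers to grows as the bit-width drops toward INT2. The function name `quantize_uniform` and the toy Gaussian weights are illustrative assumptions, not the paper's Brecq, QDrop, or Q-Limit implementations.

```python
import numpy as np

def quantize_uniform(w, num_bits):
    """Symmetric per-tensor uniform quantizer (illustrative only)."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 1 for INT2, 127 for INT8
    scale = np.abs(w).max() / qmax          # map the largest magnitude onto the last grid point
    w_int = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return w_int * scale                    # de-quantize back to float for comparison

# Toy weights: the perturbation ||w - Q(w)||^2 grows sharply as the bit-width drops.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=10_000).astype(np.float32)
for bits in (8, 4, 2):
    w_q = quantize_uniform(w, bits)
    print(f"INT{bits}: mean squared perturbation = {np.mean((w - w_q) ** 2):.2e}")
```

Running this sketch shows the quantization error increasing by orders of magnitude from INT8 to INT2, which is the regime where the paper argues that naive PTQ collapses and block-wise reconstruction (Brecq) plus activation-aware dropping of quantization noise (QDrop) become necessary.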