Towards Convolutional Neural Network Acceleration and Compression Based on Simon K-Means.

Journal: Sensors (Basel, Switzerland)

Abstract

Convolutional Neural Networks (CNNs) are popular models widely used in image classification, target recognition, and other fields. Model compression is a common step in deploying neural networks on embedded devices, and it typically relies on a retraining stage. However, retraining the weights to compensate for the loss of precision is time-consuming. Unlike prior designs, we propose a novel model compression approach based on K-means that is specifically designed to support a hardware acceleration scheme. First, we propose an extension algorithm, named Simon K-means, built on simple K-means. We use Simon K-means to cluster the trained weights of the convolutional and fully connected layers. Second, we reduce the hardware resources consumed by data movement and storage through a data storage and indexing scheme. Finally, we present a hardware implementation of the compressed CNN accelerator. Our evaluations on several classification tasks show that our design achieves 5.27× compression and eliminates 74.3% of the multiply-accumulate (MAC) operations in AlexNet on the FASHION-MNIST dataset.
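The core idea in the abstract — clustering a layer's scalar weights so each weight is stored as a small cluster index into a shared codebook — can be sketched with plain 1-D k-means. This is an illustrative stand-in, not the paper's extended variant; the `kmeans_1d` helper and all parameter choices below are assumptions for the sketch.

```python
import random

def kmeans_1d(weights, k, iters=20, seed=0):
    """Cluster scalar weights into k centroids using plain 1-D k-means
    (a simplified stand-in for the paper's extended K-means variant)."""
    rng = random.Random(seed)
    centroids = rng.sample(weights, k)   # initialize centroids from the data
    assign = [0] * len(weights)
    for _ in range(iters):
        # assignment step: map each weight to its nearest centroid
        assign = [min(range(k), key=lambda c: abs(w - centroids[c]))
                  for w in weights]
        # update step: move each centroid to the mean of its cluster
        for c in range(k):
            members = [w for w, a in zip(weights, assign) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assign

# Toy "layer" of 1000 weights quantized to 16 shared values:
# each weight is then representable by a 4-bit index plus a 16-entry
# codebook, instead of a full-precision float.
rng = random.Random(1)
weights = [rng.gauss(0.0, 0.05) for _ in range(1000)]
codebook, indices = kmeans_1d(weights, k=16)
print(len(codebook), len(set(indices)))
```

Storing indices instead of full-precision weights is what enables the data storage and indexing scheme the abstract describes: the accelerator moves short indices and looks up shared centroid values, cutting both memory traffic and redundant MAC operands.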

Authors

  • Mingjie Wei
    The College of Computer Science, National University of Defence Technology, Changsha 410000, China.
  • Yunping Zhao
    The College of Computer Science, National University of Defence Technology, Changsha 410000, China.
  • Xiaowen Chen
    The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
  • Chen Li
    School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, China.
  • Jianzhuang Lu
    The College of Computer Science, National University of Defence Technology, Changsha 410000, China.