Block-term tensor neural networks.

Journal: Neural Networks: The Official Journal of the International Neural Network Society

Abstract

Deep neural networks (DNNs) have achieved outstanding performance in a wide range of applications, e.g., image classification and natural language processing. Despite this good performance, the huge number of parameters in DNNs makes training costly and hinders deployment on low-end devices with limited computing resources. In this paper, we exploit the correlations in the weight matrices and approximate them with low-rank block-term tensors. We call the resulting structure a block-term tensor layer (BT-layer); it can be easily adapted to neural network models such as CNNs and RNNs. In particular, the inputs and outputs of BT-layers are reshaped into low-dimensional high-order tensors with similar or improved representation power. Extensive experiments demonstrate that BT-layers in CNNs and RNNs achieve a very large compression ratio on the number of parameters while preserving or improving the representation power of the original DNNs.
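To illustrate the idea described in the abstract, the sketch below builds a minimal BT-style dense layer in NumPy: the weight matrix of a fully connected layer is viewed as an order-4 tensor and approximated by a sum of low-rank Tucker terms (a block-term decomposition), and the input is reshaped into an order-2 tensor before contraction. All dimensions, ranks, and function names here are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

# Hypothetical sizes: a 64x64 dense layer with input modes (8, 8)
# and output modes (8, 8); 4 block terms with Tucker ranks (2, 2).
m = (8, 8)          # input vector reshaped to an order-2 tensor
n = (8, 8)          # output vector reshaped to an order-2 tensor
C, r = 4, (2, 2)    # number of block terms, Tucker ranks

rng = np.random.default_rng(0)

# One Tucker term per block: a small core plus one factor per mode.
# The factor for mode k couples input dim m_k with output dim n_k.
cores = [rng.standard_normal(r) for _ in range(C)]
factors = [[rng.standard_normal((m[k], n[k], r[k])) for k in range(2)]
           for _ in range(C)]

def bt_weight(cores, factors):
    """Reconstruct the full weight tensor W[m1, m2, n1, n2]
    as a sum of Tucker terms (block-term decomposition)."""
    W = np.zeros(m + n)
    for G, (F1, F2) in zip(cores, factors):
        # Contract the core's rank indices against the two factors.
        W += np.einsum('ab,ixa,jyb->ijxy', G, F1, F2)
    return W

def bt_forward(x, cores, factors):
    """BT-layer forward pass: reshape, contract, reshape back."""
    xt = x.reshape(m)                      # (m1, m2)
    W = bt_weight(cores, factors)          # (m1, m2, n1, n2)
    y = np.einsum('ij,ijxy->xy', xt, W)    # (n1, n2)
    return y.reshape(-1)                   # flatten to output vector

x = rng.standard_normal(m[0] * m[1])
y = bt_forward(x, cores, factors)

dense_params = int(np.prod(m)) * int(np.prod(n))
bt_params = (sum(G.size for G in cores)
             + sum(F.size for Fs in factors for F in Fs))
print(y.shape, dense_params, bt_params)  # compression: 4096 vs 1040
```

In practice the full weight tensor would never be materialized; the input would be contracted directly with the cores and factors. Even in this tiny sketch, the parameter count drops from 4096 for the dense layer to 1040 for the BT parameterization, which is the kind of compression-versus-expressiveness trade-off the paper studies.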

Authors

  • Jinmian Ye
    SMILE Lab, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610031, China.
  • Guangxi Li
    Center for Quantum Software and Information, University of Technology Sydney, NSW 2007, Australia.
  • Di Chen
    Department of Gastroenterology, Hainan General Hospital, Hainan Affiliated Hospital of Hainan Medical University, Haikou, China. Electronic address: 2389446889@qq.com.
  • Haiqin Yang
    Shenzhen Key Laboratory of Rich Media Big Data Analytics and Application, Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong; Computer Science & Engineering, The Chinese University of Hong Kong, Hong Kong. Electronic address: hqyang@ieee.org.
  • Shandian Zhe
    Department of Computer Science, University of Utah, Salt Lake City, 84112 Utah, USA.
  • Zenglin Xu
    Big Data Research Center, University of Electronic Science & Technology, Chengdu, Sichuan, China; School of Computer Science and Engineering, University of Electronic Science & Technology, Chengdu, Sichuan, China. Electronic address: zlxu@uestc.edu.cn.