Deep Neural Networks with Multistate Activation Functions.

Journal: Computational Intelligence and Neuroscience
Published Date:

Abstract

We propose multistate activation functions (MSAFs) for deep neural networks (DNNs). These MSAFs are a new kind of activation function capable of representing more than two states, including the N-order MSAFs and the symmetrical MSAF. DNNs with these MSAFs can be trained via conventional stochastic gradient descent (SGD) as well as mean-normalised SGD. We also discuss how these MSAFs perform when used to resolve classification problems. Experimental results on the TIMIT corpus reveal that, on speech recognition tasks, DNNs with MSAFs outperform conventional DNNs, achieving a relative improvement of 5.60% in phoneme error rate. Further experiments also reveal that mean-normalised SGD facilitates the training of DNNs with MSAFs, especially with large training sets. The models can also be trained directly without pretraining when the training set is sufficiently large, which results in a considerable relative improvement of 5.82% in word error rate.
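The abstract does not spell out the functional form of an MSAF. As a rough illustration only, the sketch below assumes an N-order MSAF built as a sum of shifted logistic sigmoids, so that the output plateaus at more than two levels; the offsets and the helper names (logistic, msaf) are illustrative and not taken from the paper.

    import numpy as np

    def logistic(x):
        # Standard logistic sigmoid, saturating at 0 and 1.
        return 1.0 / (1.0 + np.exp(-x))

    def msaf(x, offsets=(0.0, 4.0)):
        # Illustrative multistate activation (assumed form): a sum of
        # logistic sigmoids shifted by 'offsets'. With N offsets the
        # output plateaus near 0, 1, ..., N, i.e. more than two states.
        return sum(logistic(x - c) for c in offsets)

    # Quick check: a large negative input sits near 0, an input between
    # the two offsets sits near 1, and a large positive input sits near 2.
    xs = np.array([-10.0, 2.0, 14.0])
    print(msaf(xs))  # approximately [0., 1., 2.]

Under this assumed form, the derivative is simply a sum of standard sigmoid gradients, which is consistent with the abstract's claim that such networks can be trained with plain SGD or mean-normalised SGD.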

Authors

  • Chenghao Cai
    School of Technology, Beijing Forestry University, No. 35 Qinghuadong Road, Haidian District, Beijing 100083, China.
  • Yanyan Xu
    School of Information Science and Technology, Beijing Forestry University, No. 35 Qinghuadong Road, Haidian District, Beijing 100083, China.
  • Dengfeng Ke
    Institute of Automation, Chinese Academy of Sciences, No. 95 Zhongguancundong Road, Haidian District, Beijing 100190, China.
  • Kaile Su
College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, No. 688 Yingbin Road, Jinhua 321004, China.