Auditory feature representation using convolutional restricted Boltzmann machine and Teager energy operator for speech recognition.
Journal:
The Journal of the Acoustical Society of America
Published Date:
Jun 1, 2017
Abstract
In this letter, the authors propose an auditory feature representation technique in which the filterbank is learned using an annealing dropout convolutional restricted Boltzmann machine (ConvRBM) and noise-robust energy estimation is performed using the Teager energy operator (TEO). The TEO is applied to each subband of the ConvRBM filterbank, and the results are then pooled to obtain short-term spectral features. Experiments on the AURORA 4 database show that the proposed features perform better than Mel filterbank features. Relative improvements in word error rate of 2.59%-11.63% and 1.26%-6.87% are achieved using the time delay neural network and the bidirectional long short-term memory models, respectively.
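The subband energy step described in the abstract can be illustrated with a minimal sketch. It uses the standard discrete-time form of the Teager energy operator, psi[n] = x[n]^2 - x[n-1] x[n+1], and is not the authors' implementation. Several details are assumptions made only for illustration: the filters passed in are arbitrary stand-ins for a learned ConvRBM filterbank, the function names are hypothetical, and the absolute value, average pooling, window length, hop size, and log compression are not specified in the abstract.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy operator along the last axis.

    psi[n] = x[n]^2 - x[n-1] * x[n+1]
    Requires at least three samples along the last axis.
    """
    x = np.asarray(x, dtype=float)
    psi = np.empty_like(x)
    psi[..., 1:-1] = x[..., 1:-1] ** 2 - x[..., :-2] * x[..., 2:]
    # The first and last samples have no two-sided neighbours,
    # so copy the nearest interior value (an illustrative choice).
    psi[..., 0] = psi[..., 1]
    psi[..., -1] = psi[..., -2]
    return psi


def teo_subband_features(signal, filters, win=400, hop=160, eps=1e-8):
    """Filter a signal into subbands, apply the TEO, and pool over time.

    signal  : 1-D array of samples
    filters : array of shape (n_filters, filter_len); stand-in for
              learned ConvRBM filters (assumption)
    win,hop : pooling window and hop in samples (assumed values,
              25 ms and 10 ms at 16 kHz)
    Returns an array of shape (n_filters, n_frames).
    """
    signal = np.asarray(signal, dtype=float)
    # One subband per filter, same length as the input signal.
    subbands = np.stack(
        [np.convolve(signal, h, mode="same") for h in filters]
    )
    # The TEO can go negative for non-narrowband signals; taking the
    # magnitude here is an assumption, not something the abstract states.
    energy = np.abs(teager_energy(subbands))
    n_frames = 1 + (energy.shape[1] - win) // hop
    # Average pooling over each short-time window (assumed pooling type).
    pooled = np.stack(
        [energy[:, i * hop : i * hop + win].mean(axis=1)
         for i in range(n_frames)],
        axis=1,
    )
    # Log compression, as is common for filterbank features (assumption).
    return np.log(pooled + eps)
```

For a pure sinusoid A cos(wn), the operator returns the constant A^2 sin^2(w), so its output tracks both the amplitude and the frequency of a narrowband signal. This is why it is usually applied per subband rather than to the full-band waveform.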