Multiscale wavelet attention convolutional network for facial expression recognition.
Journal:
Scientific Reports
Published Date:
Jul 1, 2025
Abstract
Deep learning techniques, particularly Convolutional Neural Networks (CNNs), are widely recognized as effective tools for facial expression recognition. However, the accuracy of facial expression recognition applications still needs improvement. The main contributions and results of this study are as follows. First, the first convolutional layer of the CNN is replaced with a Multi-scale Convolutional (MsC) layer, yielding the Multi-scale CNN (MCNN). Experimental results indicate that MCNN improves average accuracy by 1.339% over CNN. Second, a wavelet Channel Attention (wCA) mechanism is inserted after the first pooling layer of the CNN, yielding the wCA-based CNN (wCA-CNN). Experimental results demonstrate that wCA-CNN improves average accuracy by 1.414% over CNN. Third, both changes are combined: the first convolutional layer of the CNN is replaced with the MsC layer and the wCA mechanism is inserted after the first pooling layer, yielding the wCA-based Multi-scale CNN (wCA-MCNN). Experimental results reveal that wCA-MCNN improves average accuracy by 2.921% over CNN. Fourth, the Residual Network (ResNet18) is selected as a second baseline model and modified in the same way. Compared to ResNet18, the accuracy of the proposed MsC-ResNet18, wCA-ResNet18, and MsC-wCA-ResNet18 improves by 0.845%, 0.835%, and 1.810%, respectively. Fifth, all of the proposed methods are evaluated on two datasets: the Facial Expression of Students in Real-Class (FESR) dataset, collected in our own real classroom, and the Karolinska Directed Emotional Faces (KDEF) dataset.
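The abstract does not specify the internals of the MsC layer or the wCA mechanism, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the multi-scale layer applies kernels of several sizes (here 1x1, 3x3, 5x5) in parallel to the same input, and that the wavelet channel attention derives one weight per channel from the approximation (LL) sub-band of a one-level Haar transform. The kernel sizes, the Haar wavelet, the sigmoid gate, and all function names are assumptions chosen for illustration.

```python
import math


def conv2d_same(channel, kernel):
    """Cross-correlate one 2-D channel with a square, odd-sized kernel.

    Zero padding keeps the output the same size as the input.
    """
    h, w = len(channel), len(channel[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += channel[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def multiscale_conv(channel, kernels):
    """Apply kernels of several sizes in parallel; one output channel per kernel."""
    return [conv2d_same(channel, kernel) for kernel in kernels]


def max_pool_2x2(channel):
    """Non-overlapping 2x2 max pooling (assumes even height and width)."""
    return [
        [max(channel[i][j], channel[i][j + 1],
             channel[i + 1][j], channel[i + 1][j + 1])
         for j in range(0, len(channel[0]), 2)]
        for i in range(0, len(channel), 2)
    ]


def haar_ll(channel):
    """Approximation (LL) sub-band of a one-level orthonormal 2-D Haar transform."""
    return [
        [(channel[i][j] + channel[i][j + 1]
          + channel[i + 1][j] + channel[i + 1][j + 1]) / 2.0
         for j in range(0, len(channel[0]), 2)]
        for i in range(0, len(channel), 2)
    ]


def wavelet_channel_attention(channels):
    """Rescale each channel by a sigmoid gate computed from its Haar LL mean.

    ASSUMPTION: a real attention block would normally pass the channel
    descriptors through learned layers; that step is omitted here.
    """
    weights = []
    for channel in channels:
        ll = haar_ll(channel)
        descriptor = sum(map(sum, ll)) / (len(ll) * len(ll[0]))
        weights.append(1.0 / (1.0 + math.exp(-descriptor)))
    scaled = [[[wt * v for v in row] for row in channel]
              for channel, wt in zip(channels, weights)]
    return scaled, weights


# Toy 8x8 "image" and fixed (untrained) kernels at three scales.
image = [[(8 * i + j) / 63.0 for j in range(8)] for i in range(8)]
kernels = [
    [[1.0]],                              # 1x1 identity
    [[1.0 / 9.0] * 3 for _ in range(3)],  # 3x3 mean filter
    [[1.0 / 25.0] * 5 for _ in range(5)], # 5x5 mean filter
]

features = multiscale_conv(image, kernels)          # multi-scale first layer
pooled = [max_pool_2x2(c) for c in features]        # first pooling layer
attended, weights = wavelet_channel_attention(pooled)  # attention after pooling
print([round(wt, 3) for wt in weights])
```

The order of operations mirrors the description of wCA-MCNN: multi-scale convolution first, then pooling, then channel attention. In a trained network the kernels and the attention gate would be learned rather than fixed as they are here.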