A Multi-view Mask Contrastive Learning Graph Convolutional Neural Network for Age Estimation
Journal:
arXiv
Published Date:
Jul 23, 2024
Abstract
The age estimation task aims to use facial features to predict the age of
people and is widely used in public security, marketing, identification, and
other fields. However, the features are mainly concentrated in facial
keypoints, and existing CNN and Transformer-based methods have inflexibility
and redundancy for modeling complex irregular structures. Therefore, this paper
proposes a Multi-view Mask Contrastive Learning Graph Convolutional Neural
Network (MMCL-GCN) for age estimation. Specifically, the overall structure of
the MMCL-GCN network contains a feature extraction stage and an age estimation
stage. In the feature extraction stage, we introduce a graph structure to
construct face images as input and then design a Multi-view Mask Contrastive
Learning (MMCL) mechanism to learn complex structural and semantic information
about face images. The learning mechanism employs an asymmetric siamese network
architecture, which utilizes an online encoder-decoder structure to reconstruct
the missing information from the original graph and utilizes the target encoder
to learn latent representations for contrastive learning. Furthermore, to
promote the two learning mechanisms better compatible and complementary, we
adopt two augmentation strategies and optimize the joint losses. In the age
estimation stage, we design a Multi-layer Extreme Learning Machine (ML-IELM)
with identity mapping to fully use the features extracted by the online
encoder. Then, a classifier and a regressor were constructed based on ML-IELM,
which were used to identify the age grouping interval and accurately estimate
the final age. Extensive experiments show that MMCL-GCN can effectively reduce
the error of age estimation on benchmark datasets such as Adience, MORPH-II,
and LAP-2016.