Facial Attractiveness Prediction in Live Streaming: A New Benchmark and Multi-modal Method
Journal: arXiv
Published Date: Jan 5, 2025
Abstract
Facial attractiveness prediction (FAP) has long been an important computer
vision task, with wide application in live streaming for facial retouching,
content recommendation, and similar uses. However, previous FAP datasets are
small, closed-source, or lacking in diversity. Moreover, the corresponding FAP
models exhibit limited generalization and adaptation ability. To overcome these
limitations, we present LiveBeauty, the first large-scale FAP dataset built
specifically for a more challenging application scenario, i.e., live
streaming. We collect 10,000 face images directly from a live streaming
platform and obtain 200,000 corresponding attractiveness annotations through a
well-devised subjective experiment, making LiveBeauty the largest open-access
FAP dataset in the challenging live scenario. Furthermore, we propose a
multi-modal FAP method to measure facial attractiveness in live streaming.
Specifically, we first extract holistic facial prior knowledge and multi-modal
aesthetic semantic features via a Personalized Attractiveness Prior Module
(PAPM) and a Multi-modal Attractiveness Encoder Module (MAEM), respectively,
then integrate the extracted features through a Cross-Modal Fusion Module
(CMFM). Extensive experiments conducted on both LiveBeauty and other
open-source FAP datasets demonstrate that our proposed method achieves
state-of-the-art performance. The dataset will be available soon.
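
The dataset figures above imply an average of 20 annotations per image (200,000 / 10,000). The sketch below works through that arithmetic and shows one plausible way to turn per-image ratings into a single label. The abstract does not state the aggregation rule or the rating scale, so the mean-score aggregation, the 1-5 scale, the function name `mean_opinion_score`, and the example ratings are all assumptions made for illustration.

```python
from statistics import mean

NUM_IMAGES = 10_000        # stated in the abstract
NUM_ANNOTATIONS = 200_000  # stated in the abstract

# Average number of ratings per image implied by the two figures above.
ratings_per_image = NUM_ANNOTATIONS / NUM_IMAGES


def mean_opinion_score(ratings):
    """Average one image's ratings into a single label.

    Assumption: the abstract does not say how annotations are aggregated;
    a plain mean is used here only as a common, simple choice.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    return mean(ratings)


# Hypothetical ratings for one image on an assumed 1-5 scale.
example_ratings = [3, 4, 4, 5, 3, 4, 2, 4, 5, 3, 4, 4, 3, 5, 4, 3, 4, 4, 2, 5]
example_label = mean_opinion_score(example_ratings)

print(f"ratings per image (average): {ratings_per_image}")
print(f"example aggregated label: {example_label}")
```

The actual experiment may use a different scale, may assign an uneven number of ratings per image, or may filter outlier annotators before aggregating.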