AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
Journal:
arXiv
Published Date:
Apr 30, 2025
Abstract
The rapid development of text-to-image (T2I) generation approaches has
attracted extensive interest in evaluating the quality of generated images,
leading to the development of various quality assessment methods for
general-purpose T2I outputs. However, existing image quality assessment (IQA)
methods are limited to providing global quality scores, failing to deliver
fine-grained perceptual evaluations for structurally complex subjects like
humans, which is a critical challenge considering the frequent anatomical and
textural distortions in AI-generated human images (AGHIs). To address this gap,
we introduce AGHI-QA, the first large-scale benchmark specifically designed for
quality assessment of AGHIs. The dataset comprises 4,000 images generated from
400 carefully crafted text prompts using 10 state of-the-art T2I models. We
conduct a systematic subjective study to collect multidimensional annotations,
including perceptual quality scores, text-image correspondence scores, visible
and distorted body part labels. Based on AGHI-QA, we evaluate the strengths and
weaknesses of current T2I methods in generating human images from multiple
dimensions. Furthermore, we propose AGHI-Assessor, a novel quality metric that
integrates the large multimodal model (LMM) with domain-specific human features
for precise quality prediction and identification of visible and distorted body
parts in AGHIs. Extensive experimental results demonstrate that AGHI-Assessor
showcases state-of-the-art performance, significantly outperforming existing
IQA methods in multidimensional quality assessment and surpassing leading LMMs
in detecting structural distortions in AGHIs.