From Cradle to Cane: A Two-Pass Framework for High-Fidelity Lifespan Face Aging
Journal:
arXiv
Published Date:
Jun 26, 2025
Abstract
Face aging has become a crucial task in computer vision, with applications
ranging from entertainment to healthcare. However, existing methods struggle
with achieving a realistic and seamless transformation across the entire
lifespan, especially when handling large age gaps or extreme head poses. The
core challenge lies in balancing age accuracy and identity preservation--what
we refer to as the Age-ID trade-off. Most prior methods either prioritize age
transformation at the expense of identity consistency or vice versa. In this
work, we address this issue by proposing a two-pass face aging framework, named
Cradle2Cane, based on few-step text-to-image (T2I) diffusion models. The first
pass focuses on solving age accuracy by introducing an adaptive noise injection
(AdaNI) mechanism. This mechanism is guided by including prompt descriptions of
age and gender for the given person as the textual condition. Also, by
adjusting the noise level, we can control the strength of aging while allowing
more flexibility in transforming the face. However, identity preservation is
weakly ensured here to facilitate stronger age transformations. In the second
pass, we enhance identity preservation while maintaining age-specific features
by conditioning the model on two identity-aware embeddings (IDEmb): SVR-ArcFace
and Rotate-CLIP. This pass allows for denoising the transformed image from the
first pass, ensuring stronger identity preservation without compromising the
aging accuracy. Both passes are jointly trained in an end-to-end way. Extensive
experiments on the CelebA-HQ test dataset, evaluated through Face++ and Qwen-VL
protocols, show that our Cradle2Cane outperforms existing face aging methods in
age accuracy and identity consistency.