Stable-Hair v2: Real-World Hair Transfer via Multiple-View Diffusion Model
Journal: arXiv
Published Date: Jul 10, 2025
Abstract
While diffusion-based methods have shown impressive capabilities in capturing
diverse and complex hairstyles, their ability to generate consistent and
high-quality multi-view outputs -- crucial for real-world applications such as
digital humans and virtual avatars -- remains underexplored. In this paper, we
propose Stable-Hair v2, a novel diffusion-based multi-view hair transfer
framework. To the best of our knowledge, this is the first work to leverage
multi-view diffusion models for robust, high-fidelity, and view-consistent hair
transfer across multiple perspectives. We introduce a comprehensive multi-view
training data generation pipeline comprising a diffusion-based Bald Converter,
a data-augmentation inpainting model, and a face-finetuned multi-view diffusion
model to generate high-quality triplet data, including bald images, reference
hairstyles, and view-aligned source-bald pairs. Our multi-view hair transfer
model integrates polar-azimuth embeddings for pose conditioning and temporal
attention layers to ensure smooth transitions between views. To optimize this
model, we design a novel multi-stage training strategy consisting of
pose-controllable latent IdentityNet training, hair extractor training, and
temporal attention training. Extensive experiments demonstrate that our method
accurately transfers detailed and realistic hairstyles to source subjects while
achieving seamless and consistent results across views, significantly
outperforming existing methods and establishing a new benchmark in multi-view
hair transfer. Code is publicly available at
https://github.com/sunkymepro/StableHairV2.
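
The abstract names polar-azimuth embeddings for pose conditioning but does not say how they are computed. As a rough illustration only, the sketch below encodes a camera pose given as a (polar, azimuth) pair into a fixed-length vector using sinusoidal features. The sinusoidal form, the function names angle_embedding and pose_embedding, and the num_freqs parameter are all assumptions made for this example; they are not taken from the paper or its released code.

```python
import math


def angle_embedding(angle_rad, num_freqs=4):
    """Return [sin(2^k * a), cos(2^k * a)] for k = 0 .. num_freqs - 1.

    Hypothetical helper: a sinusoidal encoding is assumed here,
    the abstract does not specify the actual embedding.
    """
    feats = []
    for k in range(num_freqs):
        freq = 2.0 ** k
        feats.append(math.sin(freq * angle_rad))
        feats.append(math.cos(freq * angle_rad))
    return feats


def pose_embedding(polar_deg, azimuth_deg, num_freqs=4):
    """Concatenate the polar and azimuth encodings into one vector.

    Hypothetical helper: the output has 4 * num_freqs entries,
    polar features first, azimuth features second.
    """
    polar = math.radians(polar_deg)
    azimuth = math.radians(azimuth_deg)
    return angle_embedding(polar, num_freqs) + angle_embedding(azimuth, num_freqs)


# Example: a frontal view at eye level (polar 90 degrees, azimuth 0 degrees).
frontal = pose_embedding(90.0, 0.0)
```

Because every feature is periodic in the angle, azimuths of 0 and 360 degrees map to (numerically) the same vector, which is one reason such an encoding might be preferred over feeding raw angles to the conditioning branch.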