Efficient Portrait Matte Creation With Layer Diffusion and Connectivity Priors
Journal: arXiv
Published Date: Jan 27, 2025
Abstract
Learning effective deep portrait matting models requires training data of
both high quality and large quantity. For portrait matting, however, neither
requirement is easily met. Because the most accurate ground-truth portrait
mattes are captured in front of a green screen, it is almost impossible to
collect a large-scale portrait matting dataset in practice. This
work shows that one can leverage text prompts and the recent Layer Diffusion
model to generate high-quality portrait foregrounds and extract latent portrait
mattes. However, these mattes cannot be used directly because of significant
generation artifacts. Motivated by a connectivity prior observed in portrait
images, namely that the border of a portrait foreground always appears
connected, a connectivity-aware approach is introduced to refine the portrait
mattes. Building on this, a large-scale portrait matting dataset is created,
termed LD-Portrait-20K, with 20,051 portrait foregrounds and high-quality
alpha mattes. Extensive experiments demonstrate the value of the
LD-Portrait-20K dataset, with models trained on it significantly outperforming
those trained on other datasets. In addition, comparisons with the chroma
keying algorithm and an ablation study on dataset capacity further confirm
the effectiveness of the proposed matte creation approach. Finally, the dataset
contributes to state-of-the-art video portrait matting, achieved by combining
simple video segmentation with a trimap-based image matting model trained on
this dataset.
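
The connectivity idea can be illustrated with a small sketch. The abstract does not describe the refinement procedure, so the code below is not the authors' method. It shows one simple way a connectivity assumption could be used to clean a matte: treat every pixel with alpha above a threshold as foreground, find the largest 4-connected foreground region, and zero out everything outside it. The function name `keep_largest_component`, the `threshold` parameter, and the nested-list matte representation are all assumptions made for illustration.

```python
from collections import deque


def keep_largest_component(alpha, threshold=0.0):
    """Return a copy of `alpha` with only its largest 4-connected region kept.

    `alpha` is a 2D list of floats in [0, 1]. Pixels with alpha > threshold
    count as foreground. Every pixel outside the largest foreground region
    (including any pixel at or below the threshold) is set to 0.0.
    Hypothetical helper for illustration only, not the paper's algorithm.
    """
    height, width = len(alpha), len(alpha[0])
    seen = [[False] * width for _ in range(height)]
    largest = []

    for y in range(height):
        for x in range(width):
            if seen[y][x] or alpha[y][x] <= threshold:
                continue
            # Breadth-first flood fill collects one connected region.
            region = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and not seen[ny][nx] and alpha[ny][nx] > threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(largest):
                largest = region

    keep = set(largest)
    return [
        [alpha[y][x] if (y, x) in keep else 0.0 for x in range(width)]
        for y in range(height)
    ]


# Toy matte: a 2x2 portrait blob plus one stray artifact pixel at (1, 3).
example = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.3, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0],
]
cleaned = keep_largest_component(example)
```

With the default threshold of 0.0, soft alpha values that touch the main region (such as the 0.3 pixel above) are preserved, while the isolated pixel is removed. A real pipeline would likely operate on image arrays and would need to handle cases where fine structures such as hair strands are legitimately disconnected from the main region at the chosen threshold.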