MP-Mat: A 3D-and-Instance-Aware Human Matting and Editing Framework with Multiplane Representation
Journal: arXiv
Published Date: Apr 20, 2025
Abstract
Human instance matting aims to estimate an alpha matte for each human
instance in an image. The task is challenging because methods easily fail in
complex cases that require disentangling mingled pixels belonging to multiple
instances along hairy and thin boundary structures. In this work, we address this by
introducing MP-Mat, a novel 3D-and-instance-aware matting framework with
multiplane representation, where the multiplane concept is designed from two
different perspectives: scene geometry level and instance level. Specifically,
we first build feature-level multiplane representations to split the scene into
multiple planes based on depth differences. This approach makes the scene
representation 3D-aware and can serve as an effective cue for separating
instances at different 3D positions, thereby improving interpretability and
boundary handling, especially in occluded areas. Then, we introduce
another multiplane representation that splits the scene from an instance-level
perspective, and represents each instance with both matte and color. We also
treat the background as a special instance, which is often overlooked by existing
methods. Such an instance-level representation facilitates both foreground and
background content awareness, and is useful for other downstream tasks like
image editing. Once built, the representation can be reused to realize
controllable instance-level image editing with high efficiency. Extensive
experiments validate the clear advantage of MP-Mat on the matting task. We also
demonstrate its superiority in image editing tasks, an area under-explored by
existing matting-focused methods, where our approach, under zero-shot inference,
even outperforms trained, specialized image editing techniques by large margins.
Code is open-sourced at https://github.com/JiaoSiyi/MPMat.git.
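
To make the instance-level representation more concrete, the following is a minimal sketch of how per-instance layers, each holding a matte and a color, might be composited and then reused for editing. It is not taken from the MP-Mat code base. The `Layer` class, the function names, and the assumption that the mattes of all layers (background included) sum to 1 at every pixel are illustrative choices made here, and the standard blend I = sum_i alpha_i * color_i stands in for whatever rendering step the framework actually uses.

```python
from dataclasses import dataclass
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # RGB values in [0, 1]


@dataclass
class Layer:
    """Hypothetical instance-level plane: a per-pixel matte plus per-pixel color."""
    name: str
    alpha: List[List[float]]  # H x W matte, values in [0, 1]
    color: List[List[Pixel]]  # H x W RGB color


def composite(layers: List[Layer]) -> List[List[Pixel]]:
    """Blend layers as I = sum_i alpha_i * color_i.

    Assumes the mattes of all layers sum to 1 at every pixel.
    """
    h, w = len(layers[0].alpha), len(layers[0].alpha[0])
    out = [[(0.0, 0.0, 0.0)] * w for _ in range(h)]
    for layer in layers:
        for y in range(h):
            for x in range(w):
                a = layer.alpha[y][x]
                r, g, b = out[y][x]
                cr, cg, cb = layer.color[y][x]
                out[y][x] = (r + a * cr, g + a * cg, b + a * cb)
    return out


def remove_instance(layers: List[Layer], name: str,
                    background: str = "background") -> List[Layer]:
    """Drop one instance and hand its matte to the background layer.

    This only works because the background is stored as a layer with its
    own color, so the pixels behind the removed instance are available.
    """
    removed = next(layer for layer in layers if layer.name == name)
    edited = []
    for layer in layers:
        if layer.name == name:
            continue
        if layer.name == background:
            alpha = [[a + r for a, r in zip(row_a, row_r)]
                     for row_a, row_r in zip(layer.alpha, removed.alpha)]
            layer = Layer(layer.name, alpha, layer.color)
        edited.append(layer)
    return edited


# Toy 1 x 2 image: a red "person" over a blue background.
red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
person = Layer("person", [[1.0, 0.5]], [[red, red]])
backdrop = Layer("background", [[0.0, 0.5]], [[blue, blue]])

original = composite([person, backdrop])
edited = composite(remove_instance([person, backdrop], "person"))
```

In this toy case the second pixel of `original` is an even mix of red and blue, which mimics a soft hair boundary, while `edited` is pure blue everywhere. Because the edit only reassigns mattes between layers that are already built, no new inference pass is needed, which is one plausible reading of the reuse and efficiency claim above.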