Mask Image Watermarking
Journal: arXiv
Published Date: Apr 17, 2025
Abstract
We present MaskMark, a simple, efficient, and flexible framework for image
watermarking. MaskMark has two variants: (1) MaskMark-D, which supports global
watermark embedding, watermark localization, and local watermark extraction for
applications such as tamper detection; (2) MaskMark-ED, which focuses on local
watermark embedding and extraction, offering enhanced robustness in small
regions to support fine-grained image protection. MaskMark-D builds on the
classical encoder-distortion layer-decoder training paradigm. In MaskMark-D, we
introduce a simple masking mechanism during the decoding stage that enables
both global and local watermark extraction. During training, the decoder is
guided by various types of masks applied to watermarked images before
extraction, helping it learn to localize watermarks and extract them from the
corresponding local areas. MaskMark-ED extends this design by incorporating the
mask into the encoding stage as well, guiding the encoder to embed the
watermark in designated local regions, which improves robustness under regional
attacks. Extensive experiments show that MaskMark achieves state-of-the-art
performance in global and local watermark extraction, watermark localization,
and multi-watermark embedding. It outperforms all existing baselines, including
the recent leading model WAM for local watermarking, while preserving high
visual quality of the watermarked images. In addition, MaskMark is highly
efficient and adaptable. It requires only 20 hours of training on a single
A6000 GPU and is 15x more computationally efficient than WAM. By simply
adjusting the distortion layer, MaskMark can be quickly fine-tuned to meet
varying robustness requirements.
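
The two masking steps described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract does not specify how masks are generated or applied, so the rectangular mask, the element-wise multiplication before decoding, and the residual blending used for regional embedding are all assumptions made for illustration. The function names (box_mask, mask_before_decoding, embed_in_region) are hypothetical, and the random "watermarked" image stands in for the output of a real encoder.

```python
import numpy as np


def box_mask(height, width, top, left, box_h, box_w):
    """Binary mask of shape (height, width, 1): 1 inside the box, 0 elsewhere.

    A rectangle is an assumed mask type; the abstract only says "various types of masks".
    """
    mask = np.zeros((height, width, 1), dtype=np.float32)
    mask[top:top + box_h, left:left + box_w, :] = 1.0
    return mask


def mask_before_decoding(watermarked, mask):
    """Decoding-stage masking (assumed form): zero out everything outside the mask.

    The decoder would then be trained to extract the message from what remains.
    """
    return watermarked * mask


def embed_in_region(cover, watermarked, mask):
    """Encoding-stage masking (assumed form): keep the watermark residual only
    inside the mask, leaving the cover image untouched elsewhere."""
    return cover + mask * (watermarked - cover)


# Stand-in data: a random cover image and a small random residual in place of
# a learned encoder's output.
rng = np.random.default_rng(0)
cover = rng.random((64, 64, 3), dtype=np.float32)
residual = 0.01 * rng.standard_normal((64, 64, 3)).astype(np.float32)
watermarked = cover + residual

mask = box_mask(64, 64, top=16, left=16, box_h=32, box_w=32)
decoder_input = mask_before_decoding(watermarked, mask)
local_watermarked = embed_in_region(cover, watermarked, mask)
```

In this sketch, decoder_input is what a decoder would see when asked for local extraction, and local_watermarked is an image whose pixels differ from the cover only inside the designated region. A real training loop would sample a different mask per batch and pass the result through the distortion layer before decoding.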