PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control

Journal: arXiv

Published Date: Dec 2, 2024

Abstract

Recently, diffusion models have exhibited superior performance in image inpainting. Inpainting methods based on diffusion models can usually generate realistic, high-quality image content for masked areas. However, due to the limitations of diffusion models, existing methods typically struggle with semantic consistency between images and text, and with accommodating the editing habits of users. To address these issues, we present PainterNet, a plugin that can be flexibly embedded into various diffusion models. To generate image content in the masked areas that closely aligns with the user's input prompt, we propose local prompt input, Attention Control Points (ACP), and Actual-Token Attention Loss (ATAL) to enhance the model's focus on local areas. Additionally, we redesign the mask generation algorithm for the training and testing datasets to simulate how users typically apply masks, and we introduce a customized new training dataset, PainterData, and a benchmark dataset, PainterBench. Our extensive experimental analysis shows that PainterNet surpasses existing state-of-the-art models in key metrics, including image quality and global/local text consistency.

Authors

Ruichen Wang
Junliang Zhang
Qingsong Xie
Chen Chen
Haonan Lu

External Resources

View on arXiv (http://arxiv.org/abs/2412.01223v1)
