DCEdit: Dual-Level Controlled Image Editing via Precisely Localized Semantics

Journal: arXiv

Published Date: Mar 21, 2025

Abstract

This paper presents a novel approach to improving text-guided image editing with diffusion-based models. The key challenge in text-guided image editing is to precisely locate and edit the target semantics, and previous methods fall short in this respect. Our method introduces a Precise Semantic Localization strategy that leverages visual and textual self-attention to enhance the cross-attention map, which can then serve as a regional cue to improve editing performance. We then propose a Dual-Level Control mechanism that incorporates regional cues at both the feature and latent levels, offering fine-grained control for more precise edits. To fully compare our method with other DiT-based approaches, we construct the RW-800 benchmark, featuring high-resolution images, long descriptive texts, real-world images, and a new text editing task. Experimental results on the popular PIE-Bench and RW-800 benchmarks demonstrate the superior performance of our approach in preserving the background and providing accurate edits.
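
The abstract does not give the exact formulation, so the following is only an illustrative sketch of the two general ideas it names: sharpening a cross-attention map with self-attention, and using the resulting regional cue at the latent level. It assumes a row-stochastic self-attention matrix over image tokens and a simple mask-based blend of latents. The function names `refine_cross_attention` and `blend_latents`, the `steps` parameter, and the min-max normalization are hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def refine_cross_attention(cross_attn, self_attn, steps=1):
    """Propagate a per-token cross-attention map through self-attention.

    cross_attn: shape (n,), attention of each image token to the target word.
    self_attn:  shape (n, n), row-stochastic attention among image tokens.
    Returns a map of shape (n,) rescaled to [0, 1] (assumed normalization).
    """
    refined = np.asarray(cross_attn, dtype=np.float64)
    for _ in range(steps):
        # Each token's score becomes a weighted average of the scores
        # of the tokens it attends to.
        refined = self_attn @ refined
    lo, hi = refined.min(), refined.max()
    return (refined - lo) / (hi - lo + 1e-8)


def blend_latents(edited, source, mask):
    """Keep edited latents where mask is high, source latents elsewhere."""
    return mask * edited + (1.0 - mask) * source


# Toy usage with random data standing in for real attention maps.
rng = np.random.default_rng(0)
n_tokens, channels = 16, 4
logits = rng.normal(size=(n_tokens, n_tokens))
self_attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
cross_attn = rng.random(n_tokens)

mask = refine_cross_attention(cross_attn, self_attn, steps=2)
source = np.zeros((n_tokens, channels))
edited = np.ones((n_tokens, channels))
blended = blend_latents(edited, source, mask[:, None])
```

In this toy setup the blended latents equal the mask values themselves, which makes it easy to see that regions with a weak regional cue stay close to the source image.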

Authors

Yihan Hu
Jianing Peng
Yiheng Lin
Ting Liu
Xiaochao Qu
Luoqi Liu
Yao Zhao
Yunchao Wei

External Resources

View on arXiv: http://arxiv.org/abs/2503.16795v1
