Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models

Journal: arXiv

Published Date: Mar 26, 2025

Abstract

Recent advancements in autoregressive and diffusion models have led to strong performance in generating images that contain short scene text. However, generating coherent, long-form text in images, such as paragraphs in slides or documents, remains a major challenge for current generative models. We present the first work specifically focused on long-text image generation, addressing a critical gap in existing text-to-image systems, which typically handle only brief phrases or single sentences. Through comprehensive analysis of state-of-the-art autoregressive generation models, we identify the image tokenizer as a critical bottleneck in text generation quality. To address this, we introduce a novel text-focused binary tokenizer optimized for capturing detailed scene text features. Leveraging our tokenizer, we develop a multimodal autoregressive model that excels at generating high-quality long-text images with unprecedented fidelity. Our model offers robust controllability, enabling customization of text properties such as font style, size, color, and alignment. Extensive experiments demonstrate that our model significantly outperforms SD3.5 Large and GPT4o with DALL-E 3 in generating long text accurately, consistently, and flexibly. Beyond its technical achievements, our model opens up opportunities for innovative applications such as interleaved document and PowerPoint generation, establishing a new frontier in long-text image generation.

Authors

Alex Jinpeng Wang
Linjie Li
Zhengyuan Yang
Lijuan Wang
Min Li

External Resources

View on arXiv (http://arxiv.org/abs/2503.20198v1)
