Muon-Accelerated Attention Distillation for Real-Time Edge Synthesis via Optimized Latent Diffusion

Journal: arXiv

Published Date: Apr 11, 2025

Abstract

Recent advances in visual synthesis have leveraged diffusion models and attention mechanisms to achieve high-fidelity artistic style transfer and photorealistic text-to-image generation. However, real-time deployment on edge devices remains challenging due to computational and memory constraints. We propose Muon-AD, a co-designed framework that integrates the Muon optimizer with attention distillation for real-time edge synthesis. By eliminating gradient conflicts through orthogonal parameter updates and dynamic pruning, Muon-AD achieves 3.2 times faster convergence compared to Stable Diffusion-TensorRT, while maintaining synthesis quality (15% lower FID, 4% higher SSIM). Our framework reduces peak memory to 7GB on Jetson Orin and enables 24FPS real-time generation through mixed-precision quantization and curriculum learning. Extensive experiments on COCO-Stuff and ImageNet-Texture demonstrate Muon-AD's Pareto-optimal efficiency-quality trade-offs. Here, we show a 65% reduction in communication overhead during distributed training and real-time 10s/image generation on edge GPUs. These advancements pave the way for democratizing high-quality visual synthesis in resource-constrained environments.

Authors

Weiye Chen
Qingen Zhu
Qian Long

External Resources

View on arXiv arXiv (http://arxiv.org/abs/2504.08451v1)

Muon-Accelerated Attention Distillation for Real-Time Edge Synthesis via Optimized Latent Diffusion

Abstract

Authors

Categories

External Resources

Popular Topics

Recent Journals

Muon-Accelerated Attention Distillation for Real-Time Edge Synthesis via Optimized Latent Diffusion

Abstract

Authors

Categories

External Resources

Stay Ahead of Medical AI

Popular Topics

Recent Journals