FSDM: An efficient video super-resolution method based on Frames-Shift Diffusion Model.
Journal:
Neural networks : the official journal of the International Neural Network Society
Published Date:
Apr 3, 2025
Abstract
Video super-resolution (VSR) is a fundamental task that aims to enhance video quality by reconstructing high-resolution frames from low-resolution input. Recent advances in diffusion models have significantly improved image super-resolution. However, their integration into video super-resolution remains limited by the computational complexity of temporal fusion modules, which demand more computational resources than their image counterparts. To address this challenge, we propose a Frames-Shift Diffusion Model built on image diffusion models. Compared with directly training a diffusion-based video super-resolution model, redesigning the diffusion process of an image model without introducing complex temporal modules requires minimal training cost. We incorporate temporal information into the image super-resolution diffusion model by using optical flow and performing multi-frame fusion. The model adapts the diffusion process to transition smoothly from image super-resolution to video super-resolution without additional weight parameters. As a result, the Frames-Shift Diffusion Model processes videos frame by frame while remaining computationally efficient. It improves perceptual quality and achieves PSNR and SSIM comparable to other state-of-the-art diffusion-based VSR methods. By simplifying the integration of temporal information, the approach addresses a key obstacle to applying diffusion models to video super-resolution.
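The abstract names optical flow and multi-frame fusion as the source of temporal information but does not say how they enter the diffusion process. The NumPy sketch below only illustrates the general idea of flow-based alignment followed by fusion, and it is not the authors' method. The function names `warp_with_flow` and `fuse_frames`, the flow convention, the fixed blending `weight`, and the plain averaging are all assumptions made for illustration.

```python
import numpy as np


def warp_with_flow(frame, flow):
    """Backward-warp a grayscale frame (H, W) with a flow field (H, W, 2).

    Assumed convention: flow[y, x] = (dx, dy) means output pixel (x, y)
    is sampled from (x + dx, y + dy) in `frame`. Sampling is bilinear and
    source coordinates are clamped to the image border.
    """
    h, w = frame.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(xs + flow[..., 0], 0, w - 1)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each source coordinate.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets used as bilinear interpolation weights.
    wx = src_x - x0
    wy = src_y - y0
    top = (1 - wx) * frame[y0, x0] + wx * frame[y0, x1]
    bottom = (1 - wx) * frame[y1, x0] + wx * frame[y1, x1]
    return (1 - wy) * top + wy * bottom


def fuse_frames(current, neighbors, flows, weight=0.5):
    """Blend the current frame with the mean of its flow-aligned neighbors.

    `weight` is a hypothetical fixed blending factor; a real system would
    likely use something more adaptive.
    """
    aligned = [warp_with_flow(n, f) for n, f in zip(neighbors, flows)]
    return (1 - weight) * current + weight * np.mean(aligned, axis=0)
```

In this sketch the fusion step adds no learned parameters, which mirrors the abstract's claim of working without additional weights, although the actual mechanism in the paper may differ substantially.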