Dynamic View Synthesis as an Inverse Problem

Journal: arXiv

Published Date: Jun 9, 2025

Abstract

In this work, we address dynamic view synthesis from monocular videos as an inverse problem in a training-free setting. By redesigning the noise initialization phase of a pre-trained video diffusion model, we enable high-fidelity dynamic view synthesis without any weight updates or auxiliary modules. We begin by identifying a fundamental obstacle to deterministic inversion arising from zero-terminal signal-to-noise ratio (SNR) schedules and resolve it by introducing a novel noise representation, termed K-order Recursive Noise Representation. We derive a closed-form expression for this representation, enabling precise and efficient alignment between the VAE-encoded and the DDIM-inverted latents. To synthesize newly visible regions resulting from camera motion, we introduce Stochastic Latent Modulation, which performs visibility-aware sampling over the latent space to complete occluded regions. Comprehensive experiments demonstrate that dynamic view synthesis can be effectively performed through structured latent manipulation in the noise initialization phase.
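
The zero-terminal SNR obstacle mentioned in the abstract can be illustrated with a small numerical sketch. This is one reading of the issue, not the paper's derivation, and it does not implement the K-order Recursive Noise Representation. It assumes a standard linear beta schedule (1000 steps, betas from 1e-4 to 2e-2) and the commonly used rescaling that shifts and scales sqrt(alpha_bar) so its final value is exactly zero; all function names are hypothetical. Under such a schedule the terminal latent is pure noise, so two different clean latents map to the same terminal state and the input cannot be recovered from it deterministically.

```python
import math

def linear_betas(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    # Assumed schedule values for illustration; not taken from the paper.
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def sqrt_alpha_bars(betas):
    # sqrt of the cumulative product of (1 - beta_t).
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(math.sqrt(prod))
    return out

def rescale_zero_terminal_snr(sqrt_ab):
    # Shift so the last value is 0, then scale so the first value is preserved.
    first, last = sqrt_ab[0], sqrt_ab[-1]
    return [(s - last) * first / (first - last) for s in sqrt_ab]

def snr(sqrt_ab_t):
    # Signal-to-noise ratio alpha_bar / (1 - alpha_bar).
    a = sqrt_ab_t ** 2
    return a / (1.0 - a)

def forward_noise(x0, eps, sqrt_ab_t):
    # x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    sigma = math.sqrt(1.0 - sqrt_ab_t ** 2)
    return [sqrt_ab_t * x + sigma * e for x, e in zip(x0, eps)]

orig = sqrt_alpha_bars(linear_betas())
zt = rescale_zero_terminal_snr(orig)

eps = [0.3, -1.2]            # shared noise sample
x0_a = [1.0, -2.0]           # two different clean latents
x0_b = [-4.0, 0.5]

# Original schedule: a small amount of signal survives at the last step.
print(snr(orig[-1]) > 0.0)
# Zero-terminal SNR: the last step carries no signal at all.
print(snr(zt[-1]) == 0.0)
# Both clean latents collapse onto the same terminal latent (the noise itself).
print(forward_noise(x0_a, eps, zt[-1]) == forward_noise(x0_b, eps, zt[-1]))
```

Because the terminal latent no longer depends on the clean latent, any inversion that must pass through that final step needs a different way to tie the noise to the input, which is the gap the abstract says the proposed noise representation addresses.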

Authors

Hidir Yesiltepe
Pinar Yanardag

External Resources

View on arXiv (http://arxiv.org/abs/2506.08004v1)
