Bias for Action: Video Implicit Neural Representations with Bias Modulation
Journal: arXiv
Published Date: Jan 16, 2025
Abstract
We propose ActINR, a new continuous video modeling framework based on implicit
neural representations (INRs). At the core of our approach is the observation
that an INR can be viewed as a learnable dictionary, with the shapes of the
basis functions governed by the weights of the INR and their locations
governed by the biases. Given compact non-linear activation functions, we
hypothesize that an INR's biases are well suited to capturing motion across
images and to facilitating compact representations of video sequences.
Using these observations, we design ActINR to share INR weights across frames
of a video sequence, while using unique biases for each frame. We further model
the biases as the output of a separate INR conditioned on time index to promote
smoothness. By training the video INR and this bias INR together, we
demonstrate unique capabilities, including $10\times$ video slow motion,
$4\times$ spatial super-resolution combined with $2\times$ slow motion,
denoising, and video inpainting. ActINR performs remarkably well across
numerous video processing tasks (often achieving more than 6 dB improvement),
setting a new standard for continuous modeling of videos.
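
The weight-sharing idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `VideoINR` and `BiasINR`, the Gaussian activation, the layer sizes, and the choice to modulate only the first hidden layer are all assumptions made for illustration, and training is omitted. The sketch only shows the forward pass: weights are shared across all frames, while the hidden-layer biases come from a second small network that takes the time index as input.

```python
import math
import random


def gaussian(z):
    # A compact (localized) activation; assumed here, not taken from the paper.
    return math.exp(-z * z)


def linear(weights, biases, inputs):
    # Dense layer: one output per weight row.
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_matrix(rows, cols, rng, scale=1.0):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


class BiasINR:
    """Hypothetical bias network: time index t -> biases for the video INR."""

    def __init__(self, hidden, out_dim, rng):
        self.w1 = init_matrix(hidden, 1, rng)
        self.b1 = [0.0] * hidden
        self.w2 = init_matrix(out_dim, hidden, rng)
        self.b2 = [0.0] * out_dim

    def __call__(self, t):
        h = [gaussian(z) for z in linear(self.w1, self.b1, [t])]
        return linear(self.w2, self.b2, h)


class VideoINR:
    """Hypothetical video INR: (x, y, t) -> RGB, with time-dependent biases."""

    def __init__(self, hidden, rng):
        self.w1 = init_matrix(hidden, 2, rng, scale=4.0)  # shared by all frames
        self.w2 = init_matrix(3, hidden, rng)             # shared by all frames
        self.b2 = [0.0] * 3
        self.bias_inr = BiasINR(hidden=8, out_dim=hidden, rng=rng)

    def __call__(self, x, y, t):
        b1 = self.bias_inr(t)  # biases for this time index only
        h = [gaussian(z) for z in linear(self.w1, b1, [x, y])]
        return linear(self.w2, self.b2, h)


model = VideoINR(hidden=16, rng=random.Random(0))
weights_before = [row[:] for row in model.w1]

# Query the same pixel at two frame times and at a time in between.
rgb = model(0.1, -0.2, 0.0)
rgb_between = model(0.1, -0.2, 0.5)
rgb_next = model(0.1, -0.2, 1.0)
```

Because `t` is a continuous input to the bias network, the model can be queried at times between the original frames, which is the mechanism a slow-motion result would rely on in a setup like this one. In this sketch, moving from one frame time to another changes only the biases `b1`; the weight matrices `w1` and `w2` are never touched.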