Photoreal Scene Reconstruction from an Egocentric Device
Journal:
arXiv
Published Date:
Jun 4, 2025
Abstract
In this paper, we investigate the challenges of using egocentric
devices to photorealistically reconstruct scenes in high dynamic range. Existing
methods typically assume frame-rate 6DoF poses estimated by the
device's visual-inertial odometry system, which may neglect crucial details
necessary for pixel-accurate reconstruction. This study presents two
significant findings. Firstly, in contrast to mainstream work that treats the RGB
camera as a global-shutter, frame-rate camera, we emphasize the importance of
employing visual-inertial bundle adjustment (VIBA) to calibrate the precise
timestamps and motion of the rolling-shutter RGB camera in a high-frequency
trajectory format, which ensures an accurate calibration of the
physical properties of the rolling-shutter camera. Secondly, we incorporate a
physical image formation model into Gaussian Splatting, which effectively
accounts for sensor characteristics, including the rolling-shutter effect of
RGB cameras and the dynamic range measured by the sensors. Our proposed
formulation is applicable to widely used variants of the Gaussian Splatting
representation. We conduct a comprehensive evaluation of our pipeline using the
open-source Project Aria device under diverse indoor and outdoor lighting
conditions, and further validate it on a Meta Quest 3 device. Across all
experiments, we observe a consistent improvement of +1 dB in PSNR from
incorporating VIBA, with an additional +1 dB achieved through our proposed
image formation model. Our complete implementation, evaluation datasets, and
recording profile are available at
http://www.projectaria.com/photoreal-reconstruction/
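
To illustrate why a high-frequency trajectory matters for a rolling-shutter camera, the following is a minimal sketch and not the authors' implementation. It assigns a capture time to each image row and looks up the camera position at that time, instead of using one pose per frame as a global-shutter model would. The function names, the 1 kHz trajectory rate, the 16 ms readout time, and the linear interpolation of positions are all assumptions made for illustration; rotations are omitted and would need a proper interpolation such as slerp.

```python
import numpy as np


def row_timestamps(t_first_row, readout_time, num_rows):
    """Capture time of each image row for a rolling-shutter sensor.

    Assumes rows are read out at a constant rate and that readout_time is
    the delay between the first and the last row (an illustrative model).
    """
    rows = np.arange(num_rows)
    return t_first_row + readout_time * rows / (num_rows - 1)


def interpolate_positions(query_times, traj_times, traj_positions):
    """Linearly interpolate 3D positions of a trajectory at the query times."""
    return np.stack(
        [np.interp(query_times, traj_times, traj_positions[:, k]) for k in range(3)],
        axis=1,
    )


# Hypothetical 1 kHz trajectory: the camera moves at 0.5 m/s along x.
traj_times = np.arange(0.0, 0.1, 0.001)
traj_positions = np.stack(
    [0.5 * traj_times, np.zeros_like(traj_times), np.zeros_like(traj_times)],
    axis=1,
)

# Hypothetical frame: first row captured at t = 20 ms, 16 ms readout, 5 rows.
t_rows = row_timestamps(t_first_row=0.020, readout_time=0.016, num_rows=5)
p_rows = interpolate_positions(t_rows, traj_times, traj_positions)

# Camera displacement between the first and last row of the same frame.
row_shift = p_rows[-1] - p_rows[0]
```

Under these assumed numbers the camera moves about 8 mm along x during a single frame's readout, which is the kind of per-row motion a single frame-rate pose cannot represent.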