R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation
Journal:
arXiv
Published Date:
Jun 9, 2025
Abstract
Validating autonomous driving (AD) systems requires diverse and
safety-critical testing, making photorealistic virtual environments essential.
Traditional simulation platforms, while controllable, are resource-intensive to
scale and often suffer from a domain gap with real-world data. In contrast,
neural reconstruction methods like 3D Gaussian Splatting (3DGS) offer a
scalable solution for creating photorealistic digital twins of real-world
driving scenes. However, they struggle with dynamic object manipulation and
reusability as their per-scene optimization-based methodology tends to result
in incomplete object models with integrated illumination effects. This paper
introduces R3D2, a lightweight, one-step diffusion model designed to overcome
these limitations and enable realistic insertion of complete 3D assets into
existing scenes by generating plausible rendering effects-such as shadows and
consistent lighting-in real time. This is achieved by training R3D2 on a novel
dataset: 3DGS object assets are generated from in-the-wild AD data using an
image-conditioned 3D generative model, and then synthetically placed into
neural rendering-based virtual environments, allowing R3D2 to learn realistic
integration. Quantitative and qualitative evaluations demonstrate that R3D2
significantly enhances the realism of inserted assets, enabling use-cases like
text-to-3D asset insertion and cross-scene/dataset object transfer, allowing
for true scalability in AD validation. To promote further research in scalable
and realistic AD simulation, we will release our dataset and code, see
https://research.zenseact.com/publications/R3D2/.