DASNeRF: depth consistency optimization, adaptive sampling, and hierarchical structural fusion for sparse view neural radiance fields.
Journal:
PLOS ONE
PMID:
40354424
Abstract
To address the significant loss of detail that Neural Radiance Fields (NeRF) suffer under sparse-view input, this paper proposes the DASNeRF framework, which aims to generate high-detail novel views from a limited number of input viewpoints. To overcome the limitations of few-shot NeRF, including insufficient depth information and detail loss, DASNeRF introduces accurate depth priors and employs a depth constraint strategy that combines relative depth ordering fidelity regularization with depth structural consistency regularization, maintaining reconstruction accuracy even with sparse input views. The depth priors provide high-quality depth data through a more accurate monocular depth estimation model, enhancing the reconstruction capability and stability of the model. The depth ordering fidelity regularization guides the network to learn relative depth relationships from local depth ranking priors, reducing the blurring caused by inaccurate depth estimates. The depth structural consistency regularization maintains global depth consistency by enforcing continuity across neighboring depth pixels. Together, these depth constraints improve DASNeRF's performance in complex scenes, making 3D reconstruction under sparse views more accurate and natural. In addition, we employ a three-layer sampling strategy, consisting of coarse, optimized, and fine sampling, to better capture details in key regions: in the optimized sampling phase, the sampling-point density in key regions is adaptively increased while sampling in low-priority regions is reduced, improving the accuracy of detail capture. To alleviate overfitting, we propose an MLP structure with per-layer input fusion. This design preserves the model's ability to perceive detail while effectively avoiding overfitting: each layer's input combines the output features of the previous layer with the processed five-dimensional input (position and viewing direction), further enhancing fine-detail reconstruction. Experimental results show that DASNeRF outperforms state-of-the-art methods on the LLFF and DTU datasets, achieving better PSNR, SSIM, and LPIPS scores. The reconstructed details and visual quality are significantly improved, demonstrating DASNeRF's potential for 3D reconstruction under sparse-view conditions.
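The abstract describes the relative depth ordering fidelity regularization only at a high level. As a rough illustration of the idea, the following PyTorch sketch penalizes rendered-depth pixel pairs whose ordering contradicts the monocular prior. The function name, the random (image-wide rather than strictly local-patch) pair sampling, and the `margin` threshold are assumptions for illustration, not the paper's exact formulation.

```python
import torch

def depth_ranking_loss(rendered_depth, prior_depth, num_pairs=4096, margin=1e-4):
    """Hinge-style ranking loss over randomly sampled pixel pairs.

    rendered_depth, prior_depth: (H, W) tensors for one training view.
    Only pairs whose prior ordering is unambiguous (difference above
    `margin`) contribute, so noisy monocular estimates are tolerated.
    """
    h, w = rendered_depth.shape
    device = rendered_depth.device
    idx_a = torch.randint(0, h * w, (num_pairs,), device=device)
    idx_b = torch.randint(0, h * w, (num_pairs,), device=device)

    r = rendered_depth.reshape(-1)
    p = prior_depth.reshape(-1)

    prior_diff = p[idx_a] - p[idx_b]
    sign = torch.sign(prior_diff)
    confident = (prior_diff.abs() > margin).float()

    # Penalize pairs whose rendered ordering disagrees with the prior ordering.
    violation = torch.relu(-sign * (r[idx_a] - r[idx_b]))
    return (violation * confident).sum() / confident.sum().clamp(min=1.0)
```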
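Similarly, one common way to realize depth structural consistency, i.e., enforcing continuity across neighboring depth pixels, is to match first-order depth differences between the rendered depth map and the prior. The minimal sketch below assumes that formulation; the paper's exact loss may differ.

```python
def depth_structure_loss(rendered_depth, prior_depth):
    """Match horizontal and vertical depth differences between the rendered
    depth map and the monocular prior, so that continuity between
    neighboring pixels follows the prior's structure.

    rendered_depth, prior_depth: (H, W) PyTorch tensors for one view.
    """
    dx_r = rendered_depth[:, 1:] - rendered_depth[:, :-1]
    dy_r = rendered_depth[1:, :] - rendered_depth[:-1, :]
    dx_p = prior_depth[:, 1:] - prior_depth[:, :-1]
    dy_p = prior_depth[1:, :] - prior_depth[:-1, :]
    return (dx_r - dx_p).abs().mean() + (dy_r - dy_p).abs().mean()
```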
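The optimized sampling phase adaptively concentrates samples in key regions along each ray. A minimal sketch of that idea, building on NeRF's standard inverse-transform (hierarchical) sampling, is given below; the `boost` exponent used to sharpen the coarse-pass weights is a hypothetical knob for illustration, not a parameter confirmed by the paper.

```python
import torch

def adaptive_resample(bins, weights, n_samples, boost=2.0):
    """Inverse-transform sampling along one ray, with the coarse-pass
    weights sharpened so extra samples land in high-weight (key) regions
    and fewer in low-priority regions.

    bins:    (n_bins + 1,) depth bin edges from the coarse pass
    weights: (n_bins,)     volume-rendering weights from the coarse pass
    boost:   hypothetical sharpening exponent controlling how strongly
             sampling concentrates on key regions
    """
    pdf = weights.clamp(min=1e-5) ** boost
    pdf = pdf / pdf.sum()
    cdf = torch.cat([torch.zeros(1, device=pdf.device), torch.cumsum(pdf, dim=0)])

    u = torch.rand(n_samples, device=pdf.device)
    idx = torch.searchsorted(cdf, u, right=True).clamp(1, len(bins) - 1)

    # Place each sample by linear interpolation inside its selected bin.
    lo, hi = cdf[idx - 1], cdf[idx]
    t = (u - lo) / (hi - lo).clamp(min=1e-5)
    return bins[idx - 1] + t * (bins[idx] - bins[idx - 1])
```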
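Finally, the MLP with per-layer input fusion can be pictured as follows: every hidden layer receives the previous layer's features concatenated with the processed five-dimensional input, rather than a single skip connection. The class name, layer widths, and single output head in this sketch are illustrative assumptions; DASNeRF's actual architecture (for example, separate density and color branches) may differ.

```python
import torch
import torch.nn as nn

class PerLayerFusionMLP(nn.Module):
    """MLP in which every hidden layer is fed both the previous layer's
    features and the (encoded) 5D input of position and viewing direction."""

    def __init__(self, in_dim, hidden_dim=256, n_layers=8, out_dim=4):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(in_dim, hidden_dim)]
            + [nn.Linear(hidden_dim + in_dim, hidden_dim) for _ in range(n_layers - 1)]
        )
        self.out = nn.Linear(hidden_dim, out_dim)  # e.g. RGB + volume density

    def forward(self, x):
        h = torch.relu(self.layers[0](x))
        for layer in self.layers[1:]:
            # Per-layer input fusion: concatenate the encoded input again.
            h = torch.relu(layer(torch.cat([h, x], dim=-1)))
        return self.out(h)
```

Re-injecting the encoded input at every layer keeps low-level positional detail available deep in the network, which is the stated motivation for the design: detail perception is preserved without enlarging the network in a way that invites overfitting.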