DASNeRF: depth consistency optimization, adaptive sampling, and hierarchical structural fusion for sparse view neural radiance fields.

Journal: PloS one
PMID:

Abstract

Neural Radiance Fields (NeRF) lose significant detail when trained on sparse-view inputs. To address this, this paper proposes DASNeRF, a framework that generates high-detail novel views from a limited number of input viewpoints. To overcome the insufficient depth information and detail loss of few-shot NeRF, DASNeRF introduces accurate depth priors and employs a depth constraint strategy that combines relative depth ordering fidelity regularization with depth structural consistency regularization. Together, these methods maintain reconstruction accuracy even with sparse input views. The depth priors are obtained from a more accurate monocular depth estimation model and supply high-quality depth data, enhancing the reconstruction capability and stability of the model. The depth ordering fidelity regularization uses local depth ranking priors to guide the network to learn relative depth relationships, reducing blurring caused by inaccurate depth estimation. The depth structural consistency regularization maintains global depth consistency by enforcing continuity across neighboring depth pixels. These depth constraints enhance DASNeRF's performance in complex scenes, making 3D reconstruction under sparse views more accurate and natural. In addition, we use a three-layer sampling strategy consisting of coarse sampling, optimized sampling, and fine sampling to better capture details in key regions. In the optimized sampling phase, the sampling point density is adaptively increased in key regions and reduced in low-priority regions, enhancing detail capture accuracy. To alleviate overfitting, we propose an MLP structure with per-layer input fusion, which preserves the model's detail perception ability while effectively avoiding overfitting. Specifically, each layer's input includes the output features from the previous layer and incorporates processed five-dimensional information, further enhancing fine detail reconstruction. Experimental results show that DASNeRF outperforms state-of-the-art methods on the LLFF and DTU datasets, achieving better PSNR, SSIM, and LPIPS scores. The reconstructed details and visual quality are significantly improved, demonstrating DASNeRF's potential for 3D reconstruction under sparse-view conditions.
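
The two depth regularizers described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the hinge form with a `margin` parameter, the explicit array of pixel pairs, and the absolute-difference smoothness term are all assumptions made for illustration. The sketch only shows the general idea of penalizing depth-order disagreements with a monocular prior and discouraging jumps between neighboring depth pixels.

```python
import numpy as np


def depth_ranking_loss(rendered, prior, pairs, margin=1e-4):
    """Hypothetical hinge-style ranking loss (an assumption, not the paper's formula).

    rendered: 1-D array of depths rendered by the radiance field.
    prior:    1-D array of monocular depth estimates (any scale).
    pairs:    non-empty (N, 2) integer array of pixel index pairs to compare.
    """
    i, j = pairs[:, 0], pairs[:, 1]
    # +1 where the prior says pixel i is farther than j, -1 where nearer, 0 if tied.
    sign = np.sign(prior[i] - prior[j])
    # Penalize pairs whose rendered order does not agree with the prior by `margin`.
    violation = np.maximum(0.0, margin - sign * (rendered[i] - rendered[j]))
    # Tied pairs carry no ordering information, so they contribute nothing.
    violation = np.where(sign == 0, 0.0, violation)
    return float(violation.mean())


def depth_continuity_loss(depth_map):
    """Hypothetical smoothness term: mean absolute difference between neighbors."""
    dx = np.abs(depth_map[:, 1:] - depth_map[:, :-1])  # horizontal neighbors
    dy = np.abs(depth_map[1:, :] - depth_map[:-1, :])  # vertical neighbors
    return float(dx.mean() + dy.mean())


if __name__ == "__main__":
    prior = np.array([10.0, 20.0, 30.0])       # prior depths, arbitrary scale
    pairs = np.array([[0, 1], [1, 2], [0, 2]])
    same_order = np.array([1.0, 2.0, 3.0])     # agrees with the prior ordering
    flipped = np.array([3.0, 2.0, 1.0])        # contradicts the prior ordering
    print(depth_ranking_loss(same_order, prior, pairs))
    print(depth_ranking_loss(flipped, prior, pairs))
    print(depth_continuity_loss(np.array([[0.0, 1.0], [0.0, 1.0]])))
```

Because the ranking term compares only the sign of depth differences, it does not depend on the absolute scale of the monocular prior, which is one common reason for preferring relative ordering over direct depth supervision.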

Authors

  • Yongshuo Zhang
    School of Mechanical and Ocean Engineering, Jiangsu Ocean University, Lianyungang, 222005, China.
  • Guangyuan Zhang
    School of Information Science and Electric Engineering, Shandong Jiaotong University, Jinan, 250357, China.
  • Kefeng Li
    School of Medicine, University of California San Diego, CA 92093, USA.
  • Zhenfang Zhu
    School of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan 250357, China.
  • Peng Wang
    Neuroengineering Laboratory, School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin, China.
  • Zhenfei Wang
    School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China. Electronic address: iezfwang@zzu.edu.cn.
  • Chen Fu
    Institute of Modern Biopharmaceuticals, College of Pharmaceutical Sciences, Southwest University, Chongqing 400715, China. fuchen0794@swu.edu.cn.
  • Xiaotong Li
    School of Energy and Environment, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong.
  • Zhiming Fan
    School of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan, Shandong, China.
  • Yongpeng Zhao
    College of Mechanical and Electrical Engineering, Sichuan Agricultural University, Xin Kang Road, Yucheng District, Ya'an 625014, PR China. Electronic address: 14788@sicau.edu.cn.