Digging into Depth Priors for Outdoor Neural Radiance Fields
ACM Multimedia 2023
- Chen Wang1,2
- Jiadai Sun2,4
- Lina Liu3,2
- Chenming Wu2
- Zhelun Shen2
- Dayan Wu5
- Yuchao Dai4
- Liangjun Zhang2
- 1Tsinghua University
- 2Baidu Research
- 3Zhejiang University
- 4Northwestern Polytechnical University
- 5Chinese Academy of Sciences
Abstract
Neural Radiance Fields (NeRF) have demonstrated impressive performance in vision and graphics tasks, such as novel view synthesis and immersive reality. However, the shape-radiance ambiguity of radiance fields remains a challenge, especially in the sparse-viewpoint setting. Recent work resorts to integrating depth priors into outdoor NeRF training to alleviate the issue. However, the criteria for selecting depth priors and the relative merits of different priors have not been thoroughly investigated. Moreover, the relative merits of the different ways of applying these priors also remain unexplored. In this paper, we provide a comprehensive study and evaluation of applying depth priors to outdoor neural radiance fields, covering common depth sensing technologies and the main ways of using them. Specifically, we conduct extensive experiments with two representative NeRF methods equipped with four commonly used depth priors and different depth usages on two widely used outdoor datasets. Our experimental results reveal several interesting findings that can benefit practitioners and researchers in training their NeRF models with depth priors.
Experiment Setup
Depth Priors
Depth Supervision Type
Direct: MSE, L1; Indirect: KL, URF (see the loss sketch after this setup list)
Datasets
KITTI (5 sequences) and Argoverse (3 sequences)
NeRF Methods
NeRF++, Mip-NeRF 360, Instant-NGP
Depth Methods
MFFNet (completion), BTS (monocular), CFNet and PCWNet (stereo)
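To make the two supervision families concrete, below is a minimal PyTorch-style sketch, not the exact formulations used in the paper: `direct_depth_loss` compares the rendered (expected) depth per ray to the prior with MSE or L1; `kl_depth_loss` follows a DS-NeRF-style KL term that pulls the ray termination distribution toward a narrow Gaussian around the prior depth; `urf_lineofsight_loss` is a simplified version of the Urban Radiance Fields line-of-sight loss. Tensor names and shapes (`weights`, `z_vals`, `deltas` of shape [rays, samples]; depths of shape [rays]) are assumptions for illustration.

```python
import torch

def direct_depth_loss(rendered_depth, prior_depth, mode="l1"):
    """Direct supervision: compare the rendered (expected) depth per ray to the prior depth."""
    if mode == "mse":
        return torch.mean((rendered_depth - prior_depth) ** 2)
    return torch.mean(torch.abs(rendered_depth - prior_depth))

def kl_depth_loss(weights, z_vals, deltas, prior_depth, sigma=0.5):
    """DS-NeRF-style indirect supervision (simplified): encourage the ray termination
    distribution (rendering weights over sample depths z_vals) to concentrate
    inside a narrow Gaussian centred at the prior depth."""
    gauss = torch.exp(-(z_vals - prior_depth.unsqueeze(-1)) ** 2 / (2 * sigma ** 2))
    return torch.mean(torch.sum(-torch.log(weights + 1e-6) * gauss * deltas, dim=-1))

def urf_lineofsight_loss(weights, z_vals, prior_depth, eps=0.5):
    """URF-style line-of-sight supervision (simplified): weights should be near zero in
    free space away from the prior depth and sum to one inside a small band around it."""
    near = (torch.abs(z_vals - prior_depth.unsqueeze(-1)) < eps).float()
    empty_loss = torch.mean((weights * (1.0 - near)) ** 2)
    surface_loss = torch.mean((torch.sum(weights * near, dim=-1) - 1.0) ** 2)
    return empty_loss + surface_loss
```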
Procedure
For each sequence, we run every NeRF method with each of the listed depth priors in two settings: sparse input viewpoints and dense input viewpoints. We evaluate both novel view synthesis quality and depth estimation quality with the corresponding metrics (a minimal sketch is given below). We find that depth priors are essential in the sparse-viewpoint setting.
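As a rough illustration of the evaluation, the sketch below computes PSNR for rendered images and RMSE for rendered depth against valid ground-truth depth (e.g. projected LiDAR). Function names are hypothetical, and perceptual metrics such as SSIM and LPIPS, which are commonly reported alongside PSNR, are omitted here for brevity.

```python
import torch

def psnr(pred_rgb, gt_rgb):
    """Peak signal-to-noise ratio for images with values in [0, 1]."""
    mse = torch.mean((pred_rgb - gt_rgb) ** 2)
    return -10.0 * torch.log10(mse)

def depth_rmse(pred_depth, gt_depth):
    """RMSE over pixels that have a valid (positive) ground-truth depth."""
    mask = gt_depth > 0
    return torch.sqrt(torch.mean((pred_depth[mask] - gt_depth[mask]) ** 2))
```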
Experiment Results
Comparison on a sequence trained with only RGB inputs (top) versus with monocular depth supervision (bottom)
Comparison of different depth priors
Point Cloud Visualization
Findings
- 1. Monocular depth works well in the sparse-view setting and comes at no extra sensing cost; it achieves quality comparable to ground-truth LiDAR depth.
- 2. Depth supervision is a worthwhile option in the dense-view setting, where it improves the geometric quality of NeRFs.
- 3. The denser the depth prior, the larger the quality gain it brings.
- 4. A simple loss function combined with basic depth filtering is sufficient (see the sketch below).
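To illustrate what finding 4 refers to, here is a minimal sketch of depth filtering under two assumed rules, not the paper's exact procedure: discard prior depths outside a plausible range, and optionally cross-check the prior against another depth source (e.g. sparse LiDAR), keeping only pixels where the two agree. The function name, thresholds, and cross-check rule are illustrative assumptions.

```python
import torch

def filter_depth_prior(prior_depth, max_depth=80.0, ref_depth=None, rel_tol=0.1):
    """Return a boolean mask of pixels whose prior depth is used for supervision.
    Rule 1: keep depths that are positive and below a maximum range.
    Rule 2 (optional): keep depths that agree with a reference depth within rel_tol."""
    mask = (prior_depth > 0) & (prior_depth < max_depth)
    if ref_depth is not None:
        has_ref = ref_depth > 0
        rel_err = torch.abs(prior_depth - ref_depth) / ref_depth.clamp(min=1e-6)
        # keep pixels with no reference, or whose prior agrees with the reference
        mask &= (~has_ref) | (rel_err < rel_tol)
    return mask
```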
Citation
Acknowledgement
The website template was borrowed from Michaël Gharbi and Ben Mildenhall.