Digging into Depth Priors for Outdoor Neural Radiance Fields
ACM Multimedia 2023
Neural Radiance Fields (NeRF) have demonstrated impressive performance in vision and graphics tasks, such as novel view synthesis and immersive reality. However, the shape-radiance ambiguity of radiance fields remains a challenge, especially in the sparse-viewpoint setting. Recent work integrates depth priors into outdoor NeRF training to alleviate the issue. However, the criteria for selecting depth priors and the relative merits of different priors have not been thoroughly investigated. Moreover, the relative merits of different ways of using depth priors remain unexplored. In this paper, we provide a comprehensive study and evaluation of employing depth priors in outdoor neural radiance fields, covering common depth sensing technologies and most ways of applying them. Specifically, we conduct extensive experiments with two representative NeRF methods equipped with four commonly used depth priors and different depth usages on two widely used outdoor datasets. Our experimental results reveal several interesting findings that can potentially benefit practitioners and researchers in training their NeRF models with depth priors.
Depth supervision type: Direct (MSE, L1); Indirect (KL, URF)
Datasets: KITTI (5 sequences) and Argoverse (3 sequences)
NeRF methods: NeRF++, MipNeRF-360, Instant-NGP
Depth priors: MFFNet (completion), BTS (monocular), CFNet and PCWNet (stereo)
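To make the "direct" supervision type concrete, here is a minimal NumPy sketch, not the paper's implementation. It assumes the rendered depth of a ray is the weight-averaged sample distance from volume rendering, and compares it to the depth prior with an L1 or MSE penalty. The function names `render_depth` and `direct_depth_loss` are hypothetical, and the indirect variants (KL, URF) are not covered.

```python
import numpy as np

def render_depth(weights, t_vals):
    """Expected depth per ray: sum over samples of weight * sample distance.

    weights, t_vals: arrays of shape (num_rays, num_samples).
    """
    return np.sum(weights * t_vals, axis=-1)

def direct_depth_loss(pred_depth, prior_depth, kind="l1"):
    """Direct depth supervision: penalize the gap between rendered and prior depth."""
    diff = pred_depth - prior_depth
    if kind == "l1":
        return float(np.mean(np.abs(diff)))
    if kind == "mse":
        return float(np.mean(diff ** 2))
    raise ValueError(f"unknown loss kind: {kind}")

# Toy example: two rays with three samples each.
weights = np.array([[0.0, 1.0, 0.0],
                    [0.5, 0.5, 0.0]])
t_vals = np.array([[1.0, 2.0, 3.0],
                   [1.0, 2.0, 3.0]])
prior = np.array([2.0, 2.5])

pred = render_depth(weights, t_vals)
print(direct_depth_loss(pred, prior, kind="l1"))
print(direct_depth_loss(pred, prior, kind="mse"))
```

In a training loop, a term like this would typically be added to the photometric loss with a scalar weight; that weight is a tuning choice not specified here.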
We apply different methods for novel view synthesis on each sequence with the listed depth priors in two settings: sparse input viewpoints and dense input viewpoints. We evaluate both novel view synthesis quality and depth estimation quality with corresponding metrics. We find that depth priors are essential for sparse viewpoints.
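The text above does not name the metrics, so the following sketch only illustrates what such an evaluation might look like. It assumes PSNR for image quality and absolute relative error (Abs Rel) for depth quality, two commonly used choices; the function names and the rule of ignoring pixels without ground-truth depth are assumptions.

```python
import numpy as np

def psnr(pred_img, gt_img):
    """Peak signal-to-noise ratio for images with values in [0, 1]."""
    mse = np.mean((pred_img - gt_img) ** 2)
    return float(-10.0 * np.log10(mse))

def abs_rel(pred_depth, gt_depth):
    """Mean absolute relative depth error over pixels with valid ground truth.

    Pixels with gt_depth <= 0 are treated as missing (assumption).
    """
    valid = gt_depth > 0
    return float(np.mean(np.abs(pred_depth[valid] - gt_depth[valid]) / gt_depth[valid]))

# Toy example.
pred_img = np.full((4, 4, 3), 0.5)
gt_img = np.full((4, 4, 3), 0.6)
pred_depth = np.array([2.0, 4.0, 5.0])
gt_depth = np.array([1.0, 4.0, 0.0])  # last pixel has no ground truth

print(psnr(pred_img, gt_img))
print(abs_rel(pred_depth, gt_depth))
```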
Comparison on a sequence with only RGB inputs (top) and mono depth (bottom)
Comparison of different depth priors
Point cloud visualization
1. Monocular depth works well for sparse views and comes at no extra cost; it achieves quality comparable to ground-truth LiDAR depth.
2. Depth supervision remains an option for dense views, where it improves the geometric quality of NeRFs.
3. The denser the depth prior, the better the resulting quality.
4. A simple loss function and depth filtering are sufficient.
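As an illustration of the last finding, the sketch below combines a basic depth filter with a plain L1 loss. The filtering rule (keep only finite, positive prior depths within a maximum range) and the `max_depth` parameter are assumptions chosen for illustration, not the criteria used in the study.

```python
import numpy as np

def filter_depth(prior_depth, max_depth=80.0):
    """Boolean mask of usable prior depths: finite, positive, and within range."""
    return np.isfinite(prior_depth) & (prior_depth > 0) & (prior_depth <= max_depth)

def masked_l1_depth_loss(pred_depth, prior_depth, mask):
    """L1 depth loss averaged over the pixels kept by the mask."""
    if not np.any(mask):
        return 0.0  # no usable depth in this batch
    return float(np.mean(np.abs(pred_depth[mask] - prior_depth[mask])))

# Toy example: five pixels, three of which carry an unusable prior.
prior = np.array([0.0, 10.0, 200.0, np.nan, 5.0])
pred = np.array([1.0, 12.0, 100.0, 3.0, 5.0])

mask = filter_depth(prior, max_depth=80.0)
print(mask)
print(masked_l1_depth_loss(pred, prior, mask))
```

Filtering before the loss keeps unreliable prior values from pulling the rendered geometry toward them, which is one reason a simple loss can be enough.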