A signed distance function (SDF) is a useful representation for continuous-space geometry and supports many downstream operations, including rendering, collision checking, and mesh generation. Hence, reconstructing an SDF accurately and efficiently from image observations is a fundamental problem. Recently, neural implicit SDF (SDF-NeRF) techniques, trained via volumetric rendering, have gained significant attention. Compared to earlier truncated SDF (TSDF) fusion algorithms, which rely on depth maps and voxelize continuous space, SDF-NeRF enables continuous-space SDF reconstruction with better geometric and photometric accuracy. However, the accuracy and convergence speed of scene-level SDF reconstruction still fall short of what many applications require. With the advent of 3D Gaussian Splatting (3DGS) as an explicit representation with excellent rendering quality and speed, several works have sought to improve SDF-NeRF by introducing consistency losses on depth and surface normals between 3DGS and SDF-NeRF. However, such loss-level connections alone yield only incremental improvements. We propose a novel neural implicit SDF, "SplatSDF", that fuses 3DGS and SDF-NeRF at the architecture level, significantly boosting geometric and photometric accuracy as well as convergence speed. SplatSDF relies on 3DGS as input only during training, and retains the complexity and efficiency of the original SDF-NeRF during inference. Our method outperforms state-of-the-art SDF-NeRF models in both geometric and photometric evaluation as of the time of submission.
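For readers new to the representation, here is a minimal sketch of an analytic SDF (a unit sphere; not the paper's learned model) and the collision check it enables. The function and its parameters are illustrative assumptions, not part of SplatSDF:

```python
import numpy as np

def sphere_sdf(x, center=np.zeros(3), radius=1.0):
    # Signed distance to a sphere: negative inside the surface,
    # zero on it, positive outside.
    return np.linalg.norm(x - center) - radius

# Collision check: a point penetrates the object iff its SDF <= 0.
print(sphere_sdf(np.array([0.5, 0.0, 0.0])) <= 0)  # True  (inside)
print(sphere_sdf(np.array([2.0, 0.0, 0.0])) <= 0)  # False (outside)
```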
Our SplatSDF takes posed RGB images and a pretrained 3DGS to train an SDF-NeRF. We use 3DGS-rendered depths to identify an anchor point and shift the closest query point onto it. With a shared hash encoder, we extract query-point SDF embeddings \(e_{sdf}\) and 3DGS embeddings \(e_{gs}\); a 3DGS aggregator merges the 3DGS attributes: mean \(\mu\), covariance \(\Sigma\), color \(c\), and spherical harmonics \(SH\). We propose a novel surface 3DGS fusion that fuses \(e_{gs}\) and \(e_{sdf}\) only around the anchor point and regresses the SDF. A density function then converts the SDF to a per-point density \(\sigma(x)\). We take the geometric features \(g(x)\), the surface normal \(n(x)\) derived from the SDF, the query-point coordinates \(x\), and the viewing direction \(v\) to estimate a per-point color \(c(x)\), and obtain the per-pixel color \(\hat{C}\) by volumetric rendering, which is supervised with the input images.
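To make the last two steps concrete, here is a minimal PyTorch sketch of the SDF-to-density conversion and the volumetric rendering that produces \(\hat{C}\). The caption only states that "a density function" converts SDF to \(\sigma(x)\); the VolSDF-style Laplace-CDF density below, the parameter values (`alpha`, `beta`), and the 64-sample toy ray are assumptions for illustration, not the paper's exact design:

```python
import torch

def laplace_density(sdf, alpha=100.0, beta=0.01):
    # Convert SDF to volume density via the Laplace CDF, a common
    # choice in SDF-NeRF pipelines (ASSUMPTION: the paper's exact
    # density function may differ).
    s = -sdf
    return alpha * torch.where(
        s <= 0, 0.5 * torch.exp(s / beta), 1.0 - 0.5 * torch.exp(-s / beta)
    )

def render_pixel(sigma, rgb, deltas):
    # Standard volumetric-rendering quadrature: alpha-composite the
    # per-sample colors along one ray to get the pixel color C_hat.
    alpha = 1.0 - torch.exp(-sigma * deltas)                    # (N,)
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0
    )                                                           # accumulated transmittance
    weights = trans * alpha                                     # (N,)
    return (weights[:, None] * rgb).sum(dim=0)                  # (3,)

# Toy ray: 64 samples whose SDF crosses the surface, with dummy colors.
sdf = torch.linspace(0.5, -0.5, 64)
rgb = torch.rand(64, 3)
deltas = torch.full((64,), 1.0 / 64)
c_hat = render_pixel(laplace_density(sdf), rgb, deltas)
```

The sharpness parameter `beta` controls how tightly density concentrates around the zero level set: as `beta` shrinks, rendering weights peak at the surface, which is what lets photometric supervision of \(\hat{C}\) sharpen the recovered geometry.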
Qualitative comparison: Neuralangelo vs. Ours.
@misc{splatsdf,
      title={SplatSDF: Boosting Neural Implicit SDF via Gaussian Splatting Fusion},
      author={Runfa Blark Li and Keito Suzuki and Bang Du and Ki Myung Brian Lee and Nikolay Atanasov and Truong Nguyen},
      year={2024},
      eprint={2411.15468},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.15468},
}