Freeze-Frame with StaticNeRF: Uncertainty-Guided NeRF Map Reconstruction in Dynamic Scenes
IEEE Robotics and Automation Letters (RA-L) 2025

Juhui Lee INHA University dlwngml6635@gmail.com
Geonmo Yang INHA University ygm7422@gmail.com
Seungjun Ma Hyundai Robotics richard7714@hyundai.com
Younggun Cho INHA University yg.cho@inha.ac.kr

StaticNeRF enables high-fidelity dynamic object removal in diverse real-world scenes.

Teaser comparisons (Input vs. Ours vs. baseline):
Case A: Opposing camera motion to object movement (vs. NeRF-W)
Case B: Dominant camera motion, minor object movement (vs. WildGaussians)
Case C: Persistent crowd (vs. NeRF On-the-go)

Abstract

Recent advances in neural representations have shown great promise for enabling high-fidelity dense mapping in robotics. Given the inherently dynamic nature of real-world environments, many studies have focused on learning static scene representations from dynamic observations. However, existing methods often fail to remove subtly moving objects and struggle to recover occluded static backgrounds, leading to critical limitations in practice. Furthermore, when static neural maps are used for localization, dynamic content in query images must be handled effectively. To overcome these challenges, we propose a static neural mapping framework that is robust to diverse dynamic environments and capable of processing dynamic content during localization. We evaluate our approach through extensive experiments on both public and in-house datasets. Our method improves both dynamic object removal and localization robustness under dynamic conditions, representing a significant step toward resilient robot navigation in real-world environments.

Overview of our proposed static neural rendering pipeline. The top row illustrates stage-wise scene evolution during training, and the bottom row presents the corresponding method components aligned with each stage. Our framework follows a curriculum learning strategy consisting of three functional stages: (1) Initial Training learns coarse geometric and photometric representations through uniform sampling. (2) Uncertainty Compensation incorporates a CNN-based uncertainty network to resolve ambiguities that NeRF alone cannot, enabling better disentanglement of the static and transient fields through joint optimization. (3) High-fidelity Rendering adopts a data-driven sampling strategy to refine rendering quality.
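To give a feel for the uncertainty compensation stage, here is a minimal sketch of an uncertainty-weighted photometric loss in the spirit of NeRF-W's transient modeling, where high-uncertainty (likely dynamic) pixels are down-weighted. The exact loss, weighting, and regularizer below are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def uncertainty_weighted_loss(pred, target, beta, reg=0.01):
    """Per-ray photometric loss weighted by predicted uncertainty beta
    (hypothetical sketch, NeRF-W-style). Rays the uncertainty network
    flags as unreliable contribute less to the static reconstruction."""
    residual = np.sum((pred - target) ** 2, axis=-1)   # per-ray squared color error
    nll = residual / (2.0 * beta ** 2) + np.log(beta)  # Gaussian negative log-likelihood
    return np.mean(nll) + reg * np.mean(beta)          # mild penalty against inflating beta

# One well-reconstructed static ray, one ray occluded by a moving object.
pred = np.array([[1.0, 0.0, 0.0], [0.2, 0.2, 0.2]])
target = np.array([[1.0, 0.0, 0.0], [0.9, 0.9, 0.9]])
beta = np.array([0.1, 2.0])  # confident static pixel vs. uncertain dynamic pixel
```

With a large beta on the dynamic ray, its residual is heavily down-weighted, so the transient content does not corrupt the static field; the log-beta term keeps the network from assigning high uncertainty everywhere.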

Static Neural Representation

We compared our method against baseline methods and demonstrated that our approach reliably removes dynamic objects while maintaining high rendering quality.

Qualitative comparisons: Ours vs. NeRF-W · Ours vs. WildGaussians · Ours vs. NeRF On-the-go

In addition, we observed that our method performs consistently well across a variety of datasets.

Input vs. Ours (three additional datasets)

We simulated low-light environments through pixel scaling during preprocessing, and trained an appearance embedding to capture illumination variations. Based on this design, we demonstrated that our method achieves robust dynamic object removal even under diverse lighting conditions.
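The paper does not spell out the pixel-scaling step, but darkening via intensity scaling can be sketched as below; the scale factor, gamma value, and function name are illustrative assumptions rather than the exact preprocessing used.

```python
import numpy as np

def simulate_low_light(image, scale=0.4, gamma=2.2):
    """Hypothetical low-light simulation: scale pixel intensities in
    approximately linear space, then convert back to display space.
    `scale` and `gamma` are illustrative values, not the paper's settings."""
    img = image.astype(np.float32) / 255.0
    linear = img ** gamma               # sRGB -> approximate linear intensity
    darkened = linear * scale           # reduce scene illumination uniformly
    out = darkened ** (1.0 / gamma)     # back to display (gamma-encoded) space
    return (out * 255.0).clip(0, 255).astype(np.uint8)
```

Scaling in linear space (rather than directly on the encoded pixels) keeps the darkening roughly photometrically consistent, which is what an appearance embedding trained on illumination variation would be expected to absorb.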

Lighting conditions: Normal · Low Light · Darkest

Acknowledgements

We sincerely thank the creators of the Replica, On-the-go, Wild-SLAM, and Bonn datasets for providing high-quality resources that made our evaluations possible. We also acknowledge the authors of NeRF-W and the other baseline methods for their great work and for openly releasing their code, which significantly supported our study. Finally, we would like to express our gratitude to the SPARO lab members for their support and effort during the collection of our own dataset. ❤️