NeRF Editing and Inpainting Techniques: Experiments and Qualitative results

4. Experiments

For our experiments, we select dynamic scenes from the Nvidia Dynamic Scenes Dataset [35]. Scenes in this dataset are captured with a sparse set of 12 stationary cameras arranged in two rows, producing images of resolution 1015×1920. Each static scene we use is a single frame taken from one of the dynamic scenes. For the backbone NeRF, we use the static and dynamic versions of K-Planes [25] as implemented in nerfstudio [30]. For each scene, we perform inpainting by replacing a foreground object with a different, text-prompted object that has a different geometry. We demonstrate the effectiveness of our method with qualitative intermediate and final results. In addition, we justify the individual components of our design through ablations and comparisons against our baseline.

4.1. Qualitative results

3D Examples. We show several 3D inpainting examples in Figure 2. For each inpainting task, we show two renderings of the final NeRF from different views to demonstrate multiview consistency. Additionally, we show the first seed image, another preprocessed image, and the RGB and depth maps at three stages: before training, after warmup training, and after convergence. These before-and-after images demonstrate the efficacy of each stage of our method. As shown in Figure 2, preprocessed images that are only roughly consistent are enough to optimize a coarse inpainted NeRF during warmup training, and the geometry (represented by the depth map) converges during this stage. Fine convergence across views is then achieved in the final training stage. All 3D inpainting tasks are trained on a single Nvidia RTX 4090 GPU. Warmup training takes approximately 0.5–1 hour, and the main training stage with IDU takes approximately 1–2 hours.

4D Example. We show a 4D inpainting example in Figure 3 to demonstrate that our method has the potential to generalize to dynamic NeRFs. In this example, we remove the foreground object from the video of the seed view using E2FGVI [11], a flow-based method optimized through feature propagation and content hallucination. To transfer motion to the generated object, we first track key points, then estimate a rigid transformation between the tracked key points, and finally propagate the pixels along that transformation. The dynamic scene consists of 16 frames, and the first frame includes the first seed image. As shown in Figure 3, we obtain overall convergence on the generated object, with correct motion in all illustrated frames.
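The motion-transfer step above (estimate a rigid transformation from tracked key points, then move pixels along it) can be illustrated with a minimal sketch. The text does not say whether the transformation is estimated in 2D image space or in 3D, nor which estimator is used; the sketch below assumes 2D key points and a standard least-squares fit (the Kabsch/Procrustes solution via SVD). The function names `estimate_rigid_transform` and `propagate_points` are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Least-squares rigid fit (Kabsch): find R, t with dst ≈ src @ R.T + t.

    src, dst: (N, d) arrays of corresponding key points in two frames.
    Assumption: correspondences come from an external key point tracker.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)

    # Cross-covariance of the centered point sets.
    H = (src - src_mean).T @ (dst - dst_mean)
    U, _, Vt = np.linalg.svd(H)

    # Guard against a reflection so that R is a proper rotation (det = +1).
    det = np.linalg.det(Vt.T @ U.T)
    D = np.eye(src.shape[1])
    D[-1, -1] = 1.0 if det >= 0 else -1.0

    R = Vt.T @ D @ U.T
    t = dst_mean - R @ src_mean
    return R, t


def propagate_points(points, R, t):
    """Move pixel coordinates (N, d) along the estimated rigid transform."""
    return np.asarray(points, dtype=float) @ R.T + t


# Usage: synthetic key points rotated by 30 degrees and shifted.
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
t_true = np.array([5.0, -2.0])

keypoints_frame0 = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 5.0], [0.0, 5.0]])
keypoints_frame1 = keypoints_frame0 @ R_true.T + t_true

R_est, t_est = estimate_rigid_transform(keypoints_frame0, keypoints_frame1)
object_pixels = np.array([[2.0, 1.0], [7.5, 4.0]])
moved_pixels = propagate_points(object_pixels, R_est, t_est)
```

In a pipeline like the one described, the estimated transform would be applied to the pixel coordinates of the generated object in the seed frame to place it in each later frame; how occlusions and resampling are handled is not specified in the text and is left out of this sketch.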
