New NeRF research: removing unwanted objects from 3D scenes without a trace, accurate down to the hair

Neural Radiance Fields (NeRFs) have become a popular approach to novel view synthesis. Although NeRF is rapidly being generalized to a wider range of applications and datasets, directly editing a scene modeled by a NeRF remains a major challenge. One important task is removing unwanted objects from a 3D scene while keeping the result consistent with the surrounding scene; this task is called 3D inpainting. In 3D, solutions must be consistent across multiple views and geometrically valid.

In this paper, researchers from Samsung, the University of Toronto, and other institutions propose a new 3D inpainting method to address these challenges. Given a small set of posed images and sparse annotations on a single input image, the proposed framework first quickly obtains a 3D segmentation mask of the target object. Using this mask, it then applies a perceptual-optimization-based method that leverages learned 2D image inpainters, lifting their outputs into 3D space while enforcing view consistency.

The research also introduces a new benchmark for evaluating 3D scene inpainting methods, built on a challenging real-world scene dataset. In particular, the dataset contains views of the same scene with and without the target object, enabling more principled benchmarking of inpainting in 3D.


  • Paper: https://arxiv.org/pdf/2211.12254.pdf
  • Project page: https://spinnerf3d.github.io/

The following results show that after objects are removed, the scene remains consistent with its surroundings:

[Figure: object-removal results]

A comparison between this method and others: the other methods produce obvious artifacts, while artifacts from this method are barely noticeable:

[Figure: comparison with other methods]

Method introduction

The authors use an integrated approach to tackle the challenges of 3D scene editing. The method takes multi-view images of a scene, extracts a 3D mask from user input, and fits a NeRF to the masked images so that the target object is replaced with plausible 3D appearance and geometry. Existing interactive 2D segmentation methods do not take 3D into account, and current NeRF-based methods cannot obtain good results from sparse annotations or achieve sufficient accuracy. While some current NeRF-based algorithms allow object removal, they do not attempt to fill in the newly revealed parts of space. To the best of the authors' knowledge, this work is the first to handle both interactive multi-view segmentation and complete 3D inpainting in a single framework.

The researchers use off-the-shelf, 3D-agnostic models for segmentation and inpainting, and transfer their outputs to 3D space in a view-consistent manner. Building on work in 2D interactive segmentation, the proposed model starts from a small number of points the user clicks on the target object in one image. From these, the algorithm initializes per-view masks with a video-based segmentation model and trains a coherent 3D segmentation by fitting a semantic NeRF to those masks. A pre-trained 2D inpainter is then applied to the multi-view image set, and a NeRF is fit to reconstruct the inpainted scene, using a perceptual loss to tolerate inconsistencies across the 2D inpaintings and inpainted depth maps to regularize the geometry of the masked region. Overall, the method provides a complete pipeline, from object selection to novel view synthesis of the inpainted scene, in a unified framework with minimal burden on the user, as shown in the figure below.
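The workflow just described can be summarized with a high-level sketch. Every function body below is a hypothetical stand-in (the real system plugs an off-the-shelf model into each step, and none of these names come from the authors' code); only the orchestration mirrors the described pipeline.

```python
import numpy as np

def interactive_segmentation(view, points):
    """Sparse user clicks on one source view -> initial 2D object mask (stub)."""
    return np.zeros(view.shape[:2], dtype=bool)

def video_segmentation(views, source_mask):
    """Propagate the source mask to all views as if they were video frames (stub)."""
    return [source_mask.copy() for _ in views]

def fit_semantic_nerf(views, masks):
    """Fit a NeRF with an extra objectness head -> 3D-consistent masks (stub)."""
    return masks

def inpaint_2d(view, mask):
    """Off-the-shelf 2D inpainter (LaMa in the paper) applied per view (stub)."""
    return view.copy()

def fit_inpainted_nerf(views, inpainted_views, masks):
    """Fit the final NeRF with perceptual + depth supervision (stub)."""
    return inpainted_views

def pipeline(views, user_points):
    src_mask = interactive_segmentation(views[0], user_points)
    masks = video_segmentation(views, src_mask)       # initial per-view guesses
    masks3d = fit_semantic_nerf(views, masks)         # multi-view segmentation
    inpainted = [inpaint_2d(v, m) for v, m in zip(views, masks3d)]
    return fit_inpainted_nerf(views, inpainted, masks3d)
```

The value of the sketch is the data flow: user clicks become a single-view mask, the mask becomes multi-view masks, and only then does 2D inpainting and the final NeRF fit happen.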

[Figure: overview of the full pipeline, from user clicks to the inpainted NeRF]

In summary, the contributions of this work are as follows:

  • A complete 3D scene manipulation pipeline, starting from interactive object selection and ending with a 3D-inpainted NeRF scene;
  • A 2D segmentation model extended to the multi-view setting, able to recover 3D-consistent masks from sparse annotations;
  • A new optimization-based 3D inpainting formulation that leverages 2D inpainters while ensuring view consistency and perceptual plausibility;
  • A new dataset for evaluating 3D editing tasks, including ground-truth views of the scenes after manipulation.

Turning to the method, the study first describes how to initialize a rough 3D mask from single-view annotations. Denote the annotated source view by I_1. The sparse user clicks and the source view are fed to an interactive segmentation model, which estimates the initial source object mask M̂_1. The training views are then treated as a video sequence and, together with M̂_1, are given to a video instance segmentation model V to compute {M̂_i}, where M̂_i is the initial guess of the object mask for view I_i. These initial masks are often inaccurate near boundaries, because the training views are not actually adjacent video frames and video segmentation models are not 3D-aware.

[Figure: the semantic NeRF used for multi-view segmentation]

The multi-view segmentation module takes the input RGB images, the corresponding camera intrinsics and extrinsics, and the initial masks, and trains a semantic NeRF. The diagram above depicts the network used in the semantic NeRF: for a point x and a view direction d, in addition to the density σ and color c, it returns a pre-sigmoid objectness logit s(x). For fast convergence, the researchers use instant-NGP as the NeRF architecture. The expected objectness of a ray r is obtained by volume-rendering the logits of the points along r, instead of their colors, with the density-derived weights:

[Equation: the expected objectness of ray r, obtained by applying the standard NeRF rendering weights to the logits s(x_i) of the samples along r instead of their colors]
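Concretely, "rendering logits instead of colors" reuses the usual NeRF quadrature weights. The following is a minimal NumPy sketch of that idea, not the authors' code; the sample spacings and the weight computation follow the standard NeRF formulation.

```python
import numpy as np

def render_logit(sigma, logits, deltas):
    """Volume-render per-point objectness logits along one ray.

    sigma:  (N,) densities at the sampled points
    logits: (N,) pre-sigmoid objectness logits s(x_i)
    deltas: (N,) distances between consecutive samples

    Uses the standard NeRF weights w_i = T_i * (1 - exp(-sigma_i * delta_i)),
    where T_i is the transmittance accumulated before sample i.
    """
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance T_i = prod_{j<i} (1 - alpha_j)
    T = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))
    weights = T * alpha
    return np.sum(weights * logits)
```

For an opaque ray the weights sum to nearly 1, so the rendered value approaches the logit of the surface point; for an empty ray it is 0.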

The classification loss is then used for supervision:

[Equation: a binary cross-entropy loss between the rendered objectness and the initial mask labels]

The overall loss used to supervise the NeRF-based multi-view segmentation model is:

[Equation: the overall loss, combining the classification loss on the objectness with the usual NeRF photometric reconstruction loss]
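The combined supervision can be sketched as below. This is a hypothetical NumPy sketch assuming a simple weighted sum of the photometric MSE and a per-ray binary cross-entropy term; the weight `lam` is an assumption for illustration, not the paper's value.

```python
import numpy as np

def segmentation_loss(rendered_logits, mask_labels, rendered_rgb, gt_rgb, lam=0.1):
    """NeRF photometric MSE plus a per-ray BCE on the rendered objectness.

    rendered_logits: (R,) objectness logits rendered along R rays
    mask_labels:     (R,) initial mask labels in {0, 1} for those rays
    rendered_rgb:    (R, 3) rendered colors; gt_rgb: (R, 3) observed colors
    lam:             hypothetical weight balancing the two terms
    """
    p = 1.0 / (1.0 + np.exp(-rendered_logits))   # sigmoid of the rendered logit
    eps = 1e-7                                   # numerical guard for the logs
    bce = -np.mean(mask_labels * np.log(p + eps)
                   + (1.0 - mask_labels) * np.log(1.0 - p + eps))
    mse = np.mean((rendered_rgb - gt_rgb) ** 2)
    return mse + lam * bce
```

A model that renders the right colors and confident, correct objectness drives both terms toward zero; flipping the objectness predictions inflates the BCE term.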

Finally, a two-stage optimization is used to further refine the masks: after the initial 3D mask is obtained, it is rendered from the training views and used to supervise a second multi-view segmentation model as the initial guess (instead of the video segmentation output).

[Figure: overview of the view-consistent inpainting stage]

The figure above gives an overview of view-consistent inpainting. Since the lack of data prevents directly training a 3D inpainting model, this study leverages existing 2D inpainting models to obtain depth and appearance priors, which then supervise the NeRF fit to the completed scene. This inpainted NeRF is trained with the following loss:

[Equation: the inpainted-NeRF training loss, combining a perceptual term on the masked region with MSE elsewhere]

The study proposes an inpainting method with view consistency whose input is RGB images. First, the image-and-mask pairs are passed to an image inpainter to obtain inpainted RGB images. Since each view is inpainted independently, directly using the inpainted views to supervise the NeRF reconstruction suffers from inconsistencies across views. Instead of using mean squared error (MSE) as the loss over the masked region, the researchers therefore propose to use the perceptual loss LPIPS to optimize the masked part of the image, while still using MSE on the unmasked part. This loss is computed as follows:

[Equation: LPIPS between the rendered and inpainted images over the masked region, plus MSE over the unmasked pixels]
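The split between the two regions can be sketched as follows. LPIPS is a learned perceptual metric computed from deep network features; to keep this sketch self-contained, a toy gradient-based "feature" stands in for it, so the perceptual term here is only an illustration of the masked/unmasked split, not the real LPIPS.

```python
import numpy as np

def grad_features(img):
    """Toy stand-in for LPIPS features: image gradients (not the real LPIPS net)."""
    return np.diff(img, axis=0), np.diff(img, axis=1)

def inpainting_loss(pred, target, mask, lam=1.0):
    """MSE on unmasked pixels + a perceptual distance on the masked region.

    pred, target: (H, W) rendered and inpainted images
    mask:         (H, W) 1 inside the inpainted (object) region, 0 outside
    lam:          hypothetical weight on the perceptual term
    """
    unmasked = mask == 0
    mse = np.mean((pred[unmasked] - target[unmasked]) ** 2)
    # Perceptual term over the masked region -- LPIPS in the paper, toy features here
    pg = grad_features(pred * mask)
    tg = grad_features(target * mask)
    perc = sum(np.mean((a - b) ** 2) for a, b in zip(pg, tg))
    return mse + lam * perc
```

Because the masked region is compared through features rather than per-pixel errors, the NeRF is allowed to settle on one plausible completion instead of averaging the inconsistent inpaintings into blur.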

Even with the perceptual loss, differences between the independently inpainted views can incorrectly drive the model toward low-quality geometry (for example, "blurry" geometry may form near the camera to explain the conflicting information from each view). The researchers therefore use inpainted depth maps as additional guidance for the NeRF model, and detach the weights when computing the perceptual loss so that it fits only the color of the scene. To do this, a NeRF optimized on the images still containing the unwanted object is used to render depth maps for the training views, computed by substituting each point's distance to the camera for its color:

[Equation: the rendered depth of a ray, obtained by volume-rendering the sample distances with the same weights used for color]

The rendered depths are then fed to an inpainter model to obtain inpainted depth maps. The researchers found that using LaMa for depth, as for RGB, yields sufficiently high-quality results. This NeRF can be the same model used for multi-view segmentation; if the masks come from another source, such as human annotations, a new NeRF is fit to the scene. The inpainted depth maps are then used to supervise the geometry of the inpainted NeRF, by penalizing the ℓ2 distance between its rendered depths and the inpainted depths:


[Equation: the depth loss, the ℓ2 distance between rendered and inpainted depths, averaged over rays]
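The depth rendering and the depth supervision described above can be sketched together. This is a minimal NumPy illustration under the same standard-NeRF-weight assumption as before, not the authors' implementation.

```python
import numpy as np

def render_depth(sigma, t_vals, deltas):
    """Expected ray termination distance: volume-render the sample
    distances t_i with the usual NeRF weights instead of colors."""
    alpha = 1.0 - np.exp(-sigma * deltas)
    T = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))
    return np.sum(T * alpha * t_vals)

def depth_loss(rendered_depths, inpainted_depths):
    """l2 distance between rendered and inpainted depths, averaged over rays."""
    d = np.asarray(rendered_depths) - np.asarray(inpainted_depths)
    return np.mean(d ** 2)
```

A ray that hits a single opaque surface renders a depth very close to that surface's distance, which is exactly what the inpainted depth maps then pull the inpainted NeRF's geometry toward.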

Experimental results

Multi-view segmentation: the MVSeg model is first evaluated without any inpainting. This experiment assumes that sparse image points have already been given to an off-the-shelf interactive segmentation model and that the source mask is available, so the task is to transfer the source mask to the other views. The table below shows that the new model outperforms both the 2D (3D-inconsistent) and 3D baselines. In addition, the proposed two-stage optimization further improves the resulting masks.

[Table: multi-view segmentation comparison with 2D and 3D baselines]

Qualitatively, the figure below compares the researchers' segmentation model with the outputs of NVOS and several video segmentation methods. Compared with the coarse edges of the 3D video segmentation models, the new model reduces noise and improves view consistency. Although NVOS uses scribbles rather than the sparse points used here, MVSeg is visually superior to NVOS. Since the NVOS codebase is not available, the researchers reproduced its published qualitative results (see the supplementary material for more examples).

[Figure: qualitative segmentation comparison with NVOS and video segmentation methods]

The following table compares the multi-view inpainting method with the baselines. Overall, the newly proposed method significantly outperforms the other 2D and 3D inpainting methods. The table also shows that removing the geometric guidance degrades the quality of the inpainted scene.

[Table: quantitative inpainting comparison and ablation of the depth guidance]

Qualitative results are shown in Figures 6 and 7. Figure 6 shows that the method reconstructs view-consistent scenes with detailed textures, including coherent views of both glossy and matte surfaces. Figure 7 shows that the perceptual loss relaxes the exact-reconstruction constraint in the masked region, preventing the blur that arises when all images are used with a pixel-wise loss, while also avoiding the artifacts caused by single-view supervision.

[Figure 6: view-consistent inpainting results]

[Figure 7: effect of the perceptual loss]



Statement: This article is reproduced from 51CTO.COM.