Produced by Peking University: The latest SOTA with texture quality and multi-view consistency, achieving 3D conversion of one image in 2 minutes

It only takes two minutes to convert pictures into 3D!

The results also come with high texture quality and strong consistency across multiple views.

Whatever the subject, the input is a single-view image like this:

Two minutes later, the 3D version is done:

△Top: Repaint123 (NeRF); bottom: Repaint123 (GS)

The new method is called Repaint123. Its core idea is to combine the powerful image generation capability of a 2D diffusion model with the texture alignment capability of a repainting strategy, generating high-quality images that are consistent across multiple views.

In addition, this research also introduces a visibility-aware adaptive repaint intensity method for overlapping areas.

Repaint123 solves the problems of previous methods such as large multi-view deviation, texture degradation, and slow generation in one fell swoop.

The project code has not yet been published on GitHub, but 100 people have already come to bookmark it:

What does Repaint123 look like?

Previously, image-to-3D methods usually relied on Score Distillation Sampling (SDS). Although the results are impressive, they suffer from issues such as multi-view inconsistency, over-saturation, over-smoothed textures, and slow generation.

△From top to bottom: input, Zero123-XL, Magic123, DreamGaussian

To solve these problems, researchers from Peking University, Pengcheng Laboratory, National University of Singapore, and Wuhan University proposed Repaint123.

In general, Repaint123 has the following contributions:

(1) Repaint123 takes a comprehensive view of the controllable repainting process for image-to-3D generation, and is able to generate high-quality image sequences that are consistent across multiple views.

(2) Repaint123 proposes a simple baseline for single-view 3D generation.

In the coarse stage, it uses Zero123 as the 3D prior, combined with the SDS loss, to quickly generate a coarse 3D model (in only 1 minute) by optimizing Gaussian Splatting geometry.

In the fine stage, it uses Stable Diffusion as the 2D prior, combined with a mean squared error (MSE) loss, to generate a high-quality 3D model by quickly refining the mesh texture (also in only 1 minute).

(3) Extensive experiments demonstrate the effectiveness of Repaint123. It can generate high-quality 3D content that matches 2D generation quality from a single image in just 2 minutes.

△Fast single-view 3D generation that is both 3D-consistent and high quality

Let’s look at the specific methods.

Repaint123 focuses on optimizing the mesh refinement stage. Its improvements cover two aspects: generating a high-quality, multi-view consistent image sequence, and achieving fast, high-quality 3D reconstruction.

1. Generating a high-quality image sequence with multi-view consistency

Generating a high-quality image sequence with multi-view consistency is divided into the following three parts:

△Multi-view consistent image generation process

DDIM inversion

To retain the 3D-consistent low-frequency texture information generated in the coarse stage, the authors use DDIM inversion to map the image into a deterministic latent, laying the foundation for the subsequent denoising process to generate faithful and consistent images.
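The idea of deterministic inversion followed by denoising can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `toy_eps_model` is a hypothetical stand-in for a trained noise-prediction network, and the `alphas_bar` schedule is an assumed example. With a real network the round trip is only approximate; the toy predictor here makes it exact so the mechanics are easy to see.

```python
import numpy as np

def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM update from cumulative alpha a_from to a_to."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def ddim_invert(x0, eps_model, alphas_bar):
    """Walk a clean image toward noise, one deterministic step at a time."""
    x = x0
    for t in range(len(alphas_bar) - 1):
        x = ddim_step(x, eps_model(x, t), alphas_bar[t], alphas_bar[t + 1])
    return x

def ddim_denoise(x_t, eps_model, alphas_bar):
    """Walk the inverted latent back toward a clean image."""
    x = x_t
    for t in range(len(alphas_bar) - 1, 0, -1):
        x = ddim_step(x, eps_model(x, t), alphas_bar[t], alphas_bar[t - 1])
    return x

rng = np.random.default_rng(0)
image = rng.normal(size=(4, 4))          # stand-in for a coarse-stage render
fixed_eps = rng.normal(size=(4, 4))
toy_eps_model = lambda x, t: fixed_eps   # hypothetical stand-in for a trained network
alphas_bar = np.linspace(1.0, 0.05, 20)  # assumed schedule; index 0 is the clean image

latent = ddim_invert(image, toy_eps_model, alphas_bar)
restored = ddim_denoise(latent, toy_eps_model, alphas_bar)
```

Because no random noise is injected at any step, the same latent always leads back to the same image, which is what makes the inverted latent a useful starting point for controlled denoising.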

Controllable denoising

To control geometric consistency and long-range texture consistency during denoising, the authors introduce ControlNet, using the depth map rendered from the coarse model as a geometric prior, while injecting the attention features of the reference image for texture transfer.

In addition, to enable classifier-free guidance for better image quality, the paper uses CLIP to encode the reference image into an image prompt that guides the denoising network.
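Classifier-free guidance combines a conditional and an unconditional noise prediction. The sketch below shows only that standard combination rule; the function name, the example arrays, and the guidance scale are illustrative assumptions, since the article does not state the values used in the paper.

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, guidance_scale):
    """Push the prediction away from the unconditional one, toward the conditional one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy noise predictions; in practice these come from the denoising network
# run without and with the image prompt.
eps_uncond = np.array([0.0, 1.0])
eps_cond = np.array([1.0, 3.0])
guided = classifier_free_guidance(eps_uncond, eps_cond, guidance_scale=2.0)
```

A scale of 1 returns the conditional prediction unchanged, and larger scales weight the condition more heavily.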

Repainting

Progressive repainting of occluded and overlapping regions: to ensure that the overlapping areas of adjacent images in the sequence are aligned at the pixel level, the authors use a progressive local repainting strategy.

While keeping overlapping areas unchanged, it generates harmonious adjacent areas and gradually extends from the reference view to 360°.
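Keeping the overlapping area fixed while generating the rest can be pictured as mask-based compositing. The following is a minimal sketch under that assumption, with hypothetical names; diffusion-based repainting often applies this kind of blend on latents during denoising, whereas the sketch blends once on plain arrays.

```python
import numpy as np

def composite_repaint(previous, generated, overlap_mask):
    """Keep overlap pixels from the previous view; take the rest from the new generation."""
    mask = overlap_mask.astype(previous.dtype)
    return mask * previous + (1.0 - mask) * generated

previous = np.full((2, 3), 0.2)    # content already fixed by earlier views
generated = np.full((2, 3), 0.9)   # freshly generated content for the new view
overlap_mask = np.array([[1, 1, 0],
                         [1, 0, 0]], dtype=bool)
result = composite_repaint(previous, generated, overlap_mask)
```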

However, as shown in the figure below, the authors found that the overlapping areas also need to be refined: a region previously viewed at an oblique angle has a higher visible resolution when viewed head-on, so more high-frequency information needs to be added.

In addition, the refinement strength is set to 1 - cos θ*, where θ* is the maximum of the angles θ between all previous camera views and the normal vector of the viewed surface, so that overlapping areas are repainted adaptively.

△Relationship between camera view and refinement strength

To choose an appropriate refinement strength that ensures fidelity while improving quality, the authors draw on the projection theorem and the idea of image super-resolution, proposing a simple and direct visibility-aware repainting strategy to refine the overlapping areas.
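The 1 - cos θ rule can be sketched per pixel from a normal map and a viewing direction. This is an illustration only: `refinement_strength` and its inputs are hypothetical, and it evaluates the angle for a single earlier view, leaving out how θ* is selected across several earlier views.

```python
import numpy as np

def refinement_strength(normals, view_dir):
    """Per-pixel 1 - cos(theta) between unit surface normals and the direction to the camera."""
    n = normals / np.linalg.norm(normals, axis=-1, keepdims=True)
    v = view_dir / np.linalg.norm(view_dir)
    cos_theta = np.clip(n @ v, 0.0, 1.0)  # surfaces facing away are treated as fully oblique
    return 1.0 - cos_theta

# Three sample surface normals: facing the camera, tilted by 60 degrees, and edge-on.
normals = np.array([
    [0.0, 0.0, 1.0],
    [0.0, np.sin(np.pi / 3), np.cos(np.pi / 3)],
    [1.0, 0.0, 0.0],
])
view_dir = np.array([0.0, 0.0, 1.0])  # direction from the surface toward the earlier camera
strength = refinement_strength(normals, view_dir)
```

A surface seen head-on gets strength 0 and is left alone, while a surface seen edge-on gets strength 1 and is repainted most heavily.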

2. Fast and high-quality 3D reconstruction

As shown in the figure below, the authors adopt a two-stage approach for fast, high-quality 3D reconstruction.

△Repaint123 two-stage single-view 3D generation framework

First, they use the Gaussian Splatting representation to quickly generate reasonable geometry and a coarse texture.

Then, with the help of the previously generated multi-view consistent, high-quality image sequence, the authors can use a simple mean squared error (MSE) loss for fast 3D texture reconstruction.
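Why a plain MSE loss suffices once the target images agree with each other can be seen in a toy fit. The sketch below is an assumption-heavy illustration, not the paper's pipeline: the "renderer" is the identity (every view sees the whole texture), whereas a real system would render the mesh through a differentiable renderer, and all names are hypothetical.

```python
import numpy as np

def mse_loss(pred, target):
    """Mean squared error between a rendered image and its target."""
    return float(np.mean((pred - target) ** 2))

def refine_texture(texture, targets, lr=2.0, steps=50):
    """Gradient descent on the texture to minimize the average MSE over all target views."""
    tex = texture.copy()
    for _ in range(steps):
        # d/d_tex of mean((tex - t)^2) is 2 * (tex - t) / N, averaged over views
        grad = np.mean([2.0 * (tex - t) / tex.size for t in targets], axis=0)
        tex -= lr * grad
    return tex

texture = np.zeros((4, 4))                                # coarse starting texture
targets = [np.full((4, 4), 0.4), np.full((4, 4), 0.6)]    # toy "multi-view" targets
loss_before = np.mean([mse_loss(texture, t) for t in targets])
refined = refine_texture(texture, targets)
loss_after = np.mean([mse_loss(refined, t) for t in targets])
```

When the targets disagree, as in this toy case, MSE settles on their average, which is the blurring that consistent target images avoid.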

Best in consistency, quality, and speed

The researchers compared multiple methods on the single-view generation task.

△Visual comparison of single-view 3D generation

On the RealFusion15 and Test-alpha datasets, Repaint123 achieved state-of-the-art results in all three respects: consistency, quality, and speed.

The authors also conducted ablation experiments on the effectiveness of each module used in the paper and on the view rotation increment:

They also found that performance peaks when the view interval is 60 degrees, but an overly large view interval reduces the overlapping area and increases the likelihood of the multi-face problem, so 40 degrees can be used as the optimal view interval.

Paper address: https://arxiv.org/pdf/2312.13271.pdf
Code address: https://pku-yuangroup.github.io/repaint123/
Project address: https://pku-yuangroup.github.io/repaint123/

This article is reproduced from 51CTO.COM.