


Zhejiang University proposes new SOTA method SIFU: reconstructing a high-quality 3D human body model from a single picture
In fields such as AR, VR, 3D printing, scene construction, and film production, high-quality 3D models of clothed humans are very important.
Traditional methods of creating such models take a lot of time and require professional equipment and trained personnel.
In daily life, by contrast, what we usually have are photos taken with a mobile phone camera or portrait photos found on the web.
A method that can accurately reconstruct a 3D human model from a single image can therefore significantly reduce costs and simplify independent creation.
Comparison of the technical route of previous methods (left) and this method (right)
Previous deep learning models for 3D human body reconstruction usually work in three steps: extracting 2D features from the image, transferring the 2D features to 3D space, and using the 3D features for human body reconstruction.
However, these methods often fail to introduce a human body prior when converting 2D features into 3D space, which leads to insufficient feature extraction and various defects in the final reconstruction.
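To make the second step concrete, here is a minimal sketch of how a 2D feature can be attached to a 3D query point by projecting the point onto the image and sampling the feature map there. This is a common pattern in pixel-aligned implicit function methods and is assumed here for illustration; it is not taken from the SIFU code. The function names (`project_orthographic`, `bilinear_sample`, `lift_feature`), the orthographic camera, and the tiny 2x2 feature map are all hypothetical. Note that nothing in this step knows anything about the human body, which is the gap the article points out.

```python
import math

def project_orthographic(point):
    """Drop the depth axis: (x, y, z) in [-1, 1]^3 -> (x, y) on the image plane."""
    x, y, _z = point
    return x, y

def bilinear_sample(feature_map, x, y):
    """Sample a 2D grid (rows of floats) at normalized coordinates in [-1, 1]."""
    height = len(feature_map)
    width = len(feature_map[0])
    # Map [-1, 1] to continuous pixel coordinates.
    px = (x + 1.0) * 0.5 * (width - 1)
    py = (y + 1.0) * 0.5 * (height - 1)
    # Clamp so points outside the image reuse the border value.
    px = min(max(px, 0.0), width - 1)
    py = min(max(py, 0.0), height - 1)
    x0, y0 = int(math.floor(px)), int(math.floor(py))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    wx, wy = px - x0, py - y0
    top = feature_map[y0][x0] * (1 - wx) + feature_map[y0][x1] * wx
    bottom = feature_map[y1][x0] * (1 - wx) + feature_map[y1][x1] * wx
    return top * (1 - wy) + bottom * wy

def lift_feature(feature_map, point):
    """Attach a sampled 2D feature and the point's depth to a 3D query point."""
    x, y = project_orthographic(point)
    return (bilinear_sample(feature_map, x, y), point[2])

# Hypothetical single-channel 2x2 feature map.
feature_map = [[0.0, 1.0],
               [2.0, 3.0]]
center_feature = lift_feature(feature_map, (0.0, 0.0, 0.5))   # (1.5, 0.5)
corner_feature = lift_feature(feature_map, (1.0, 1.0, -0.2))  # (3.0, -0.2)
```

In a real model the sampled feature would have many channels and would be fed, together with the depth, to a network that predicts occupancy or color for the query point.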
Comparison of the reconstruction effect of SIFU and other SOTA models
In addition, in the texture prediction stage, previous models relied only on knowledge learned from the training set and lacked prior knowledge of the real world, which often resulted in poor texture prediction in invisible areas.
SIFU introduces prior knowledge in the texture prediction stage to enhance the texture of invisible areas (such as the back).
To address this, researchers from Zhejiang University's ReLER Laboratory proposed the SIFU model, which relies on a side-view conditioned implicit function to reconstruct a 3D human body model from a single image.
Paper address: https://arxiv.org/abs/2312.06704
Project address: https://github.com/River-Zhang/SIFU
The model improves geometric reconstruction by introducing the side view of the human body as a prior condition when converting 2D features into 3D space. It also introduces a pre-trained diffusion model in the texture optimization stage to address poor textures in invisible areas.
Model structure
Overview of the model pipeline
The model runs in two stages. The first stage uses the side-view conditioned implicit function to reconstruct the geometry (mesh) and a coarse texture of the human body. The second stage uses a pre-trained diffusion model to refine the texture.
In the first stage, the authors designed a Side-view Decoupling Transformer. After a global encoder extracts 2D features, the decoder introduces the side view of the human body prior model (SMPL/SMPL-X).
In this way, prior knowledge of the human body is incorporated when 2D features are converted into 3D space, which gives the model a better reconstruction result.
In the second stage, the authors propose a 3D Consistent Texture Refinement process. First, the invisible areas of the human body (sides and back) are differentiably rendered into a set of images with continuous viewing angles. Then, with the help of a diffusion model that has learned prior knowledge from massive data, the coarse texture images are edited consistently to obtain more refined results. Finally, the texture map of the 3D model is optimized by computing a loss between the images before and after refinement.
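The last step of the refinement, optimizing the texture map from a loss between rendered and refined images, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "renderer" simply reads the texels a view can see, the refined images are given as plain numbers in place of diffusion model output, and the names `render_view` and `refine_texture` are hypothetical. The real method uses a differentiable renderer and an image-space loss.

```python
def render_view(texture, visible_texels):
    """Toy renderer: a view is just the list of texel values it can see."""
    return [texture[i] for i in visible_texels]

def refine_texture(texture, views, refined_images, lr=0.5, steps=50):
    """Gradient descent on the texels so that renders match the refined images."""
    texture = list(texture)
    for _ in range(steps):
        grad = [0.0] * len(texture)
        for visible_texels, target in zip(views, refined_images):
            rendered = render_view(texture, visible_texels)
            for texel, value, goal in zip(visible_texels, rendered, target):
                # Gradient of 0.5 * (value - goal)^2 with respect to the texel.
                grad[texel] += value - goal
        texture = [t - lr * g for t, g in zip(texture, grad)]
    return texture

# Hypothetical coarse texture with four texels and two overlapping views.
coarse_texture = [0.2, 0.2, 0.2, 0.2]
views = [[0, 1], [1, 2]]                   # texel 1 is seen by both views
refined_images = [[0.8, 0.6], [0.6, 0.4]]  # stand-in for diffusion-edited images
refined_texture = refine_texture(coarse_texture, views, refined_images)
```

Texel 1 is visible in both views, so it only converges cleanly because the two refined images agree on its value. This is why the edits need to be consistent across viewing angles. Texel 3 is not seen by any view and keeps its coarse value.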
Experimental part
Higher reconstruction accuracy
In the experiments, the authors tested their model on a comprehensive and diverse collection of test sets, including CAPE-NFP, CAPE-FP and THuman2.0, and compared it with previous SOTA single-image human reconstruction models published at major conferences. In quantitative tests, SIFU showed the best results in both geometric reconstruction and texture reconstruction.
Quantitative evaluation of geometric reconstruction accuracy
Quantitative evaluation of texture reconstruction effect
Use public pictures on the Internet as input to demonstrate qualitative effects
Stronger robustness
When previous models are applied to data outside the training set, the estimated human body prior model (SMPL/SMPL-X) is often not accurate enough, so the reconstruction results differ greatly from the input image, which makes practical application difficult.
The authors therefore specifically tested the robustness of the model. They added perturbations to the ground-truth prior model parameters to shift the pose, simulating real-world cases where the SMPL-X estimate is inaccurate, and then evaluated reconstruction accuracy. The results show that SIFU still has the best reconstruction accuracy in this case.
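The perturbation step of such a robustness test can be sketched as follows. The article does not say what kind of noise or what magnitude was used, so the zero-mean Gaussian noise, the `sigma` value, the six-element pose vector, and the function names `perturb_pose` and `mean_abs_shift` are all assumptions chosen for illustration.

```python
import random

def perturb_pose(pose, sigma, seed=None):
    """Return a copy of the pose parameters with Gaussian noise added to each."""
    rng = random.Random(seed)  # private generator keeps the test reproducible
    return [angle + rng.gauss(0.0, sigma) for angle in pose]

def mean_abs_shift(pose, noisy_pose):
    """Average absolute change per parameter, a rough measure of the perturbation."""
    return sum(abs(a - b) for a, b in zip(pose, noisy_pose)) / len(pose)

# Hypothetical ground-truth pose parameters (radians).
ground_truth_pose = [0.0, 0.1, -0.3, 0.25, 0.0, 0.6]
noisy_pose = perturb_pose(ground_truth_pose, sigma=0.05, seed=0)
unchanged_pose = perturb_pose(ground_truth_pose, sigma=0.0, seed=0)
shift = mean_abs_shift(ground_truth_pose, noisy_pose)
```

Each reconstruction method would then be run with `noisy_pose` in place of the ground-truth prior and scored against the ground-truth geometry, so that the comparison reflects how much each method depends on an accurate prior.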
Evaluation of the model's robustness when the human body prior model contains errors
On real-world pictures, SIFU still reconstructs well when the estimated human body prior model is inaccurate
Broader application scenarios
The high-precision and high-quality reconstruction effect of the SIFU model makes it suitable for a variety of application scenarios, including 3D printing, scene construction, texture editing, etc.
A 3D-printed human body model reconstructed by SIFU
SIFU used for 3D scene construction
With the help of public motion sequence data, the model reconstructed by SIFU can be animated
Summary
This article proposes a side-view conditioned implicit function and a 3D-consistent texture editing method. They make up for the lack of prior knowledge in previous work when converting 2D features into 3D space and when predicting textures, greatly improve the accuracy and quality of single-image human body reconstruction, give the model significant advantages in real-world applications, and offer new ideas for future research in this field.
Reference:
https://arxiv.org/abs/2312.06704