Darting facial features, open mouths, wide stares, raised eyebrows: AI can imitate them all perfectly, making video scams hard to guard against
Such powerful AI imitation is genuinely hard to guard against. Has AI development really reached this level?
The moment you contort your face, the exact same expression is reproduced on screen. Staring, raised eyebrows, pouting: no matter how exaggerated the expression, it is imitated spot-on.
Raise the difficulty: lift the eyebrows higher, open the eyes wider, even twist the mouth, and the virtual avatar still reproduces the expression perfectly.
When you adjust the parameters on the left, the virtual avatar on the right changes its expression accordingly.
In a close-up of the mouth and eyes, one cannot call the imitation merely close; one can only call it identical (far right).
The research comes from the Technical University of Munich and other institutions, whose team proposes GaussianAvatars, a method for creating photorealistic head avatars with full control over expression, pose, and viewpoint.
In computer vision and graphics, creating a virtual head that dynamically represents a human has long been a challenging problem. Capturing fine detail such as wrinkles and hair is especially hard under extreme facial expressions, and the generated avatars often suffer from visual artifacts.
Over the past few years, Neural Radiance Fields (NeRF) and its variants have achieved impressive results in reconstructing static scenes from multi-view observations. Subsequent research extended these methods to dynamic scene modeling tailored to humans. A drawback, however, is their lack of controllability, which keeps them from adapting well to novel poses and expressions.
The recently emerged 3D Gaussian Splatting method achieves higher rendering quality than NeRF for real-time novel-view synthesis. However, it does not support animating the reconstructed output.
This paper proposes GaussianAvatars, a dynamic 3D head representation based on 3D Gaussian splats.
Specifically, given a FLAME mesh (which models the entire head), they initialize a 3D Gaussian at the center of each triangle. When the FLAME mesh is animated, each Gaussian is translated, rotated, and scaled according to its parent triangle. The 3D Gaussians then form a radiance field on top of the mesh, compensating for regions where the mesh is misaligned or fails to reproduce certain visual elements.
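To make the rigging concrete, here is a minimal numpy sketch of the idea: each Gaussian is stored in the local frame of its parent triangle, so translating, rotating, or stretching the triangle carries the Gaussian along. The frame construction and the triangle-size proxy below are illustrative assumptions, not the authors' exact code.

```python
import numpy as np

def triangle_frame(v0, v1, v2):
    """Local frame of one mesh triangle: origin at the centroid, one
    axis along an edge, one along the normal. The paper's exact
    convention may differ; this is an illustrative construction."""
    origin = (v0 + v1 + v2) / 3.0
    e = (v1 - v0) / np.linalg.norm(v1 - v0)           # edge direction
    n = np.cross(v1 - v0, v2 - v0)
    n /= np.linalg.norm(n)                            # triangle normal
    R = np.stack([e, np.cross(n, e), n], axis=1)      # orthonormal basis
    k = np.mean([np.linalg.norm(v1 - v0),             # triangle-size proxy
                 np.linalg.norm(v2 - v1),
                 np.linalg.norm(v0 - v2)])
    return origin, R, k

def gaussian_to_world(mu_local, scale_local, v0, v1, v2):
    """Map a Gaussian stored in triangle-local coordinates into world
    space, so it follows the triangle as the FLAME mesh animates."""
    origin, R, k = triangle_frame(v0, v1, v2)
    mu_world = origin + k * (R @ mu_local)            # translate, rotate, scale
    scale_world = k * scale_local                     # size tracks the triangle
    return mu_world, scale_world

# Toy usage: a Gaussian at the centroid of a unit triangle.
v0, v1, v2 = np.array([0., 0, 0]), np.array([1., 0, 0]), np.array([0., 1, 0])
mu, s = gaussian_to_world(np.zeros(3), np.full(3, 0.05), v0, v1, v2)
```

Because the Gaussian's parameters live in the triangle's frame, animating the mesh requires no per-Gaussian optimization at test time; the transform above is all that is needed.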
To keep the avatar highly realistic, the paper adopts a binding-inheritance strategy. It also studies how to balance realism against stability when animating the avatar with novel expressions and poses. The results show that, compared with existing work, GaussianAvatars performs well both in novel-view rendering and in reenacting a driving video.
As shown in Figure 2 below, the input to GaussianAvatars is a multi-view video recording of a human head. For each time step, GaussianAvatars uses a photometric head tracker to fit FLAME parameters to the multi-view observations under known camera parameters.
The FLAME mesh's vertex positions vary over time while its topology stays fixed, which lets the team create consistent connections between mesh triangles and 3D Gaussian splats. The splats are rendered into images with a differentiable tile rasterizer, and a photorealistic head avatar is then learned under the supervision of the real images.
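The optimization follows the usual pattern of differentiable rendering: render, compare photometrically against the recorded frame, and backpropagate into the Gaussian parameters. The PyTorch sketch below substitutes a trivial stand-in for the tile rasterizer so it runs; the single L1 term mirrors common Gaussian-splatting pipelines, and whether GaussianAvatars adds further loss terms is not shown here.

```python
import torch
import torch.nn.functional as F

# Stand-ins so the loop runs: in the real pipeline, `params` would be
# per-Gaussian positions, rotations, scales, opacities, and colors, and
# `render` would be the differentiable tile rasterizer.
H, W = 64, 64
params = torch.zeros(3, H, W, requires_grad=True)
target = torch.rand(3, H, W)                  # stand-in ground-truth frame

def render(p):
    return torch.sigmoid(p)                   # keeps pixel values in [0, 1]

opt = torch.optim.Adam([params], lr=1e-2)
for step in range(500):
    opt.zero_grad()
    loss = F.l1_loss(render(params), target)  # photometric supervision
    loss.backward()                           # gradients flow to the splats
    opt.step()
```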
To obtain the best quality, the Gaussian splats must be densified and pruned through a set of adaptive density control operations, as in static-scene reconstruction. To support this, the team designed a binding-inheritance strategy that keeps newly created Gaussians bound to the FLAME mesh without destroying the triangle-splat connection.
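Binding inheritance can be sketched simply: whenever density control clones or splits a Gaussian, the new Gaussian copies the parent-triangle index of the one it came from, so it keeps moving with the mesh. The array layout and the 1.6 shrink factor below follow the original 3D Gaussian Splatting recipe and are illustrative, not the authors' code.

```python
import numpy as np

def split_with_inheritance(means, scales, parent_tri, idx, rng):
    """Split Gaussian `idx` into two smaller ones; both children stay
    bound to the same FLAME triangle (binding inheritance)."""
    offset = rng.normal(scale=scales[idx])               # sample inside the old splat
    child_mean = means[idx] + offset
    child_scale = scales[idx] / 1.6                      # shrink factor from 3DGS
    means = np.vstack([means, child_mean])
    scales = np.vstack([scales, child_scale])
    parent_tri = np.append(parent_tri, parent_tri[idx])  # inherit the binding
    scales[idx] = child_scale                            # shrink the original too
    return means, scales, parent_tri

# Toy usage: one Gaussian bound to (hypothetical) triangle 42 becomes two.
rng = np.random.default_rng(0)
means, scales, parent_tri = np.zeros((1, 3)), np.full((1, 3), 0.1), np.array([42])
means, scales, parent_tri = split_with_inheritance(means, scales, parent_tri, 0, rng)
```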
The study evaluates reconstruction quality via novel-view synthesis and animation fidelity via self-reenactment. Figure 3 below shows a qualitative comparison of the methods. For novel-view synthesis, all methods produce reasonable renderings; on closer inspection, however, PointAvatar's results show dot-shaped artifacts caused by its fixed point size. GaussianAvatars alleviates this problem through the anisotropic scaling of its 3D Gaussians.
The quantitative comparison in Table 1 supports similar conclusions. Compared with the other methods, GaussianAvatars performs well in novel-view synthesis, is also strong in self-reenactment, and significantly reduces the perceptual difference measured by LPIPS. Note that self-reenactment relies on FLAME mesh tracking, so the renderings may not be perfectly aligned with the target images.
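For reference, LPIPS is typically computed with the `lpips` Python package; here is a minimal usage sketch. The AlexNet backbone and the image resolution are assumptions for illustration, not necessarily what the paper's evaluation used.

```python
import torch
import lpips

# LPIPS expects (N, 3, H, W) tensors scaled to [-1, 1].
loss_fn = lpips.LPIPS(net='alex')               # common backbone choice
rendered = torch.rand(1, 3, 256, 256) * 2 - 1   # stand-in rendered frame
target = torch.rand(1, 3, 256, 256) * 2 - 1     # stand-in ground-truth frame
d = loss_fn(rendered, target)
print(float(d))                                 # lower means perceptually closer
```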
To test the avatar's animation performance in the real world, the study ran the cross-identity reenactment experiment shown in Figure 4. The results show that the avatar accurately reproduces the source actor's blinks and mouth movements, exhibiting lively, complex dynamics such as wrinkles.
To verify the effectiveness of the method's components, the study also ran ablation experiments; the results are shown below.