Using machine learning to reconstruct faces in videos
Translator|Cui Hao
Reviser|Sun Shujuan
Opening Chapter
The new system's workflow has to account for occlusion, such as when the subject turns away from the camera. This is also one of the biggest challenges for deepfake software, because FAN landmarks can barely handle these cases and their quality tends to degrade as the face is averted or occluded.
The new system avoids these problems by defining a "contour energy" term that matches the boundary of the 3D face (3DMM) against the boundary of the 2D face (defined by FAN landmarks).
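The idea of an energy that matches a 3D face boundary to 2D landmarks can be illustrated with a minimal sketch. The article does not give the paper's formula, so everything here is an assumption: the function names `project_weak_perspective` and `contour_energy` are hypothetical, the camera is a simple weak-perspective one, and the energy is taken to be a plain sum of squared distances between projected contour vertices and their matching landmarks.

```python
def project_weak_perspective(vertices_3d, scale=1.0, translation=(0.0, 0.0)):
    """Project (x, y, z) vertices to the image plane by dropping z.

    Assumed camera model, chosen only to keep the sketch short.
    """
    tx, ty = translation
    return [(scale * x + tx, scale * y + ty) for x, y, _z in vertices_3d]


def contour_energy(contour_vertices_3d, landmarks_2d, scale=1.0, translation=(0.0, 0.0)):
    """Sum of squared distances between projected 3D contour points and 2D landmarks.

    Assumes the two lists are already in one-to-one correspondence.
    """
    projected = project_weak_perspective(contour_vertices_3d, scale, translation)
    return sum(
        (px - lx) ** 2 + (py - ly) ** 2
        for (px, py), (lx, ly) in zip(projected, landmarks_2d)
    )


# Toy data: three contour vertices on the 3D face model.
contour = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.5), (1.0, 1.0, 0.2)]

# Landmarks that coincide with the projected contour, and a copy shifted by 0.5 in x.
aligned_landmarks = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
shifted_landmarks = [(0.5, 0.0), (1.5, 0.0), (1.5, 1.0)]

energy_aligned = contour_energy(contour, aligned_landmarks)
energy_shifted = contour_energy(contour, shifted_landmarks)
```

In this toy case the energy is zero when the boundaries coincide and grows as they drift apart, which is the property an optimizer would exploit when fitting the 3D face to the detected 2D contour.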
Optimization
The intended application of this system is real-time deformation, such as reshaping a face live in a video chat filter. The framework cannot do this yet, and providing the computing resources needed for "real-time" deformation remains a significant challenge.
According to the paper, for 24fps video the per-frame operations in the pipeline take 16.344 seconds for every second of footage. On top of that, feature estimation and 3D face deformation each incur a one-time cost (321 ms and 160 ms, respectively).
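To see how far these figures are from real time, the arithmetic can be worked through in a few lines. This sketch reads the 16.344-second figure as processing time per second of 24fps footage and treats the 321 ms and 160 ms costs as paid once per clip; both readings are assumptions, and the helper `total_processing_seconds` is hypothetical.

```python
FPS = 24
PIPELINE_SECONDS_PER_SECOND_OF_FOOTAGE = 16.344  # figure quoted from the paper
ONE_TIME_COSTS_MS = (321, 160)  # feature estimation, 3D face deformation

# Average processing time per frame, and the budget a real-time system would have.
per_frame_seconds = PIPELINE_SECONDS_PER_SECOND_OF_FOOTAGE / FPS
realtime_budget_seconds = 1 / FPS

# How many times slower than real time the per-frame pipeline runs.
slowdown_factor = per_frame_seconds / realtime_budget_seconds

# One-time costs, converted to seconds.
one_time_seconds = sum(ONE_TIME_COSTS_MS) / 1000


def total_processing_seconds(clip_seconds):
    """Estimated total time for a clip: one-time costs plus per-frame work."""
    return one_time_seconds + clip_seconds * PIPELINE_SECONDS_PER_SECOND_OF_FOOTAGE


ten_second_clip = total_processing_seconds(10)
```

Under these assumptions each frame takes about 0.681 seconds against a real-time budget of roughly 0.042 seconds, so the pipeline is around 16 times too slow for live use, and a 10-second clip would need close to 164 seconds.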
Optimization is therefore key to reducing latency. Joint optimization across all frames would add significant overhead, while initialization-style optimization (which assumes the speaker's characteristics stay consistent throughout) can lead to anomalies. The authors instead adopted a sparse scheme that calculates coefficients from frames sampled at practical intervals.
Joint optimization is then performed on this subset of frames, resulting in a leaner reconstruction process.
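The sparse strategy can be sketched as two steps: pick frames at a fixed interval, then fit one shared set of coefficients to that subset. The article does not describe the paper's solver, so this example swaps in the simplest possible stand-in: averaging, which is the least-squares solution when the goal is a single coefficient vector closest to all sampled estimates. The names `sample_frame_indices` and `joint_estimate`, the interval of 3, and the toy data are all hypothetical.

```python
def sample_frame_indices(num_frames, interval):
    """Return every `interval`-th frame index, starting from frame 0."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(range(0, num_frames, interval))


def joint_estimate(per_frame_coefficients):
    """Shared coefficients for a set of frames.

    Stand-in for joint optimization: the mean minimizes the summed squared
    deviation from the per-frame estimates.
    """
    count = len(per_frame_coefficients)
    dims = len(per_frame_coefficients[0])
    return [
        sum(coeffs[d] for coeffs in per_frame_coefficients) / count
        for d in range(dims)
    ]


# Hypothetical per-frame estimates for a 10-frame clip, two coefficients each.
# The first coefficient alternates slightly to mimic per-frame noise.
frames = [[1.0 + 0.1 * (i % 2), 2.0] for i in range(10)]

indices = sample_frame_indices(len(frames), interval=3)
shared_coefficients = joint_estimate([frames[i] for i in indices])
```

Here only frames 0, 3, 6 and 9 are used, so the fitting step touches four frames instead of ten. The saving grows with clip length, at the cost of ignoring whatever happens between sampled frames.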
Facial Surface
The warping technique used in this project is adapted from the authors' 2020 work, Deep Shapely Portraits (DSP).
Deep Shapely Portraits, a 2020 submission to ACM Multimedia. The paper was led by researchers from the Zhejiang University-Tencent Joint Laboratory for Game and Intelligent Graphics Innovation Technology.
The authors observed that “we extend this method from reshaping a single image to reshaping an entire image sequence.”
Testing
The paper points out that there is no comparable prior work against which to evaluate the new method. The authors therefore compared frames from their warped video output with static DSP output.
Testing the new system against static images from Deep Shapely Portraits
The authors point out that, because it relies on sparse mapping, the DSP method leaves visible artifacts, a problem the new framework solves through dense mapping. The paper further argues that videos produced by DSP lack smoothness and visual coherence.
The authors state:
“The results show that our method can stably and coherently generate reshaped portrait videos, while image-based methods can easily lead to obvious flickering artifacts (traces of artificial modification).”
Translator introduction
Cui Hao is a 51CTO community editor and senior architect with 18 years of software development and architecture experience, including 10 years in distributed architecture. He was formerly a technical expert at HP. He enjoys sharing his knowledge and has written many popular technical articles with more than 600,000 reads. He is the author of "Principles and Practice of Distributed Architecture".
Original title: Restructuring Faces in Videos With Machine Learning, author: Martin Anderson