Ali innovates again: you can realize the dance of 'Cleaning the Glass' with a sentence and a human face, and the costume and background can be switched freely!

WBOY

Dec 15, 2023 pm 12:39 PM

project, prompt, t2v

After AnimateAnyone, another Alibaba paper on generating dance videos has caused a sensation.

Now you only need to upload a photo of a face and describe the scene in one simple sentence, and the person can dance anywhere!

For example, here is a dance video of "Cleaning the Glass":

All you need to do is upload a portrait photo and fill in a corresponding prompt:

In the golden leaves of autumn, a girl is smiling and dancing in a light blue dress

As the prompt changes, the character's background and clothes change accordingly. For example, we can try a few more sentences:

A girl is smiling and dancing in a wooden house. She is wearing a sweater and trousers

A girl is smiling and dancing in Times Square, wearing a dress-like white shirt with long sleeves and long pants.

This is Alibaba's latest research, DreaMoving, which aims to let anyone dance anytime, anywhere.

It works not only for real people but also for cartoon and animated characters.

As soon as the project was released, it attracted the attention of many netizens. Some called it "unbelievable" after seeing the results.

So how is this result achieved, and how was the research carried out?

The principle behind it

Although text-to-video (T2V) models such as Stable Video Diffusion and Gen2 have brought major breakthroughs in video generation, many challenges remain.

On the data side, for example, open-source human dance video datasets are lacking, and accurate text descriptions for such videos are hard to obtain. This makes it difficult for models to generate videos that are diverse, consistent across frames, and longer in duration.

In human-centered content generation, the personalization and controllability of the generated results are also key factors.

To address these two challenges, the Alibaba team started with the dataset.

The researchers first collected about 1,000 high-quality human dance videos from the Internet. They then cut these videos into about 6,000 short clips (8 to 10 seconds each), making sure the clips contain no transitions or special effects, which helps when training the temporal model.
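The clip-cutting step can be illustrated with a minimal sketch. The article does not say how the cuts were actually made, so the function name `split_into_clips`, the greedy strategy, and the choice to drop a short trailing remainder are all assumptions made for illustration; the filtering of transitions and special effects is not shown.

```python
def split_into_clips(duration_s, min_len=8.0, max_len=10.0):
    """Split a video of duration_s seconds into consecutive clips.

    Hypothetical helper: every returned (start, end) pair is between
    min_len and max_len seconds long. A trailing remainder shorter than
    min_len is dropped.
    """
    clips = []
    start = 0.0
    # Keep cutting while at least one minimum-length clip still fits.
    while duration_s - start >= min_len:
        end = min(start + max_len, duration_s)
        clips.append((start, end))
        start = end
    return clips


print(split_into_clips(35.0))
# → [(0.0, 10.0), (10.0, 20.0), (20.0, 30.0)]
```

Going from about 1,000 videos to about 6,000 clips means roughly six clips per source video on average, which at 8 to 10 seconds each corresponds to about a minute of usable footage per video.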

In addition, to generate text descriptions for the videos, they used MiniGPT-v2 as the video captioner, specifically its "grounding" version, with the instruction to describe the frame in detail.

Captions generated from the center frame, which serves as the key frame, can accurately describe the subject and background of each video clip.
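Picking the center frame of a clip as the key frame to caption might look like the sketch below. The helper names `center_frame_index` and `build_caption_request`, the request dictionary, and the exact instruction wording are assumptions for illustration; no call to MiniGPT-v2 is made here.

```python
# Instruction paraphrased from the article; the exact wording is an assumption.
CAPTION_INSTRUCTION = "Describe the frame in detail."


def center_frame_index(num_frames):
    """Return the zero-based index of the center frame of a clip."""
    if num_frames <= 0:
        raise ValueError("clip must contain at least one frame")
    # For an even frame count this picks the lower of the two middle frames.
    return (num_frames - 1) // 2


def build_caption_request(num_frames):
    """Bundle the key frame index with the captioning instruction.

    Hypothetical structure: a real pipeline would load this frame and
    pass it to the captioner together with the instruction.
    """
    return {
        "frame_index": center_frame_index(num_frames),
        "instruction": CAPTION_INSTRUCTION,
    }


# A 9-second clip at 30 frames per second has 270 frames.
print(center_frame_index(9 * 30))
# → 134
```

Captioning a single frame per clip keeps the cost low, and because the clips are short and free of transitions, one frame is likely to be representative of the subject and background throughout.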

On the framework side, the Alibaba team proposed DreaMoving, a framework built on Stable Diffusion models.

It consists mainly of three neural networks: the Denoising U-Net, the Video ControlNet, and the Content Guider.

Among them, the Video ControlNet is an image ControlNet with a motion block injected after each U-Net block. It processes the control sequence (pose or depth) into additional temporal residuals.

The Denoising U-Net is derived from the Stable Diffusion U-Net, with motion blocks added for video generation.

The Content Guider converts the input text prompt and appearance information (such as a face) into content embeddings.

With this design, DreaMoving can generate high-quality, high-fidelity videos from a guidance sequence and a simple content description (such as text and a reference image).
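How control residuals feed into a denoising network can be shown with a toy sketch. This is not the DreaMoving implementation, whose code is unreleased; it only illustrates the general ControlNet idea of adding a residual to the features of each block, here computed per frame. Plain Python lists stand in for tensors, and the function name `add_control_residuals` and the `[block][frame][channel]` layout are assumptions.

```python
def add_control_residuals(unet_features, control_residuals):
    """Add per-block, per-frame control residuals to U-Net features.

    Toy stand-in for tensor addition. Both arguments are nested lists
    indexed as [block][frame][channel] and must have the same shape.
    """
    if len(unet_features) != len(control_residuals):
        raise ValueError("one residual entry is needed per U-Net block")
    fused = []
    for block_feats, block_res in zip(unet_features, control_residuals):
        # Element-wise sum for every frame in this block.
        fused.append([
            [f + r for f, r in zip(frame_feats, frame_res)]
            for frame_feats, frame_res in zip(block_feats, block_res)
        ])
    return fused


# One block, two frames, two channels per frame.
features = [[[1.0, 2.0], [3.0, 4.0]]]
residuals = [[[0.5, 0.5], [-1.0, 1.0]]]
print(add_control_residuals(features, residuals))
# → [[[1.5, 2.5], [2.0, 5.0]]]
```

In this picture, the pose or depth sequence only nudges the features of the denoising network frame by frame, while the appearance (face, clothes, background) comes in separately through the content embeddings.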

Unfortunately, the code for the DreaMoving project has not been open-sourced yet.

If you are interested, you can follow the project now and wait for the code to be released.

Please refer to the following links:
[1] https://dreamoving.github.io/dreamoving/
[2] https://arxiv.org/abs/2312.05107
[3] https://twitter.com/ProperPrompter/status/1734192772465258499
[4] https://github.com/dreamoving/dreamoving-project

Statement

This article is reproduced from 51CTO.COM. In case of infringement, please contact admin@php.cn to have it removed.
