Can video generation be infinitely long? Google VideoPoet large model is online, netizens: revolutionary technology

WBOY (reposted)
2024-03-01 20:13:36

The Mona Lisa yawns, a chick lifts weights... Google's VideoPoet large model performs impressively.


At the end of 2023, technology companies are taking on the final frontier of generative AI: video generation.

On Tuesday, Google's large video generation model went online and immediately attracted attention. This large language model, called VideoPoet, is considered a revolutionary zero-shot video generation tool.

VideoPoet can generate videos from text and images, and also supports style transfer and video-to-audio generation. The resulting videos show diverse and smooth motion.

As soon as the news came out, many people welcomed it: judging by the high-quality results released so far, large model technology is advancing remarkably fast.

Some people expressed surprise at the length of the video generated by this large model:

Source: https://twitter.com/cybersphere_ai/status/1737257729167966353

Some people say this is a revolutionary large language model.

Some people have also called on Google to open source VideoPoet as soon as possible. The trend waits for no one.

With the advancement of generative AI, there has been a recent wave of new video generation models that demonstrate stunning picture quality. One of the current bottlenecks in video generation is producing coherent large motions. In many cases, even leading models either generate only small motions or exhibit noticeable artifacts when producing larger ones.

To explore the application of language models to video generation, researchers from Google introduced VideoPoet, a large language model (LLM) that can perform a variety of video generation tasks, including text to video, image to video, video stylization, video inpainting and outpainting, and video to audio.

VideoPoet demonstrations

Text to video

Prompt: A dog listening to music with headphones, highly detailed, 8k.

Prompts (left to right): A shark shooting laser beams from its mouth; teddy bears walking hand in hand on Fifth Avenue on a rainy day; a chick lifting weights.

Prompts (left to right): A roaring lion made of yellow dandelion petals; a massive explosion on the surface of the Earth; a horse galloping through Van Gogh's Starry Night; a squirrel in armor riding a goose; a panda taking a selfie.

Image to video

For image to video, VideoPoet can take an input image and animate it with a prompt.

To make the Mona Lisa yawn, just provide the picture and a prompt: A woman yawns. You will get the following effect.

Prompts (from left to right): A ship sailing on a rough sea with thunderstorms and lightning, oil painting style; flying over a nebula with many twinkling stars; a wanderer with a cane standing on a cliff on a windy day, looking down at the sea of clouds floating below.

Video stylization

VideoPoet can also stylize an input video based on text prompts.

Prompts (left to right): A teddy bear skating on a clean icy lake; a metallic lion roaring in the glow of a furnace.

Video to audio

VideoPoet can also generate audio. The model first generates a 2-second video clip and then tries to predict the matching audio without any text guidance. In this way, VideoPoet is able to generate both video and audio from a single model.

Long video

VideoPoet outputs 2-second videos by default, but it can also generate long videos: conditioned on the last 1 second of a video, the model predicts the next 1 second. Repeating this process can extend a video to any length. Below is an example of VideoPoet generating a long video from text input.

Prompt: FPV footage of a very sharp elven city of stone in the jungle, with a brilliant blue river, waterfalls, and large, steep vertical cliff faces.
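The extension procedure described above can be sketched as a simple loop. This is only an illustration under stated assumptions: `extend_video`, `toy_predictor`, the integer "frames", and the frame rate of 8 are all hypothetical stand-ins, not part of VideoPoet, which has no public API described in this article.

```python
from typing import Callable, List

Frame = int  # hypothetical stand-in for a real video frame


def extend_video(
    clip: List[Frame],
    predict_next_second: Callable[[List[Frame]], List[Frame]],
    fps: int,
    target_seconds: int,
) -> List[Frame]:
    """Grow `clip` to `target_seconds` by repeatedly conditioning on its last second."""
    video = list(clip)
    while len(video) < target_seconds * fps:
        context = video[-fps:]  # the last 1 second of frames
        next_second = predict_next_second(context)
        if len(next_second) != fps:
            raise ValueError("predictor must return exactly one second of frames")
        video.extend(next_second)
    return video


def toy_predictor(context: List[Frame]) -> List[Frame]:
    # Placeholder "model": continues counting from the last frame it sees.
    last = context[-1]
    return [last + i + 1 for i in range(len(context))]


FPS = 8  # assumed frame rate, chosen only for illustration
initial_clip = list(range(2 * FPS))  # a 2-second starting clip
long_video = extend_video(initial_clip, toy_predictor, FPS, target_seconds=5)
print(len(long_video) / FPS)  # duration in seconds
```

Because each step sees only the most recent second, the loop can in principle run indefinitely, which is what makes arbitrary-length output possible.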

Video Extension

Users can extend a video by changing the prompt. The original video shows two raccoons riding a motorcycle on a mountain road surrounded by pine trees, 8k. The extended video shows the two raccoons riding the motorcycle while a meteor falls behind them, hits the earth, and explodes.

Interactive Video Editing

Given an input video (far left), users can change the motion of objects so that they perform different actions. As shown below, the middle three results use no text prompt, and the last one uses the text prompt: Start with smoke background.

Video Inpainting

VideoPoet can fill in details in the masked parts of a video, and the inpainting can optionally be guided by text.

To demonstrate VideoPoet's capabilities, Google also produced a short film composed of many short clips generated by VideoPoet. The script, written by Bard, is a short story about a traveling raccoon, complete with a scene-by-scene breakdown and an accompanying list of prompts. Google then generated video clips for each prompt and stitched them together to produce the final video.

Method introduction

As shown in the figure below, VideoPoet can animate an input image to generate a video, and the resulting video can be edited or extended.

For stylization, the model takes a video representing depth and optical flow as input and renders content on top of it in the style specified by the text.

Video Generator

A key advantage of using LLMs for training is that many of the scalable efficiency improvements introduced in existing LLM training infrastructure can be reused. However, LLMs operate on discrete tokens, which makes video generation challenging. Video and audio tokenizers address this: they encode video and audio clips into sequences of discrete tokens, which can also be converted back to the original representation.
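The encode/decode round trip that a tokenizer provides can be illustrated with a deliberately simple example. The sketch below is a uniform scalar quantizer over values in [0, 1]; the `ToyTokenizer` class is hypothetical and is not how MAGVIT V2 or SoundStream work internally (those are learned neural codecs). It only shows the interface: continuous signal in, discrete token ids out, and an approximate reconstruction back.

```python
from typing import List


class ToyTokenizer:
    """Uniform scalar quantizer: maps floats in [0, 1] to integer token ids."""

    def __init__(self, vocab_size: int) -> None:
        self.vocab_size = vocab_size

    def encode(self, samples: List[float]) -> List[int]:
        ids = []
        for x in samples:
            x = min(max(x, 0.0), 1.0)  # clamp into the supported range
            # Pick the bin index; the top edge 1.0 falls into the last bin.
            ids.append(min(int(x * self.vocab_size), self.vocab_size - 1))
        return ids

    def decode(self, ids: List[int]) -> List[float]:
        # Reconstruct each token as the centre of its bin (lossy).
        return [(i + 0.5) / self.vocab_size for i in ids]


tokenizer = ToyTokenizer(vocab_size=16)
signal = [0.0, 0.12, 0.5, 0.99, 1.0]
tokens = tokenizer.encode(signal)
reconstruction = tokenizer.decode(tokens)
max_error = max(abs(a - b) for a, b in zip(signal, reconstruction))
print(tokens)
print(max_error)
```

The reconstruction is close to the input but not identical, which reflects the general trade-off: a discrete vocabulary makes the data usable by a language model at the cost of some quantization error.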

By using multiple tokenizers (MAGVIT V2 for video and images, and SoundStream for audio), VideoPoet trains an autoregressive language model to learn across multiple modalities: video, image, audio, and text. Once the model generates tokens conditioned on some context, a tokenizer decoder can convert them back into a viewable representation.
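One common way to let a single autoregressive model handle tokens from several tokenizers is to place each modality in its own range of one shared vocabulary. The sketch below assumes that layout purely for illustration; the article does not describe VideoPoet's actual vocabulary, and the sizes, names (`VOCAB_SIZES`, `to_shared`, `from_shared`), and ordering here are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical per-modality vocabulary sizes, chosen only for illustration.
VOCAB_SIZES: Dict[str, int] = {"special": 4, "text": 100, "video": 1000, "audio": 50}


def build_offsets(sizes: Dict[str, int]) -> Dict[str, int]:
    """Assign each modality a contiguous id range in the shared vocabulary."""
    offsets: Dict[str, int] = {}
    start = 0
    for name, size in sizes.items():
        offsets[name] = start
        start += size
    return offsets


OFFSETS = build_offsets(VOCAB_SIZES)


def to_shared(modality: str, local_ids: List[int]) -> List[int]:
    """Map tokenizer-local ids to ids in the shared vocabulary."""
    size = VOCAB_SIZES[modality]
    for i in local_ids:
        if not 0 <= i < size:
            raise ValueError(f"id {i} out of range for {modality}")
    return [OFFSETS[modality] + i for i in local_ids]


def from_shared(token: int) -> Tuple[str, int]:
    """Recover the modality and local id so the right decoder can be used."""
    for name, size in VOCAB_SIZES.items():
        start = OFFSETS[name]
        if start <= token < start + size:
            return name, token - start
    raise ValueError(f"token {token} is outside the shared vocabulary")


# A single sequence mixing text, video, and audio tokens.
sequence = to_shared("text", [5, 7]) + to_shared("video", [0, 999]) + to_shared("audio", [3])
print(sequence)
print([from_shared(t) for t in sequence])
```

With such a layout the language model only ever sees integer ids, and the id range of each generated token indicates which tokenizer decoder should turn it back into pixels or sound.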

Evaluation results

The research team used a variety of benchmarks to evaluate VideoPoet's text-to-video generation performance and compare the results with other methods. To ensure a neutral evaluation, the study ran all models on a wide range of prompts, without cherry-picking examples, and asked human raters for their preference ratings.

On average, raters judged 24-35% of VideoPoet's examples to follow prompts better than a competing model's, versus 8-11% for the competing models. Raters also preferred 41-54% of VideoPoet's examples for more interesting motion, compared with 11-21% for the other models.
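The two percentages in each comparison do not add up to 100%. One plausible reading, which is an assumption here and not stated in the article, is that raters could also express no preference. The sketch below shows how such side-by-side preference rates could be computed; the vote counts and the labels `model_a`, `model_b`, and `no_preference` are made up for illustration and are not the study's data.

```python
from collections import Counter
from typing import Dict, Iterable

LABELS = ("model_a", "model_b", "no_preference")


def preference_rates(votes: Iterable[str]) -> Dict[str, float]:
    """Return the percentage of votes for each label in a side-by-side comparison."""
    counts = Counter(votes)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        raise ValueError("no votes to aggregate")
    return {label: 100.0 * counts[label] / total for label in LABELS}


# Hypothetical votes, not taken from the VideoPoet evaluation.
votes = ["model_a"] * 30 + ["model_b"] * 10 + ["no_preference"] * 60
rates = preference_rates(votes)
print(rates)
```

Under this reading, a result such as 30% versus 10% means that one model was preferred three times as often as the other among decided raters, while most comparisons were judged roughly equal.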

Reference link:
https://blog.research.google/2023/12/videopoet-large-language-model-for-zero.html
https://sites.research.google/videopoet/stylization/

Statement:
This article is reproduced from jiqizhixin.com. In case of infringement, please contact admin@php.cn for removal.