UC Berkeley and Google's IGN could spell the end of the diffusion model: realistic images in a single step, with an American sitcom as the source of inspiration
Will the diffusion model, so wildly popular right now, soon be obsolete?
Currently, generative AI models such as GANs, diffusion models, and consistency models generate images by mapping inputs to outputs that follow a target data distribution.

Typically, such a model must learn from a large number of real images before it can reliably reproduce realistic features in the images it generates.
Recently, researchers from UC Berkeley and Google proposed a new generative model: the Idempotent Generative Network (IGN).
Paper address: https://arxiv.org/abs/2311.01462
IGN can take a variety of inputs, such as random noise or simple graphics, and generate realistic images in a single step, with no multi-step iteration required.

The model is designed to be a "global projector" that maps any input data onto the target data distribution.

In short, it aims to be a universal image generation model.
Interestingly, a classic scene from "Seinfeld" was actually the authors' source of inspiration.
The scene neatly captures the concept of an idempotent operator: one that, applied repeatedly to the same input, always produces the same result; that is,

f(f(x)) = f(x)

As Jerry Seinfeld humorously points out, some real-life actions can also be considered idempotent.
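To make the definition concrete, here is a toy illustration (ours, not the paper's): an operator f is idempotent when applying it twice gives the same result as applying it once. Clipping to a range and sorting are everyday examples:

```python
import numpy as np

# Clip to [0, 1]: once a value is in range, clipping again changes nothing
def clamp01(x):
    return np.clip(x, 0.0, 1.0)

x = np.array([-0.5, 0.3, 1.7])
assert np.array_equal(clamp01(clamp01(x)), clamp01(x))  # f(f(x)) == f(x)

# Sorting is idempotent too: a sorted list stays sorted
data = [3, 1, 2]
assert sorted(sorted(data)) == sorted(data)
```

IGN's key idea is to impose this property on a neural network, with the "unchanged" points being exactly the realistic images.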
IGN differs from GANs and diffusion models in two important ways:

- Unlike a GAN, IGN does not require separate generator and discriminator networks; it is a "self-adversarial" model that performs generation and discrimination at the same time.
- Unlike diffusion models, which proceed through incremental steps, IGN attempts to map its input to the data distribution in a single step.
How does IGN, the idempotent generative network, actually work?
IGN is trained to take input samples from a source distribution P_z and produce samples from a target distribution P_x.

Given a dataset of examples, each drawn from P_x, the researchers train the model f to map P_z to P_x.

The distributions P_z and P_x are assumed to lie in the same space, i.e., their instances have the same dimensions. This allows f to be applied to both kinds of instances, z and x.

The figure below illustrates the basic idea behind IGN: real examples (x) are invariant under the model f, i.e., f(x) = x; other inputs (z) are mapped onto the manifold of instances that map to themselves, by optimizing f(f(z)) = f(z).
The paper presents the IGN training routine as a short PyTorch listing built around three loss terms: reconstruction, idempotence, and a "tightness" term that prevents collapse.
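As the original listing is not reproduced here, the following is a minimal NumPy sketch of the three losses described in the paper, with a hypothetical linear map standing in for the network. Note one simplification: in the actual PyTorch routine, the idempotence and tightness terms differ in which application of f receives gradients (using a detached copy of the model); in this plain forward computation, the two terms coincide in value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the network: a fixed linear map
W = rng.normal(size=(4, 4))
def f(v):
    return W @ v

x = rng.normal(size=4)   # a "real" training example
z = rng.normal(size=4)   # a noise input

fx = f(x)
fz = f(z)
ffz = f(fz)

# 1) Reconstruction: real data should be a fixed point of f
loss_rec = np.mean((fx - x) ** 2)

# 2) Idempotence: a second application of f should change nothing
loss_idem = np.mean((ffz - fz) ** 2)

# 3) Tightness: the same distance, pushed the other way (in the paper,
#    this term routes gradients through a different application of f)
loss_tight = -loss_idem

lambda_tight = 0.1  # hypothetical weighting
total = loss_rec + loss_idem + lambda_tight * loss_tight
```

The interplay of the second and third terms is what makes the model "self-adversarial": idempotence pulls f(z) toward the fixed-point manifold, while tightness keeps that manifold from ballooning to cover everything.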
Experimental results

With IGN in hand, how well does it actually perform?
The authors admit that, at this stage, IGN's generated results cannot yet compete with those of state-of-the-art models.

Their experiments used smaller models and lower-resolution datasets, and their exploration focused on a simplified version of the method.

Of course, foundational generative techniques such as GANs and diffusion models also took a long time to reach mature, large-scale performance.
The researchers evaluated IGN on MNIST (a grayscale handwritten-digit dataset) and CelebA (a face-image dataset), at image resolutions of 28×28 and 64×64 respectively.

The authors used a simple autoencoder architecture, in which the encoder is the simple five-layer discriminator backbone from DCGAN and the decoder is the DCGAN generator. Training and network hyperparameters are listed in Table 1.
Figure 4 shows qualitative results on both datasets after applying the model once and twice consecutively.

As shown, applying IGN once (f(z)) already produces coherent generations. However, artifacts can appear, such as holes in MNIST digits or distorted pixels at the top of the head and in the hair of the face images.

Applying f again (f(f(z))) corrects these problems, filling in holes and reducing the total variation around noisy patches in the faces.
Figure 7 shows additional results, together with the results of applying f three times.
Comparing f(f(z)) with f(f(f(z))) shows that once an image is close to the learned manifold, applying f again produces only minimal change, since the image is then considered in-distribution.

By performing manipulations in latent space, the authors show that IGN has a consistent latent space, similar to what has been shown for GANs. Figure 6 demonstrates latent-space arithmetic.
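Latent-space arithmetic of this kind can be sketched as plain vector arithmetic on latent codes (a hypothetical illustration; the attribute labels are ours, not taken from the paper's figure):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical latent codes for three inputs; in a model with a
# consistent latent space these would encode semantic attributes
z_glasses_man = rng.normal(size=8)   # "man with glasses"
z_man = rng.normal(size=8)           # "man without glasses"
z_woman = rng.normal(size=8)         # "woman"

# Classic GAN-style arithmetic: isolate the "glasses" direction and
# apply it to "woman"; decoding z_result would yield the new image
z_result = z_glasses_man - z_man + z_woman
```

That such arithmetic decodes to semantically meaningful images is what "consistent latent space" means here.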
The authors also verified IGN's potential as a "global projector" by feeding the model images drawn from a variety of distributions and generating their equivalent "natural images".

In Figure 5, the researchers demonstrate this by denoising a noisy image x + n, colorizing a grayscale image, and converting a sketch into a realistic image.

Relative to the original image x, these inverse tasks are ill-posed, yet IGN creates natural mappings that conform to the structure of the original image.

As shown, applying f successively can further improve image quality (for example, it removes dark, smoke-like artifacts in the projected sketches).
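This behavior can be pictured with an ordinary linear projection, which is itself idempotent (our toy analogy, not the paper's construction): projecting a corrupted point onto a "manifold" lands on it in one step, and further applications change nothing:

```python
import numpy as np

# Toy "manifold": the line spanned by unit vector u. Projection onto it
# plays the role of f mapping degraded inputs onto natural images.
u = np.array([1.0, 2.0])
u = u / np.linalg.norm(u)

def project(v):
    return (v @ u) * u

noisy = np.array([3.0, 1.0])    # stand-in for a corrupted input
once = project(noisy)
twice = project(once)
assert np.allclose(once, twice)  # already on the manifold: no change
```

A learned f is of course only approximately idempotent, which is why a second or third application can still visibly clean up the result before the outputs stabilize.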
What's next?

The results above show that IGN is efficient at inference: after training, it generates results in a single step.

It can also produce more consistent outputs, which may extend to further applications, such as medical image restoration.
The paper's authors state:

"We view this work as a first step toward a model that learns to map arbitrary inputs to a target distribution, a new paradigm in generative modeling."

Next, the research team plans to scale IGN up with more data, hoping to realize the full potential of the new generative AI model.

The code for this research will be released on GitHub.