


Are the wildly popular diffusion models about to be displaced?
Currently, generative AI models such as GANs, diffusion models, and consistency models generate images by mapping inputs to outputs that follow the target data distribution.
Such models normally have to learn from a large number of real images before they can reproduce realistic features in the images they generate.
Recently, researchers from UC Berkeley and Google proposed a new generative model: the Idempotent Generative Network (IGN).
Picture
Paper address: https://arxiv.org/abs/2311.01462
IGN can take a variety of inputs, such as random noise or simple graphics, and generate realistic images in a single step, without multi-step iteration.
The model aims to be a "global projector" that maps any input data onto the target data distribution.
In short, a general-purpose image generation model should work this way.
Interestingly, a scene from "Seinfeld" was the authors' source of inspiration.
Picture
This scene neatly captures the concept of an "idempotent operator": an operator that, applied repeatedly to the same input, always gives the same result as applying it once, that is, f(f(z)) = f(z).
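To make idempotence concrete, here is a minimal sketch in plain Python. The helper `is_idempotent` and the two toy functions `clamp01` and `halve` are hypothetical names chosen for illustration; they are not taken from the paper.

```python
def is_idempotent(f, inputs):
    """Check f(f(x)) == f(x) on a finite list of sample inputs."""
    return all(f(f(x)) == f(x) for x in inputs)


def clamp01(x):
    # Clamping to [0, 1]: a second application never changes the result.
    return min(1.0, max(0.0, x))


def halve(x):
    # Halving keeps shrinking nonzero values, so it is not idempotent.
    return x / 2


samples = [-2.0, -0.5, 0.0, 0.3, 1.0, 7.5]
print(is_idempotent(clamp01, samples))  # True
print(is_idempotent(halve, samples))    # False
```

A check on a handful of samples only suggests idempotence; it does not prove it for all inputs.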
As Jerry Seinfeld humorously points out, some real-life actions can also be considered idempotent.
Idempotent Generative Network
IGN differs from GANs and diffusion models in two important ways:
- Unlike GANs, IGN does not need separate generator and discriminator networks. It is a "self-adversarial" model that performs generation and discrimination at the same time.
- Unlike diffusion models, which proceed in incremental steps, IGN attempts to map inputs to the data distribution in a single step.
Where does IGN (Idempotent Generative Network) come from?
It is trained to generate samples from a target distribution P_x, given input samples from a source distribution P_z.
Given a dataset of examples {x_i}, each drawn from P_x, the researchers train a model f to map P_z to P_x.
The distributions P_z and P_x are assumed to lie in the same space, i.e. their instances have the same dimensions. This allows f to be applied to both types of instances, z ~ P_z and x ~ P_x.
The figure shows the basic idea behind IGN: real examples (x) are invariant to the model f, that is, f(x) = x. Other inputs (z) are mapped onto the manifold of instances that f maps to themselves, by optimizing f(f(z)) = f(z).
Picture
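The two conditions in the figure, f(x) = x for real examples and f(f(z)) = f(z) for other inputs, can be illustrated with a hand-written projector. This is only a toy sketch: the "target distribution" is assumed to be the line y = x in two dimensions, and `project_onto_diagonal` is a hypothetical stand-in for a trained network, not anything from the paper.

```python
def project_onto_diagonal(point):
    """Orthogonally project a 2-D point onto the line y = x (the toy 'data manifold')."""
    x, y = point
    m = (x + y) / 2
    return (m, m)


f = project_onto_diagonal

real_example = (3.0, 3.0)   # already on the manifold
other_input = (1.0, 5.0)    # off the manifold, e.g. "noise"

print(f(real_example) == real_example)       # f(x) = x
print(f(other_input))                        # lands on the manifold
print(f(f(other_input)) == f(other_input))   # f(f(z)) = f(z)
```

A real IGN has to learn such a projection from data; here the manifold is known in advance, so the projector can be written by hand.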
PyTorch code for the IGN training routine:
Picture
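The original listing is not reproduced in this text. As a rough sketch of what such a routine could look like, the PyTorch code below combines three loss terms suggested by the description above: a reconstruction term for f(x) = x, an idempotence term for f(f(z)) = f(z), and a tightness term that pushes in the opposite direction. The function name `train_ign`, the `tight_weight` value of 0.1, the toy MLP, and the random toy data are all assumptions made for illustration; this should not be read as the authors' implementation.

```python
import copy

import torch
from torch import nn


def train_ign(f, opt, data_loader, n_epochs, tight_weight=0.1):
    """Illustrative IGN-style training loop; returns the last loss value."""
    # Frozen copy of f, so the idempotence loss only updates the inner call.
    f_copy = copy.deepcopy(f)
    for p in f_copy.parameters():
        p.requires_grad_(False)

    last_loss = None
    for _ in range(n_epochs):
        for x in data_loader:
            z = torch.randn_like(x)                 # source samples, same shape as x
            f_copy.load_state_dict(f.state_dict())  # sync the frozen copy with f

            fx = f(x)
            fz = f(z)
            f_z = fz.detach()     # f(z) treated as a constant input
            ff_z = f(f_z)         # gradients reach the outer application only
            f_fz = f_copy(fz)     # gradients reach the inner application only

            loss_rec = (fx - x).pow(2).mean()         # real data should be fixed points
            loss_idem = (f_fz - fz).pow(2).mean()     # push f(f(z)) toward f(z)
            loss_tight = -(ff_z - f_z).pow(2).mean()  # discourage too many fixed points

            loss = loss_rec + loss_idem + tight_weight * loss_tight
            opt.zero_grad()
            loss.backward()
            opt.step()
            last_loss = loss.item()
    return last_loss


# Toy usage on random 4-dimensional "data" (purely illustrative).
torch.manual_seed(0)
toy_f = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 4))
toy_opt = torch.optim.Adam(toy_f.parameters(), lr=1e-3)
toy_data = [torch.randn(8, 4) for _ in range(5)]
final_loss = train_ign(toy_f, toy_opt, toy_data, n_epochs=2)
print(f"final loss: {final_loss:.4f}")
```

The frozen copy and the `detach()` call are what separate the two opposing objectives: the idempotence term only adjusts how f produces f(z), while the tightness term only adjusts how f responds to that output.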
How well does IGN perform?
The authors acknowledge that, at this stage, IGN's results cannot compete with state-of-the-art models.
The experiments used smaller models and lower-resolution datasets, and the exploration focused on a simplified approach.
Then again, foundational generative modeling techniques such as GANs and diffusion models also took a long time to reach mature performance at scale.
Experimental settings
The researchers evaluated IGN on MNIST (a dataset of grayscale handwritten digits) and CelebA (a dataset of face images), using image resolutions of 28×28 and 64×64 respectively.
The authors use a simple autoencoder architecture in which the encoder is a simple five-layer discriminator backbone from DCGAN and the decoder is the generator. The training and network hyperparameters are shown in Table 1.
Picture
Generation results
Figure 4 shows qualitative results on the two datasets after applying the model once and twice in succession.
As shown, applying IGN once, f(z), produces coherent results. However, artifacts can occur, such as holes in MNIST digits or distorted pixels at the top of the head and in the hair in face images.
Applying f again, f(f(z)), corrects these problems, filling in the holes or reducing the total variation around noisy patches in the faces.
Picture
Figure 7 shows additional results, including the result of applying f three times.
Picture
Comparing f(f(z)) and f(f(f(z))) shows that when an image is already close to the learned manifold, applying f again results in minimal change, because the image is considered to be in distribution.
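The shrinking effect of repeated application can be mimicked with a deliberately imperfect toy projector. In this sketch, `imperfect_f` and its `strength` parameter are hypothetical: the function moves a 2-D point only 90% of the way onto the line y = x, standing in for a trained model that is close to idempotent but not exactly so.

```python
import math


def imperfect_f(point, strength=0.9):
    """Move a 2-D point most of the way onto the line y = x."""
    x, y = point
    m = (x + y) / 2
    return (x + strength * (m - x), y + strength * (m - y))


z = (4.0, 0.0)
outputs = [z]
for _ in range(3):
    outputs.append(imperfect_f(outputs[-1]))

# Distance moved by the first, second, and third application of f.
changes = [math.dist(a, b) for a, b in zip(outputs, outputs[1:])]
for k, c in enumerate(changes, start=1):
    print(f"change caused by application {k}: {c:.4f}")
```

Each application moves the point less than the one before it, which mirrors the observation that once an output is near the manifold, applying f again changes very little.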
Latent Space Manipulation
By performing latent space manipulations, the authors demonstrate that IGN has a consistent latent space, similar to what has been shown for GANs. Figure 6 shows latent space arithmetic.
Picture
Out-of-distribution mapping
The authors also tested IGN's potential for "global mapping" by feeding images from various distributions into the model to generate their "natural image" equivalents.
The researchers demonstrate this by denoising a noisy image x + n, colorizing a grayscale image, and translating a sketch into a realistic image, as shown in Figure 5.
Relative to the original image x, these inverse tasks are ill-posed. IGN is nevertheless able to produce natural mappings that preserve the structure of the original image.
As shown, applying f repeatedly can improve image quality (for example, it removes dark and smoky artifacts in the projected sketches).
Pictures
These results suggest that IGN is more efficient at inference, since once trained it generates results in a single step.
It can also produce more consistent outputs, which may extend to further applications such as medical image restoration.
The authors of the paper state:
We view this work as a first step toward models that learn to map arbitrary inputs to a target distribution, a new paradigm in generative modeling.
Next, the research team plans to scale IGN up with more data, hoping to realize the full potential of this new generative model.
The research code will be published on GitHub in the future.