Image generation based on Diffusion Model

Part 01

Development History

1.1 Origin

In 2015, the paper Deep Unsupervised Learning using Nonequilibrium Thermodynamics pointed out a major difficulty with the generative models of the time, such as VAE. This type of model first defines a conditional distribution and then defines a variational posterior to fit it, so the conditional distribution and the variational posterior must ultimately be optimized at the same time, which is very hard. If instead we define a simple process that maps the data distribution to a standard Gaussian, the task of the "generator" reduces to fitting each small step of the inverse of that process. This is the core idea of the diffusion model. However, the paper attracted little attention at the time.

1.2 Development

In 2020, building on this earlier work, the DDPM model (Denoising Diffusion Probabilistic Models) was proposed. Compared with the basic diffusion model, the authors combined the diffusion model with denoising score matching to guide training and sampling. This improved the quality of the generated image samples and made training easier and more stable, with final results comparable to those of GAN models.

Figure 2-Generation results of DDPM

However, the DDPM model is not perfect. Because its diffusion process is a Markov chain, it needs a relatively large number of diffusion steps to obtain good results, which makes sample generation very slow.

So in 2021, after DDPM, Song et al. proposed DDIM (Denoising Diffusion Implicit Models), which reworked the sampling method of DDPM's diffusion process. It extends the traditional Markov diffusion process to a non-Markov process and can generate samples with fewer sampling steps, greatly improving efficiency.

Follow-up work has also combined the diffusion model with traditional generative networks, for example VAE with DM and GAN with DM. I will not go into details here.

1.3 Outbreak

In 2022, Google launched a new AI system based on the diffusion model that can turn text descriptions into realistic images.

Figure 3

Figure 4

As the schematic diagram provided by Google shows, the input text is first encoded and then converted into a small 64*64 image by a text-to-image diffusion model. The small image is then processed by a super-resolution diffusion model, which raises the resolution over further iterations until the final 1024*1024 result is obtained. The process feels as magical as it sounds: you enter a piece of text, such as "a golden retriever dog wearing a red dotted turtleneck and a blue checkered hat", and the program automatically generates the picture of the dog shown above.

Another phenomenally popular application is NovelAI. It was originally a website dedicated to AI writing. Riding the current popularity of image generation, it used image resources on the Internet to train an image generation model focused on anime-style (2D) art, and its results are beginning to approach the level of human painters.

Figure 5

In addition to the traditional text-to-image mode, it also supports using pictures as reference input, so that the AI generates new pictures based on known ones. To a certain extent, this addresses the problem of AI-generated results being uncontrollable.

Part 02

Principle Explanation

So how does such a powerful AI technology work? Here we take the classic DDPM model as an example and give a brief walkthrough:

2.1 Forward process

The forward process adds noise to the image in order to construct the ground truth (GT) for the training samples.

For a given initial data distribution x0~q(x), we gradually add Gaussian noise to the data. This process has T steps, and the results of the steps are x1, x2, ..., xT. Writing the variance of the noise added at step t as $\beta_t$, each noising step is:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right)$$

As mentioned above, this is a Markov chain process. Eventually, the data tends toward an isotropic Gaussian distribution.

2.2 Inverse diffusion process
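The reverse process inverts the forward noising of Section 2.1, so it helps to have that process in concrete form first. The following is a minimal sketch, not the authors' implementation. It assumes a linear schedule for the noise variances (the values `1e-4` to `0.02` over 1000 steps are an assumption taken from common DDPM setups, not from this article) and uses a plain Python list as a stand-in for an image. It also uses a standard property of the forward chain: with ᾱ_t defined as the product of (1 - β_s) for s up to t, x_t can be drawn directly from x0 without simulating every step. The function names are hypothetical.

```python
import math
import random

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T (an assumed linear schedule)."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]

def cumulative_alphas(betas):
    """alpha_bar_t = prod_{s<=t} (1 - beta_s), returned for t = 1..T."""
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def forward_step(x_prev, beta_t, rng):
    """One Markov step: x_t ~ N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I)."""
    return [math.sqrt(1.0 - beta_t) * v + math.sqrt(beta_t) * rng.gauss(0.0, 1.0)
            for v in x_prev]

def q_sample(x0, t, alpha_bars, rng):
    """Jump straight to step t: x_t = sqrt(ab_t) * x0 + sqrt(1 - ab_t) * eps."""
    ab = alpha_bars[t - 1]  # t is 1-indexed
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(ab) * v + math.sqrt(1.0 - ab) * e for v, e in zip(x0, eps)]
    return x_t, eps

rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
x0 = [1.0] * 4  # a toy 4-pixel "image"
x_T, _ = q_sample(x0, 1000, alpha_bars, rng)
```

Under this schedule the last ᾱ_t is close to zero, so `x_T` is almost pure Gaussian noise regardless of `x0`. This is the sense in which the data "tends toward an isotropic Gaussian distribution".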

The reverse process is a denoising process. If we knew $q(x_{t-1} \mid x_t)$, we could recover x0 starting from a sample of the standard Gaussian distribution. It has been proved that if $q(x_t \mid x_{t-1})$ is Gaussian and the noise variance $\beta_t$ of each step is small enough, then $q(x_{t-1} \mid x_t)$ is still Gaussian. However, $q(x_{t-1} \mid x_t)$ cannot be inferred directly, so we use a deep learning model with parameters $\theta$ to predict it:

$$p_\theta(x_{0:T}) = p(x_T)\prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t)$$

$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \Sigma_\theta(x_t, t)\right)$$
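Once a model supplies the mean of each reverse step, generation is a loop from pure noise back to x0. The sketch below illustrates only that loop. It assumes a fixed, diagonal variance per step (passed in as `sigmas`), and `toy_mean` is a hypothetical stand-in for a trained network, so the output is not a meaningful image.

```python
import random

def ancestral_sample(mean_fn, sigmas, dim, rng):
    """Draw x_0 by running x_T -> x_{T-1} -> ... -> x_0.

    mean_fn(x_t, t) plays the role of mu_theta; sigmas[t - 1] is the standard
    deviation used at step t (a fixed, diagonal Sigma_theta is assumed).
    """
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # x_T ~ N(0, I)
    for t in range(len(sigmas), 0, -1):
        mean = mean_fn(x, t)
        # x_{t-1} = mean + sigma_t * z, with z ~ N(0, I)
        x = [m + sigmas[t - 1] * rng.gauss(0.0, 1.0) for m in mean]
    return x

# Hypothetical stand-in for a trained network: shrink x_t toward zero.
def toy_mean(x_t, t):
    return [0.5 * v for v in x_t]

sample = ancestral_sample(toy_mean, sigmas=[0.0, 0.1, 0.1], dim=4,
                          rng=random.Random(0))
```

The loop runs once per diffusion step, which is why a large T makes sampling slow.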

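If x0 is known, then by Bayes' formula, writing $\beta_t$ for the noise variance of step t, $\alpha_t = 1-\beta_t$ and $\bar\alpha_t = \prod_{s=1}^{t}\alpha_s$:

$$q(x_{t-1} \mid x_t, x_0) = \mathcal{N}\left(x_{t-1};\ \tilde\mu_t(x_t, x_0),\ \tilde\beta_t I\right)$$

$$\tilde\mu_t(x_t, x_0) = \frac{\sqrt{\bar\alpha_{t-1}}\,\beta_t}{1-\bar\alpha_t}\,x_0 + \frac{\sqrt{\alpha_t}\,(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}\,x_t, \qquad \tilde\beta_t = \frac{1-\bar\alpha_{t-1}}{1-\bar\alpha_t}\,\beta_t$$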
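To make the posterior concrete, the sketch below evaluates its mean and variance in the standard DDPM closed form, with α_t = 1 - β_t and ᾱ_t the running product of the α values. The function name and the toy β values are hypothetical, and a list of floats stands in for an image.

```python
import math

def posterior_params(x_t, x0, t, betas):
    """Mean and variance of q(x_{t-1} | x_t, x_0) for a 1-indexed step t."""
    alphas = [1.0 - b for b in betas]
    ab_t = math.prod(alphas[:t])         # alpha_bar_t
    ab_prev = math.prod(alphas[:t - 1])  # alpha_bar_{t-1}; empty product is 1 at t == 1
    beta_t, alpha_t = betas[t - 1], alphas[t - 1]
    coef_x0 = math.sqrt(ab_prev) * beta_t / (1.0 - ab_t)
    coef_xt = math.sqrt(alpha_t) * (1.0 - ab_prev) / (1.0 - ab_t)
    mean = [coef_x0 * a + coef_xt * b for a, b in zip(x0, x_t)]
    var = (1.0 - ab_prev) / (1.0 - ab_t) * beta_t
    return mean, var

betas = [0.1, 0.2, 0.3]  # made-up variances for a 3-step chain
mean_1, var_1 = posterior_params([0.3, -0.7], [1.0, 2.0], 1, betas)
mean_2, var_2 = posterior_params([0.3, -0.7], [1.0, 2.0], 2, betas)
```

At t = 1 the variance is zero and the mean collapses to x0, as expected when the previous state is the data itself. For later steps the posterior variance is strictly smaller than β_t.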

2.3 Training process

Readers who know something about machine learning will know that training a model means optimizing its parameters, here so that the predicted mean and variance are reliable. We do this by maximizing the log-likelihood of the model's predicted distribution, that is, by minimizing:

$$L = \mathbb{E}_{q(x_0)}\left[-\log p_\theta(x_0)\right]$$

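After a series of derivations, the DDPM model arrives at the final loss function:

$$L_{\text{simple}} = \mathbb{E}_{t,\,x_0,\,\epsilon}\left[\left\|\epsilon - \epsilon_\theta\left(\sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,\ t\right)\right\|^2\right]$$

Here $\epsilon \sim \mathcal{N}(0, I)$ is the sampled noise, $\epsilon_\theta$ is the noise predicted by the model, $\alpha_t = 1-\beta_t$ with $\beta_t$ the noise variance of step t, and $\bar\alpha_t = \prod_{s=1}^{t}\alpha_s$.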
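One step of that derivation is worth spelling out, because it explains why the loss compares noises rather than images. It is given here as a sketch in the usual DDPM notation, with $\alpha_t = 1-\beta_t$, $\bar\alpha_t = \prod_{s \le t}\alpha_s$, and $\tilde\mu_t$ the mean of $q(x_{t-1} \mid x_t, x_0)$. Writing the noised sample in closed form and solving for x0 turns the posterior mean into a function of x_t and the noise alone:

```latex
\begin{aligned}
x_t &= \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon
\quad\Longrightarrow\quad
x_0 = \frac{x_t - \sqrt{1-\bar\alpha_t}\,\epsilon}{\sqrt{\bar\alpha_t}} \\[4pt]
\tilde\mu_t
&= \frac{\sqrt{\bar\alpha_{t-1}}\,\beta_t}{1-\bar\alpha_t}\,x_0
 + \frac{\sqrt{\alpha_t}\,(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}\,x_t \\[4pt]
&= \frac{\beta_t + \alpha_t - \bar\alpha_t}{\sqrt{\alpha_t}\,(1-\bar\alpha_t)}\,x_t
 - \frac{\beta_t}{\sqrt{\alpha_t}\,\sqrt{1-\bar\alpha_t}}\,\epsilon \\[4pt]
&= \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon\right)
\end{aligned}
```

The last line uses $\beta_t + \alpha_t = 1$ and $\sqrt{\bar\alpha_{t-1}}/\sqrt{\bar\alpha_t} = 1/\sqrt{\alpha_t}$. Since x_t is known at sampling time, matching the true posterior mean comes down to predicting $\epsilon$, which is what $\epsilon_\theta$ is trained to do.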


Summarize the training process:

1. Take the input x0 and randomly sample a t from 1...T
2. Sample a noise $\epsilon \sim \mathcal{N}(0, I)$ from the standard Gaussian distribution
3. Calculate the loss and iteratively minimize the loss function
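The three steps above can be sketched as a single loss evaluation. This is an illustration under stated assumptions, not the authors' code: `eps_model` is a hypothetical placeholder for the neural network, the `alpha_bars` values are made up, the squared error is averaged over pixels, and the gradient update that would "iteratively minimize" the loss is left to whatever optimizer a real framework provides.

```python
import math
import random

def training_loss(x0, eps_model, alpha_bars, rng):
    """One pass through the three steps; returns the mean squared error."""
    T = len(alpha_bars)
    # Step 1: take x0 and sample t uniformly from 1..T.
    t = rng.randint(1, T)
    # Step 2: sample noise eps ~ N(0, I) and form the noisy input x_t.
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    ab = alpha_bars[t - 1]
    x_t = [math.sqrt(ab) * v + math.sqrt(1.0 - ab) * e for v, e in zip(x0, eps)]
    # Step 3: compare the predicted noise with the true noise.
    eps_hat = eps_model(x_t, t)
    return sum((e - p) ** 2 for e, p in zip(eps, eps_hat)) / len(x0)

# Hypothetical placeholder for a neural network: always predicts zero noise.
def zero_model(x_t, t):
    return [0.0 for _ in x_t]

alpha_bars = [0.9, 0.7, 0.4]  # made-up alpha_bar_1..alpha_bar_3
loss = training_loss([1.0, -1.0, 0.5], zero_model, alpha_bars, random.Random(0))
```

A model that recovered the injected noise exactly would drive this loss to zero. In practice the function would be called on mini-batches, with a gradient step on the model parameters after each call.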

Figure 6


Part 03

Summary

The diffusion model has shown great potential. Compared with the VAE model, it does not need to align the posterior distribution, and unlike GAN it does not need to train an additional discriminator. It has applications in computer vision, bioinformatics, speech processing, and other fields. Its application in image generation will help improve the efficiency of image creation: the AI can generate several pictures based on given conditions, and humans can then filter and modify the results. This may become a new working mode in the field of 2D painting and could greatly improve the production efficiency of 2D digital assets.

However, the development of AI technology always brings some disputes, and the field of image generation is no exception. Besides problems with the AI technology itself, such as generated images with wrong or implausible structure, there are also legal disputes, such as the copyright of AI-generated works. The technical problems can be solved through the development of the technology itself. We have reason to believe that image generation will eventually reach a very high level, which will eliminate most low-end painting-related jobs and greatly liberate human productivity. Copyright issues may still require government departments to pay sufficient attention to the development of the related industries and to improve the relevant policies and systems. This requires us to think more about emerging fields so that AI technology can better serve us.
