MIT's Poisson flow generative model beats diffusion models, balancing quality and speed
Diffusion models were originally inspired by thermodynamics in physics, and they have recently shone in the field of artificial intelligence. What other physical theories could advance research on generative models? Recently, researchers from MIT, inspired by electromagnetic theory in high dimensions, proposed a generative model called Poisson Flow. In theory, the model offers an intuitive physical picture backed by rigorous results; in experiments, it often outperforms diffusion models in generation quality, generation speed, and robustness. The paper has been accepted at NeurIPS 2022.
Inspired by electrostatics, the researchers proposed a new generative model called the Poisson Flow Generative Model (PFGM). Intuitively, the N-dimensional data points are regarded as a set of positive charges on the z=0 plane of an (N+1)-dimensional space, where z is a newly added dimension. These charges generate an electric field in the high-dimensional space. Starting from the z=0 plane and moving outward along the electric field lines, samples are carried to a hemisphere (as shown in Figure 1). The direction of the field lines is given by the gradient of the solution to the Poisson equation in the high-dimensional space. The researchers proved that, when the radius of the hemisphere is large enough, the field lines transform the charge distribution on the z=0 plane (that is, the data distribution) into a uniform distribution on the hemisphere (Figure 2).
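To make the electrostatic picture concrete, the following is a minimal NumPy sketch (not the authors' code) that treats each data point as a unit charge on the z=0 plane and averages the resulting fields at query points in the (N+1)-dimensional space. The function name `empirical_poisson_field` is hypothetical, and the constant prefactor is dropped on the assumption that only the direction of the field matters.

```python
import numpy as np

def empirical_poisson_field(x_tilde, data):
    """Field at augmented points x_tilde (M, N+1) from unit charges at data (n, N) on z = 0.

    Returns an (M, N+1) array proportional to
    sum_i (x~ - y~_i) / ||x~ - y~_i||^(N+1), averaged over the n charges.
    Direct powers are fine for small N; for image-sized N the weights
    would need to be normalized in log-space to avoid overflow.
    """
    x_tilde = np.atleast_2d(np.asarray(x_tilde, dtype=float))
    data = np.asarray(data, dtype=float)
    n, N = data.shape
    # Embed the data in N+1 dimensions by appending z = 0.
    charges = np.concatenate([data, np.zeros((n, 1))], axis=1)
    diff = x_tilde[:, None, :] - charges[None, :, :]      # (M, n, N+1)
    dist = np.linalg.norm(diff, axis=-1, keepdims=True)   # (M, n, 1)
    return (diff / dist ** (N + 1)).mean(axis=1)

# A single charge at the origin (N = 2): inverse-square law in 3D.
single = empirical_poisson_field([[0.0, 0.0, 2.0]], [[0.0, 0.0]])

# Far from a cloud of charges, the field is almost radial.
rng = np.random.default_rng(0)
cloud = rng.standard_normal((50, 2))
direction = np.array([0.6, 0.0, 0.8])
far = empirical_poisson_field([1000.0 * direction], cloud)[0]
cosine = far @ direction / np.linalg.norm(far)
```

For a single charge the field at height 2 reduces to the familiar inverse-square value, and far from the data the field points almost exactly away from the origin, which matches the intuition that a sufficiently large hemisphere sees the data as a single point charge.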
PFGM exploits the reversibility of electric field lines to generate the data distribution on the z=0 plane: first, the researchers sample uniformly on a large hemisphere; then they let the samples move along the field lines from the hemisphere back to the z=0 plane, which produces data. Since motion along field lines can be described by an ordinary differential equation (ODE), sampling in practice only requires solving an ODE determined by the direction of the field lines. Through the electric field, PFGM converts a simple distribution on a sphere into a complex data distribution. From this perspective, PFGM can be viewed as a continuous normalizing flow.
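As a toy illustration of sampling by following field lines backward, the sketch below uses one-dimensional data (N = 1), so the augmented space is the (x, z) half-plane. It is a simplification under several assumptions rather than the paper's sampler: the exact field of the data set stands in for a learned network, the uniform samples on the semicircle are projected radially onto the line z = `z_max`, and the ODE is integrated with plain Euler steps in z. The names `field_2d` and `sample_backward` and all default parameter values are hypothetical.

```python
import numpy as np

def field_2d(x, z, data):
    """Unnormalized field (E_x, E_z) at points (x, z), z > 0, for N = 1 data on z = 0."""
    dx = x[:, None] - data[None, :]   # (M, n) horizontal offsets to every charge
    r2 = dx ** 2 + z ** 2             # squared distance; the exponent N + 1 equals 2 here
    return (dx / r2).mean(axis=1), (z / r2).mean(axis=1)

def sample_backward(data, num_samples=1000, z_max=40.0, z_min=0.02, steps=200, seed=0):
    """Follow field lines from z = z_max down to z = z_min and return the x coordinates."""
    rng = np.random.default_rng(seed)
    # Uniform angle on the upper semicircle, projected radially onto the line z = z_max.
    theta = rng.uniform(-np.pi / 2, np.pi / 2, size=num_samples)
    x = z_max * np.tan(theta)
    # Geometric grid in z, so steps shrink as the samples approach the data.
    zs = z_max * (z_min / z_max) ** (np.arange(steps + 1) / steps)
    for z_cur, z_next in zip(zs[:-1], zs[1:]):
        e_x, e_z = field_2d(x, z_cur, data)
        # Euler step of dx/dz = E_x / E_z; z_next < z_cur, so we move against the field.
        x = x + (e_x / e_z) * (z_next - z_cur)
    return x

# Toy data: two symmetric clusters around x = -1 and x = +1.
rng = np.random.default_rng(1)
half = 1.0 + 0.1 * rng.standard_normal(100)
data = np.concatenate([half, -half])
samples = sample_backward(data)
```

With data concentrated around two points, the returned samples should cluster around the same two points, up to a heavy-tailed blur that shrinks as `z_min` decreases.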
In image generation experiments, PFGM is currently the best-performing normalizing flow model on the standard CIFAR-10 dataset, achieving an FID score (a measure of image quality; lower is better) of 2.35. The researchers also demonstrated other uses of PFGM, such as computing image likelihoods, editing images, and scaling to high-resolution image datasets.
In addition, the researchers found that PFGM has three advantages over the recently popular diffusion models: (1) with the same network architecture, the samples generated by PFGM's ODE are of much better quality than those generated by the diffusion model's ODE; (2) at a generation quality comparable to that of the diffusion model's SDE (stochastic differential equation) sampler, PFGM's ODE is 10 to 20 times faster; (3) PFGM is more robust than the diffusion model on network architectures with weaker expressive power.
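FID, the score quoted above, is the Fréchet distance between Gaussians fitted to feature vectors of real and generated images. A minimal sketch of that distance is given below; in practice the features come from a pretrained Inception network, which is replaced here by arbitrary arrays, and the function name `frechet_distance` is hypothetical.

```python
import numpy as np

def frechet_distance(feat_a, feat_b):
    """Frechet distance between Gaussians fitted to two (num_samples, dim) feature arrays."""
    mu_a, mu_b = feat_a.mean(axis=0), feat_b.mean(axis=0)
    cov_a = np.cov(feat_a, rowvar=False)
    cov_b = np.cov(feat_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) is the sum of the square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative up to rounding error.
    eig = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

# Stand-in features: identical sets, and a copy shifted by a constant vector.
rng = np.random.default_rng(0)
features = rng.standard_normal((500, 4))
shift = np.array([1.0, 2.0, 0.0, 0.0])
same = frechet_distance(features, features)
shifted = frechet_distance(features, features + shift)
```

Identical feature sets give a distance of zero, and a pure shift of the mean contributes the squared length of the shift, which is why a lower FID indicates generated images whose feature statistics are closer to those of real images.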
Figure 1: Sample points move along the electric field lines. Top: the data distribution is heart-shaped; bottom: the data distribution is in the shape of the letters "PFGM".
Figure 2: Left: trajectories of the Poisson field in three dimensions; right: the forward and backward ODEs of PFGM applied to images.
Method Overview
Note that the process above embeds the N-dimensional data into an (N+1)-dimensional space by adding the extra dimension z. To distinguish the two, the researchers write $x$ for the N-dimensional data and $\tilde{x} = (x, z)$ for the (N+1)-dimensional augmented data. To obtain the high-dimensional electric field lines described above, the following Poisson equation needs to be solved:
$$\nabla^2 \varphi(\tilde{x}) = -\tilde{p}(\tilde{x}), \qquad \tilde{p}(\tilde{x}) = p(x)\,\delta(z)$$
where $\tilde{p}$, supported on the z=0 plane, is the data distribution to be generated, and $\varphi$ is the potential function that the researchers want to solve for. Since only the direction of the field lines is needed, the researchers derived the analytical form of the electric field (the negative gradient of the potential function):
$$E(\tilde{x}) = -\nabla \varphi(\tilde{x}) = \frac{1}{S_N(1)} \int \frac{\tilde{x} - \tilde{y}}{\|\tilde{x} - \tilde{y}\|^{N+1}}\, \tilde{p}(\tilde{y})\, d\tilde{y}$$
where $S_N(1)$ is the surface area of the unit N-sphere. The trajectory of a field line (see Figure 2) is described by the following ODE:
$$\frac{d\tilde{x}}{dt} = E(\tilde{x}) = -\nabla \varphi(\tilde{x})$$
The researchers proved in a theorem that this ODE defines a bijection between the uniform distribution on a high-dimensional hemisphere and the data distribution on the z=0 plane. This agrees with the intuition in Figures 1 and 2: the data distribution can be recovered by following the electric field lines.
Training of PFGM
Given a data set $\{x_i\}_{i=1}^{n}$ sampled from the data distribution $p(x)$, the researchers approximate the field of the data distribution by the field of the data set:
$$\hat{E}(\tilde{x}) = c(\tilde{x}) \sum_{i=1}^{n} \frac{\tilde{x} - \tilde{x}_i}{\|\tilde{x} - \tilde{x}_i\|^{N+1}}$$
where $\tilde{x}_i = (x_i, 0)$ and $c(\tilde{x}) = 1 / \sum_{i=1}^{n} \|\tilde{x} - \tilde{x}_i\|^{-(N+1)}$ is a normalizing factor for numerical stability. The learning target is the negative normalized field, $v(\tilde{x}) = -\sqrt{N}\, \hat{E}(\tilde{x}) / \|\hat{E}(\tilde{x})\|_2$. The study uses a perturbation function to select points in the space, and a squared loss lets the neural network learn the normalized field $v$ at those points. The full algorithm is given in the paper.
Sampling of PFGM
Once the normalized field $v(\tilde{x})$ has been learned, the data distribution can be sampled through the following ODE:
$$d(x, z) = \left( v(\tilde{x})_x\, v(\tilde{x})_z^{-1},\ 1 \right) dz$$
This ODE moves a sample from the large sphere along the field lines to the z=0 plane by gradually decreasing z.
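The training target can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: `perturb` is a simplified, hypothetical stand-in for the paper's perturbation function, `predict` is any callable playing the role of the neural network, and the target is assumed to be the negative normalized field of the batch, scaled to norm sqrt(N).

```python
import numpy as np

def perturb(batch, rng, sigma=0.01, tau=0.03, m_max=200):
    """Hypothetical perturbation: lift each data point off z = 0 at a random scale."""
    n, N = batch.shape
    scale = sigma * (1.0 + tau) ** rng.uniform(0, m_max, size=(n, 1))
    x = batch + scale * rng.standard_normal((n, N))
    z = scale * np.abs(rng.standard_normal((n, 1)))   # keep z >= 0
    return np.concatenate([x, z], axis=1)             # (n, N+1)

def normalized_field_target(y_tilde, batch):
    """Negative normalized empirical field at y_tilde (M, N+1) for a batch (n, N)."""
    n, N = batch.shape
    charges = np.concatenate([batch, np.zeros((n, 1))], axis=1)   # data lives on z = 0
    diff = y_tilde[:, None, :] - charges[None, :, :]              # (M, n, N+1)
    dist = np.linalg.norm(diff, axis=-1, keepdims=True)
    # Weights 1 / dist^(N+1), normalized per query point in log-space for stability.
    log_w = -(N + 1) * np.log(dist)
    w = np.exp(log_w - log_w.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)
    field = (w * diff).sum(axis=1)                                # (M, N+1)
    # Assumed convention: negate the field and rescale it to norm sqrt(N).
    return -np.sqrt(N) * field / np.linalg.norm(field, axis=-1, keepdims=True)

def squared_loss(predict, batch, rng):
    """Mean squared error between predict(y~) and the normalized field target."""
    y_tilde = perturb(batch, rng)
    target = normalized_field_target(y_tilde, batch)
    return float(np.mean(np.sum((predict(y_tilde) - target) ** 2, axis=-1)))

rng = np.random.default_rng(0)
batch = rng.standard_normal((16, 3))
points = perturb(batch, rng)
target = normalized_field_target(points, batch)
# A predictor that returns the exact target has zero loss.
perfect_loss = squared_loss(lambda y: normalized_field_target(y, batch), batch, rng)
```

Normalizing the target keeps its magnitude constant across the space, so the network only has to learn the direction of the field; the z component of the target is negative above the plane, which is what lets a sampler decrease z toward the data.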
In addition, the study proposes projecting the uniform distribution on the large sphere onto a z-plane to simplify the ODE simulation, and further accelerates sampling through a change of variables. See Section 3.3 of the paper for the specific steps.
Experimental results
In Table 1, the study evaluates different models on the standard CIFAR-10 dataset. On this dataset, PFGM is the best-performing invertible normalizing flow model, achieving an FID score of 2.35. PFGM also outperforms diffusion models that use the same network architectures (DDPM++/DDPM++ deep). The researchers further observed that, at a generation quality comparable to that of the diffusion model's SDE (stochastic differential equation) sampler, PFGM is 10 to 20 times faster, striking a better balance between generation quality and speed. In addition, they found that PFGM is more robust than diffusion models on less expressive network architectures, and that it still outperforms diffusion models under the same conditions on higher-dimensional datasets. See the experimental section of the paper for details. Figure 3 visualizes the process by which PFGM generates images.
Table 1: Sample quality (FID, Inception) and number of sampling steps (NFE) on CIFAR-10
Figure 3: Sampling process of PFGM on CIFAR-10, CelebA 64x64, and LSUN bedroom 256x256
This study proposes PFGM, a generative model based on the Poisson equation. The model predicts the normalized electric field in an extended (N+1)-dimensional space, and samples are generated by solving the ODE that corresponds to the field lines. In experiments, the model is currently the best normalizing flow model, and it achieves better generation quality and faster sampling than diffusion models on the same network architecture. The sampling process of PFGM is more robust to noise, and the model also scales to higher-dimensional datasets.
The researchers expect PFGM to perform well in other application areas too, such as molecule generation and 3D data generation.