
Erase blemishes and wrinkles with one click: in-depth interpretation of DAMO Academy’s high-definition portrait skin beauty model ABPN

PHPz
2023-04-12 12:25:03

With the rapid growth of the digital cultural industry, artificial intelligence is now widely used in image editing and beautification. Portrait skin beautification is one of the most widely used and most in-demand of these technologies. Traditional beauty algorithms rely on filter-based image editing to automate skin smoothing and blemish removal, and are widely deployed in social networking, live streaming, and similar scenarios.

However, in professional photography, where the bar is high and the requirements on image resolution and quality are strict, manual retouchers remain the main workforce for portrait retouching, carrying out a series of tasks that includes skin evening, blemish removal, and whitening. A professional retoucher typically spends 1-2 minutes on skin beautification for one high-definition portrait, and longer in fields such as advertising, film, and television, which demand higher precision.

Compared with skin smoothing in interactive entertainment scenarios, advertising-grade and studio-grade skin beautification places higher demands on the algorithm. First, blemishes come in many types, including acne, acne marks, freckles, and uneven skin tone, and the algorithm must handle each of them adaptively. Second, while removing blemishes it must preserve the skin's texture and feel as far as possible in order to achieve high-precision retouching. Last but not least, as photographic equipment keeps improving, the resolutions common in professional photography have reached 4K or even 8K, which places stringent requirements on the algorithm's processing efficiency.

Therefore, aiming at professional-grade intelligent skin beautification, we developed ABPN, an ultra-fine local retouching algorithm for high-definition images. It has achieved very good results, and has been put to practical use, in skin beautification and in clothing wrinkle removal.

  • Paper: https://openaccess.thecvf.com/content/CVPR2022/papers/Lei_ABPN_Adaptive_Blend_Pyramid_Network_for_Real-Time_Local_Retouching_of_CVPR_2022_paper.pdf
  • Model & code: https://www.modelscope.cn/models/damo/cv_unet_skin-retouching/summary

Related work

3.1 Traditional beauty algorithm

The core of a traditional beauty algorithm is to smooth the pixels in the skin area and make flaws less conspicuous, so that the skin looks smoother. Generally speaking, existing beautification algorithms consist of three steps: 1) image filtering, 2) image fusion, and 3) sharpening. The overall process is as follows:

[Figure: overall pipeline of the traditional beauty algorithm]

To smooth the skin area while retaining the edges in the image, the traditional beauty algorithm first processes the image with an edge-preserving filter (such as a bilateral filter or a guided filter). Unlike the common mean and Gaussian filters, an edge-preserving filter accounts for how pixel values vary across regions: it weights pixels near edges, where values change sharply, differently from pixels in flat interior regions, where values change gently, and thereby preserves image edges. Then, so that the background is not affected, a segmentation or detection algorithm is usually used to locate the skin area and guide the fusion of the original image with the smoothed image. Finally, sharpening further enhances edge prominence and perceived clarity. The following picture shows the effect of a current traditional beauty algorithm:

[Figure: result of a traditional beauty algorithm]

The original image comes from unsplash [31]

Judging from the result, the traditional beauty algorithm has two major problems: 1) its treatment of blemishes is non-adaptive, so it cannot handle different blemish types well; 2) smoothing destroys the skin's texture and feel. These problems are particularly noticeable in high-definition images.
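The three steps of the traditional pipeline (filtering, fusion, sharpening) can be sketched roughly as follows. This is a minimal NumPy illustration on a single-channel float image, not the implementation of any particular product: the function names, the default parameter values, and the use of a box blur for unsharp masking are all assumptions made for the example, and the skin mask is taken as given rather than predicted.

```python
import numpy as np


def bilateral_filter(img, radius=2, sigma_space=2.0, sigma_range=0.1):
    """Edge-preserving smoothing of a 2-D float image with values in [0, 1]."""
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    acc = np.zeros_like(img)
    norm = np.zeros_like(img)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            neighbor = padded[radius + dy:radius + dy + h,
                              radius + dx:radius + dx + w]
            # Spatial weight: nearby pixels count more.
            w_space = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_space ** 2))
            # Range weight: pixels with a similar value count more, so the
            # average does not reach across strong edges.
            w_range = np.exp(-((neighbor - img) ** 2) / (2.0 * sigma_range ** 2))
            acc += w_space * w_range * neighbor
            norm += w_space * w_range
    return acc / norm


def box_blur(img, radius=1):
    """Plain mean filter, used here only for unsharp masking."""
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    acc = np.zeros_like(img)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / (2 * radius + 1) ** 2


def traditional_beauty(img, skin_mask, sharpen_amount=0.5):
    """skin_mask is 1 on skin and 0 elsewhere (assumed to be given)."""
    smoothed = bilateral_filter(img)                                # 1) filtering
    fused = skin_mask * smoothed + (1.0 - skin_mask) * img          # 2) fusion
    sharpened = fused + sharpen_amount * (fused - box_blur(fused))  # 3) sharpening
    return np.clip(sharpened, 0.0, 1.0)
```

Because the range weight depends only on pixel differences, every small deviation on the skin is flattened in the same way, whether it is a blemish or genuine skin texture, which is the non-adaptive behavior described above.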

3.2 Existing deep learning algorithm

To retouch different skin areas and different flaws adaptively, data-driven deep learning algorithms appear to be a better solution. Considering their relevance to the task, we discuss and compare how well four existing families of methods apply to skin beautification: Image-to-Image Translation, Photo Retouching, Image Inpainting, and High-resolution Image Editing.

  • 3.2.1 Image-to-Image Translation

The image-to-image translation task was first formulated by pix2pix [1], which casts a large number of computer vision tasks as pixel-to-pixel prediction and proposes a general framework based on conditional generative adversarial networks to solve them. Building on pix2pix [1], various methods have been proposed for image translation, including methods that use paired images [2, 3, 4, 5] and methods that use unpaired images [6, 7, 8, 9]. Some work focuses on specific image translation tasks (such as semantic image synthesis [2, 3, 5] and style transfer [9, 10, 11, 12]) and has achieved impressive results. However, most of these methods focus on the overall image-to-image transformation and pay little attention to local areas, which limits their performance on skin beautification.

  • 3.2.2 Photo Retouching

Benefiting from the development of deep convolutional neural networks, learning-based methods [13, 14, 15, 16] have shown excellent results in image retouching in recent years. However, like most image translation methods, existing retouching algorithms mainly manipulate global properties of the image, such as color, lighting, and exposure, and pay little attention to local areas. Skin retouching is precisely a local retouching task (Local Photo Retouching): it requires retouching the target area while keeping the background unchanged.

  • 3.2.3 Image Inpainting

Image inpainting (image completion) algorithms are often used to fill in the missing parts of an image, which is very similar to the skin beautification task. With their powerful feature learning capabilities, methods based on deep generative networks [17, 18, 19, 20] have made great progress on inpainting in recent years. However, inpainting methods take a mask of the target area as input, and in skin beautification and other local retouching tasks, obtaining an accurate mask of the target area is itself very challenging. Therefore, most image inpainting methods cannot be used directly for skin beautification. In recent years, some blind image inpainting methods [21, 22, 23] have removed the dependence on masks and detect and complete the target area automatically. Nevertheless, like most other inpainting methods, they have two problems: a) they do not fully exploit the texture and semantic information of the target area, and b) they are computationally expensive and hard to apply to ultra-high-resolution images.

  • 3.2.4 High-resolution Image Editing

To edit high-resolution images, methods such as [15, 24, 25, 26] reduce the space and time cost by shifting the main computational load from the high-resolution image to a low-resolution one. Although they are very efficient, most of these methods are unsuitable for local retouching tasks such as skin beautification because they pay little attention to local areas.

In summary, most existing deep learning methods are difficult to apply directly to skin beautification, mainly because they either lack attention to local areas or are too computationally expensive for high-resolution images.

Local retouching framework based on adaptive blending pyramid

The essence of skin beautification is image editing. Unlike most other image translation tasks, this editing is local. Similar tasks include removing wrinkles from clothing and retouching product photos. Such local image retouching tasks have much in common, and we summarize their three main challenges: 1) accurately locating the target area; 2) local generation (modification) with global consistency and detail fidelity; 3) processing ultra-high-resolution images. To this end, we propose a local retouching framework based on an adaptive blend pyramid (ABPN: Adaptive Blend Pyramid Network for Real-Time Local Retouching of Ultra High-Resolution Photo, CVPR 2022 [27]) for fine-grained local retouching of ultra-high-resolution images. Its implementation details are introduced below.

4.1 Overall network structure

As shown in the overall structure figure of the paper [27], the network mainly consists of two parts: the context-aware local retouching layer (LRL) and the adaptive blend pyramid layer (BPL). The LRL locally retouches the downsampled low-resolution image and produces a low-resolution result, taking full account of global context and local texture. The BPL then progressively upscales the low-resolution result from the LRL to a high-resolution result. Within it, we designed an adaptive blending module (ABM) and its reverse module (R-ABM); using the intermediate blending layer B_i, they perform adaptive conversion between the original image and the result image and upscale it, showing strong scalability and detail fidelity. We ran extensive experiments on two datasets, facial retouching and clothing retouching, and the results show that our method is significantly ahead of existing methods in both quality and efficiency. Notably, our model achieves real-time inference on 4K ultra-high-resolution images on a single P100 GPU. Below, we introduce the LRL, the BPL, and the training loss in turn.
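To make the efficiency argument concrete, the following sketch counts pixels per level of an image pyramid built from a 4K frame. The number of levels (3) and the downsampling factor (2 per level) are assumptions chosen for illustration; the article does not state the values used in ABPN.

```python
def pyramid_shapes(width, height, levels):
    """Return the (width, height) of each pyramid level, halving each time."""
    shapes = [(width, height)]
    for _ in range(levels):
        width, height = width // 2, height // 2
        shapes.append((width, height))
    return shapes


# Hypothetical setting: a 3840x2160 (4K) input and three halvings.
shapes = pyramid_shapes(3840, 2160, levels=3)
full_pixels = shapes[0][0] * shapes[0][1]
for level, (w, h) in enumerate(shapes):
    ratio = full_pixels // (w * h)
    print(f"level {level}: {w}x{h} = {w * h} pixels (1/{ratio} of full resolution)")
```

Under these assumptions the lowest level is 480x270, which is 1/64 of the pixels of the 4K input, so a network that does its heavy computation there touches far fewer pixels than one that works at full resolution.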

4.2 Context-aware Local Retouching Layer

In the LRL, we want to solve two of the challenges listed above: accurate localization of the target area, and local generation with global consistency. As shown in Figure 3, the LRL consists of a shared encoder, a mask prediction branch (MPB), and a local retouching branch (LRB).

[Figure 3: structure of the LRL]

[Figure 4: structure of the LAM]

In general, we use a multi-task structure that explicitly predicts the target area and uses that prediction to guide local retouching. The shared encoder is optimized through joint training of the two branches, which improves the retouching branch's grasp of global semantics and its local perception of the target. Most image translation methods use a traditional encoder-decoder structure to perform local editing directly, without decoupling target localization from generation, which limits the generation quality (the capacity of the network is limited). In contrast, a multi-branch structure is more conducive to decoupling the tasks and letting them benefit each other. In the LRB, we designed the LAM (Figure 4), which uses a spatial attention mechanism and a feature attention mechanism simultaneously to fuse features fully and capture the semantics and texture of the target area. The ablation experiment (Figure 6) demonstrates the effectiveness of each module.
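The idea of combining spatial attention with feature (channel) attention can be illustrated with a parameter-free sketch. This is not the LAM from the paper: real attention modules learn their gates with convolutional or fully connected layers, whereas the hypothetical `channel_gate` and `spatial_gate` below derive them from simple means, purely to show along which axes the reweighting happens.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_gate(feat):
    """One weight in (0, 1) per channel, from that channel's spatial mean."""
    return sigmoid(feat.mean(axis=(1, 2), keepdims=True))   # shape (C, 1, 1)


def spatial_gate(feat):
    """One weight in (0, 1) per location, from the mean over channels."""
    return sigmoid(feat.mean(axis=0, keepdims=True))        # shape (1, H, W)


def attend(feat):
    """Reweight a (C, H, W) feature map along both the channel and spatial axes."""
    return feat * channel_gate(feat) * spatial_gate(feat)


rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 6, 6))   # toy feature map: 4 channels, 6x6
out = attend(feat)
```

The channel gate decides which feature types matter, and the spatial gate decides where they matter; applying both keeps the output the same shape as the input while emphasizing some channels and some locations over others.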

4.3 Adaptive Blend Pyramid Layer

The LRL performs local retouching at low resolution. How can the retouching result be extended to high resolution while improving its detail fidelity? That is the problem this part addresses.

  • 4.3.1 Adaptive Blend Module

In image editing, a blend layer is often mixed with the image (the base layer) under different blend modes to carry out editing tasks such as contrast enhancement, darkening, and lightening. Usually, given an image I and a blend layer B, we can blend the two layers to obtain the editing result R, as follows:

$R = f(I, B)$

where f is a fixed pixel-wise mapping function, usually determined by the blend mode. Because its transformation capability is limited, a specific blend mode with a fixed function f is hard to apply directly to a variety of editing tasks. To better fit the data distribution and the transformation patterns of different tasks, we drew on the soft light mode commonly used in image editing and designed an adaptive blending module (ABM), as follows:

[Equation: definition of the ABM]

Here ⊙ denotes the Hadamard product; the two learnable parameters are shared by all ABM modules in the network and by the R-ABM module described below; and 1 denotes a constant matrix whose entries are all 1.
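For readers unfamiliar with soft light blending, here is a small sketch of one common soft-light formula (the Pegtop variant) together with a parameterized generalization. The generalization, including the function name `adaptive_blend` and the scalar parameters `a` and `b`, is an assumption made for illustration and should not be read as the ABM formula from the paper; it only shows how a fixed blend mode can be turned into a family of blends with tunable parameters.

```python
import numpy as np


def soft_light(base, blend):
    """Pegtop soft-light blend of two float arrays with values in [0, 1]."""
    return (1.0 - 2.0 * blend) * base * base + 2.0 * blend * base


def adaptive_blend(base, blend, a=2.0, b=-1.0):
    """Hypothetical parameterized variant; a=2, b=-1 reproduces soft_light."""
    # Elementwise products here play the role of the Hadamard product.
    return base + (a * blend + b) * (base - base * base)


rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(4, 4))
layer = rng.uniform(0.0, 1.0, size=(4, 4))

same_as_soft_light = np.allclose(adaptive_blend(image, layer), soft_light(image, layer))
neutral = np.allclose(soft_light(image, np.full_like(image, 0.5)), image)
```

In this formulation a blend value of 0.5 leaves the image unchanged, values above 0.5 lighten it, and values below 0.5 darken it. Replacing the fixed constants with learnable parameters lets training choose the blend behavior instead of fixing it in advance.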

  • 4.3.2 Reverse Adaptive Blend Module

To obtain the blending layer B, we solve formula 3 for B and construct a reverse adaptive blending module (R-ABM), as follows:

[Equation: definition of the R-ABM]

In general, with the blending layer as an intermediary, the ABM and the R-ABM perform adaptive conversion between the image I and the result R. Instead of upscaling the low-resolution result directly with convolution, upsampling, and similar operations (as Pix2PixHD does), we use the blending layer for this purpose, which has two advantages: 1) in local retouching tasks, the blending layer mainly records the local transformation between the two images, so it contains less irrelevant information and is easier for a lightweight network to optimize; 2) the blending layer acts directly on the original image to produce the final result, which makes full use of the information in the image itself and thereby achieves high detail fidelity.
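The role of the blending layer as an intermediary can be sketched end to end on a toy image. Everything below is a simplified assumption rather than the paper's method: `blend` is a soft-light-style function standing in for the ABM, `reverse_blend` is its algebraic inverse standing in for the R-ABM, the constants `A` and `B_OFFSET` stand in for the learnable parameters, the "retouching network" is replaced by brightening one corner, and the blending layer is upscaled by plain nearest-neighbor repetition with no refinement.

```python
import numpy as np

A, B_OFFSET = 2.0, -1.0   # stand-ins for the learnable parameters (assumed values)
EPS = 1e-6                # guards against division by zero at pure black/white


def blend(image, layer):
    """Soft-light-style blend used here in place of the ABM (an assumption)."""
    return image + (A * layer + B_OFFSET) * (image - image * image)


def reverse_blend(image, result):
    """Solve blend(image, layer) == result for layer (stand-in for the R-ABM)."""
    denom = np.maximum(image - image * image, EPS)
    return ((result - image) / denom - B_OFFSET) / A


def downsample2x(img):
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample2x(img):
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)


rng = np.random.default_rng(0)
image_hr = rng.uniform(0.2, 0.8, size=(8, 8))   # toy "high-resolution" input
image_lr = downsample2x(image_hr)

# Stand-in for the low-resolution retouching result: brighten one corner only.
result_lr = image_lr.copy()
result_lr[:2, :2] += 0.05

layer_lr = reverse_blend(image_lr, result_lr)   # low-resolution blending layer
layer_hr = upsample2x(layer_lr)                 # naive upscaling of the layer
result_hr = blend(image_hr, layer_hr)           # applied to the original pixels
```

Outside the retouched corner the recovered layer is the neutral value, so the high-resolution output there is identical to the input, and inside the corner the edit is applied on top of the original high-resolution pixels instead of replacing them with upsampled ones. This is the sense in which the blending layer carries only the local transformation while the detail comes from the original image.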


In fact, there are many alternative functions and strategies for the adaptive blending module. The paper describes the design motivation and compares it with other options in detail, so we do not elaborate here. Figure 7 shows the ablation comparison between our method and other blending methods.

  • 4.3.3 Refining Module

[Figure: refining module]

4.4 Loss function

[Figure: loss function]

Experimental results

5.1 Comparison with SOTA method

[Figure: comparison with SOTA methods]

5.2 Ablation experiment

[Figure: ablation results]

5.3 Running speed and memory consumption

[Figure: running speed and memory consumption]

Effect display

Skin beauty effect display:

[Figure: skin beautification result]

Original image from unsplash [31]

[Figure: skin beautification result]

The original image comes from the face dataset FFHQ [32]

As can be seen, compared with the traditional beauty algorithm, our local retouching framework removes skin blemishes while fully retaining the skin's texture and feel, achieving fine-grained, intelligent skin optimization. We further extended this method to clothing wrinkle removal and also achieved good results, as follows:

[Figure: clothing wrinkle removal results]


Statement:
This article is reproduced from 51cto.com.