Subvert three concepts! Google's latest research: Is it more accurate to calculate 'similarity' with a poor-performance model?

WBOY

Apr 12, 2023, 10:25 PM

Calculating the similarity between images is an open problem in computer vision.

Now that image generation is popular all over the world, how to define "similarity" is also a key issue in evaluating the realism of generated images.

There are relatively direct ways to calculate image similarity, such as low-level, pixel-based metrics (for example FSIM and SSIM), but the similarity they report is far from what the human eye perceives.

After the rise of deep learning, some researchers found that the intermediate representations of neural network classifiers trained on ImageNet, such as AlexNet, VGG and SqueezeNet, can be used to compute perceptual similarity.

In other words, distances between embeddings are closer than distances between pixels to how people perceive the similarity of images.

Of course, this is just a hypothesis.

Recently, Google published a paper that specifically studies whether better ImageNet classifiers are also better at evaluating perceptual similarity.

Paper link: https://openreview.net/pdf?id=qrGKGZZvH0

Earlier work already studied perceptual scores on the BAPPS dataset, released in 2018, using first-generation ImageNet classifiers. To further evaluate the correlation between accuracy and perceptual score, as well as the impact of various hyperparameters, the paper adds results for the more recent ViT models.
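The article does not spell out how a perceptual score is computed. A minimal sketch, assuming the two-alternative forced choice (2AFC) agreement commonly used with BAPPS: for each triplet of a reference and two distorted images, the model earns the fraction of human raters who agree with its choice of the closer image. The function names and the toy triplets below are hypothetical and are not taken from the paper.

```python
def two_afc_score(d0, d1, human_pref_1):
    """Agreement between a model and human raters on one triplet.

    d0, d1: model distances from the reference to image 0 and image 1.
    human_pref_1: fraction of raters who judged image 1 closer to the reference.
    (Assumed 2AFC formulation; the names are illustrative.)
    """
    if d0 < d1:          # model says image 0 is closer
        return 1.0 - human_pref_1
    if d1 < d0:          # model says image 1 is closer
        return human_pref_1
    return 0.5           # tie: credit half


def perceptual_score(triplets):
    """Mean 2AFC agreement over (d0, d1, human_pref_1) triplets."""
    return sum(two_afc_score(*t) for t in triplets) / len(triplets)


# Toy triplets with made-up numbers, for illustration only.
toy_triplets = [(0.1, 0.4, 0.2), (0.5, 0.3, 0.8), (0.2, 0.2, 0.6)]
print(perceptual_score(toy_triplets))
```

Under this formulation, a model that always sides with the majority of raters scores highest, and a model that outputs identical distances for every pair scores 0.5.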

The higher the accuracy, the worse the perceived similarity?

As we all know, the features learned through training on ImageNet can be well transferred to many downstream tasks and improve the performance of downstream tasks, which also makes pre-training on ImageNet a standard operation.

Additionally, achieving higher accuracy on ImageNet often means better performance on a diverse set of downstream tasks, such as robustness to corrupted images, generalization to out-of-distribution data, and transfer learning to smaller classification datasets.

But in terms of perceptual similarity calculation, everything seems to be reversed.

Models that achieve high accuracy on ImageNet have worse perceptual scores, while those with "mid-range" scores perform best on the perceptual similarity task.

ImageNet 64 × 64 validation accuracy (x-axis) versus perceptual score on the 64 × 64 BAPPS dataset (y-axis). Each blue dot represents an ImageNet classifier.

It can be seen that better ImageNet classifiers achieve better perceptual scores up to a point, but beyond a certain threshold, increasing accuracy lowers the perceptual score. Classifiers with moderate accuracy (20.0-40.0) obtain the best perceptual scores.

The article also studies the impact of neural network hyperparameters on perceptual scores, such as width, depth, number of training steps, weight decay, label smoothing and dropout.

For each hyperparameter there is an optimal accuracy, up to which increasing accuracy improves the perceptual score, but this optimum is quite low and is reached very early in the hyperparameter sweep.

Beyond this point, improvements in classifier accuracy lead to worse perceptual scores.

As an example, the article gives the changes in perceptual scores relative to two hyperparameters: training steps in ResNets and width in ViTs.

Early-stopped ResNets achieved the best perceptual scores at each of the depth settings studied: 6, 50 and 200.

The perceptual scores of ResNet-50 and ResNet-200 peak within the first few epochs of training, but after the peak, the perceptual scores of the better-performing classifiers drop more sharply.

The results show that training ResNets with a learning rate schedule improves accuracy as the number of steps increases. Likewise, after the peak, the models show a progressive decrease in perceptual score that mirrors this progressive increase in accuracy.

A ViT consists of a stack of Transformer blocks applied to the input image. The width of a ViT model is the number of output neurons of a single Transformer block, and increasing the width is an effective way to improve accuracy.

The researchers varied the width of two models, B/8 (i.e. the Base ViT model with patch size 8) and L/4 (i.e. the Large ViT model with patch size 4), and evaluated accuracy and perceptual scores.

The results are again similar to those observed for early-stopped ResNets: narrower ViTs with lower accuracy perform better than those with the default width.

The optimal widths of ViT-B/8 and ViT-L/4 are 6% and 12% of their default widths, respectively. The paper also provides a more detailed list of experiments on other hyperparameters, such as width, depth, number of training steps, weight decay, label smoothing and dropout, across ResNets and ViTs.

So if you want to improve perceptual similarity, the strategy is simple: reduce the accuracy appropriately.

Improving the perceptual score by scaling down ImageNet models. Each value in the table is the improvement obtained by scaling down a model along a given hyperparameter, relative to the model with default hyperparameters.

Based on the above conclusion, the paper proposes a simple strategy to improve the perceptual score of an architecture: shrink the model to reduce its accuracy until the optimal perceptual score is reached.

The experimental results also show the perceptual score improvement obtained by scaling down each model along each hyperparameter. Early stopping yields the highest score improvement across all architectures except ViT-L/4, and it is the most effective strategy that does not require a time-consuming grid search.
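The early-stopping strategy amounts to keeping the checkpoint with the best perceptual score rather than the best accuracy. The sketch below assumes a per-epoch evaluation log is available; the `pick_checkpoint` helper and the numbers in `toy_history` are hypothetical placeholders, not results from the paper.

```python
def pick_checkpoint(history):
    """Return the (epoch, accuracy, perceptual_score) record with the
    highest perceptual score, regardless of accuracy."""
    if not history:
        raise ValueError("history must contain at least one record")
    return max(history, key=lambda record: record[2])


# Made-up log that only mimics the inverted U-shape described in the article.
toy_history = [
    (1, 0.20, 0.60),
    (2, 0.35, 0.66),
    (3, 0.50, 0.64),
    (4, 0.60, 0.61),
]

best_epoch, best_acc, best_score = pick_checkpoint(toy_history)
print(best_epoch, best_acc, best_score)
```

In this toy log the selected checkpoint is not the most accurate one, which is exactly the trade-off the strategy accepts.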

Global perceptual function

In previous work, the perceptual similarity function was computed using Euclidean distances across the spatial dimensions of the image.

This approach assumes a direct correspondence between pixels, but this correspondence may not hold for warped, translated, or rotated images.
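To make the pixel-correspondence assumption concrete, here is a minimal sketch of a spatial distance in plain Python. It assumes a feature map is given as a list of spatial positions, each holding a list of channel activations; the unit normalization per position and the function names are illustrative choices, not the paper's exact definition.

```python
import math


def unit_normalize(vec, eps=1e-10):
    """Scale a channel vector to (approximately) unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def spatial_distance(feats_a, feats_b):
    """Mean squared Euclidean distance between channel vectors at the
    SAME spatial position in both feature maps."""
    if len(feats_a) != len(feats_b):
        raise ValueError("feature maps must have the same number of positions")
    total = 0.0
    for vec_a, vec_b in zip(feats_a, feats_b):
        na, nb = unit_normalize(vec_a), unit_normalize(vec_b)
        total += sum((x - y) ** 2 for x, y in zip(na, nb))
    return total / len(feats_a)


# Two positions, two channels; `shifted` is the same content moved by one position.
feats = [[1.0, 0.0], [0.0, 1.0]]
shifted = [[0.0, 1.0], [1.0, 0.0]]

print(spatial_distance(feats, feats))    # identical maps
print(spatial_distance(feats, shifted))  # same content, translated
```

Because positions are compared one-to-one, the translated copy is scored as very different from the original even though it contains the same features.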

In this paper, the researchers adopted two perceptual functions that rely on a global representation of the image: the style loss function from neural style transfer, which captures the style similarity between two images, and a normalized average pooling distance function.

The style loss function compares the inter-channel cross-correlation matrices of the two images, while the average pooling function compares their spatially averaged global representations.
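Both global functions can be sketched in a few lines of plain Python. As before, a feature map is assumed to be a list of spatial positions, each a list of channel activations; the exact normalization and the function names are assumptions made for illustration rather than the paper's definitions.

```python
import math


def gram_matrix(feats):
    """C x C inter-channel cross-correlation, averaged over positions."""
    n, c = len(feats), len(feats[0])
    return [[sum(p[i] * p[j] for p in feats) / n for j in range(c)]
            for i in range(c)]


def style_distance(feats_a, feats_b):
    """Squared difference between the two cross-correlation matrices."""
    ga, gb = gram_matrix(feats_a), gram_matrix(feats_b)
    return sum((x - y) ** 2
               for row_a, row_b in zip(ga, gb)
               for x, y in zip(row_a, row_b))


def mean_pool_distance(feats_a, feats_b, eps=1e-10):
    """Squared distance between unit-normalized, spatially averaged features."""
    def pooled(feats):
        n, c = len(feats), len(feats[0])
        mean = [sum(p[k] for p in feats) / n for k in range(c)]
        norm = math.sqrt(sum(m * m for m in mean))
        return [m / (norm + eps) for m in mean]

    pa, pb = pooled(feats_a), pooled(feats_b)
    return sum((x - y) ** 2 for x, y in zip(pa, pb))


# Same content at swapped positions, and a map with different content.
feats = [[1.0, 0.0], [0.0, 1.0]]
shifted = [[0.0, 1.0], [1.0, 0.0]]
other = [[1.0, 0.0], [1.0, 0.0]]

print(style_distance(feats, shifted), mean_pool_distance(feats, shifted))
print(style_distance(feats, other), mean_pool_distance(feats, other))
```

Since both functions aggregate over all positions before comparing, swapping positions leaves the distance at zero, while a map with genuinely different content still yields a positive distance.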

The global perceptual functions consistently improve perceptual scores, both for networks trained with default hyperparameters and for ResNet-200 as a function of training epochs.

The researchers also explore several hypotheses to explain the relationship between accuracy and perceptual score, and derive some additional insights.

For example, even in models without the commonly used skip connections, accuracy is inversely related to the perceptual score, and layers closer to the output have, on average, lower perceptual scores than layers closer to the input.

They further examine distortion sensitivity, ImageNet category granularity and spatial frequency sensitivity.

In short, this paper explores whether improving classification accuracy produces better perceptual metrics. It studies the relationship between accuracy and perceptual scores on ResNets and ViTs under different hyperparameters, and finds an inverted U-shaped relationship: accuracy and perceptual score are positively related only up to a certain point.

Finally, the article discusses the relationship between accuracy and perceptual score in detail, covering skip connections, global similarity functions, distortion sensitivity, layer-wise perceptual scores, spatial frequency sensitivity and ImageNet category granularity.

While the exact explanation for the trade-off between ImageNet accuracy and perceptual similarity remains a mystery, this paper is a first step forward.

Statement

This article is reproduced from 51CTO.COM. If there is any infringement, please contact admin@php.cn to have it removed.
