WBOY

Jul 12, 2016 am 09:01 AM

Weird precision diff tracking

1. Problems found in Query-diff test

Query-diff is a commonly used testing method on the retrieval side. The idea is to send the same set of retrieval requests to both the baseline version and the version under test of a system or module. Typically the two versions differ only slightly (program functionality, configuration and so on). After the requests are sent, the results returned by the two versions are compared to verify whether the difference affects the final calculation result.
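The comparison step can be sketched in C as follows. This is only an illustration under assumptions: the names `DiffStats` and `query_diff` and the `tolerance` parameter are hypothetical and do not come from the tested system, and the sketch assumes each version's results have already been collected into an array of Q values in the same query order.

```c
#include <assert.h>
#include <stddef.h>

/* Summary of one query-diff run over n paired results (hypothetical type). */
typedef struct {
    size_t diff_count;    /* number of queries whose Q values differ */
    double diff_ratio;    /* diff_count / n */
    float  max_abs_diff;  /* largest |baseline - candidate| observed */
} DiffStats;

/* Compare the Q values returned by the baseline version and the version
 * under test for the same n queries. A pair counts as a diff when the
 * absolute difference exceeds `tolerance` (0.0f means exact comparison). */
DiffStats query_diff(const float *baseline, const float *candidate,
                     size_t n, float tolerance)
{
    DiffStats s = {0, 0.0, 0.0f};
    assert((baseline != NULL && candidate != NULL) || n == 0);
    for (size_t i = 0; i < n; ++i) {
        float d = baseline[i] - candidate[i];
        if (d < 0.0f) d = -d;                 /* absolute value */
        if (d > tolerance) ++s.diff_count;
        if (d > s.max_abs_diff) s.max_abs_diff = d;
    }
    if (n > 0) s.diff_ratio = (double)s.diff_count / (double)n;
    return s;
}
```

Reporting both the ratio and the maximum absolute diff mirrors the two figures quoted in this case: the proportion of differing results and the size of the largest difference.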

Module A, the module under test in this case, is written in C. Its core output is a single-precision floating-point number, denoted Q.

In the query-diff test after one upgrade of module A, a precision diff appeared in the Q value. About 1% of the results differed, and the largest diff was confined to the decimal places, even though this upgrade was expected to be diff-free.

2. In-depth investigation

When a diff occurs, the first thing to do is settle the direction of the investigation. If the cause is not obvious at a glance, use elimination: verify the suspects one by one, narrow the scope, and avoid wasting effort. Two major directions were listed: the environment or the program.

Look at the environment first:

• We carefully checked the configuration and vocabulary of the old and new environments on site. They matched expectations, which rules out the environment construction tools.

• Since this upgrade is forward compatible, we unified the configuration and vocabulary of the old and new environments and retested. The diff reproduced, which rules out configuration differences.

The environment seemed fine, so we went back to the verification process itself:

• Several rounds of tests gave the same verification result, which rules out a diff caused by random strategies.

• We printed debug logs and checked the intermediate results of every processing step. Everything matched until the last step, the calculation of the Q value, which is the only place where the diff appears. Thread-level dirty data, process-level cache dirty data, variable type conversion and other risk points were ruled out in turn.

• For final confirmation, we replaced the programs in both the old and new environments with the new version and retested. If the program were the cause, there should be no diff. However, the diff reappeared, even though random diffs had already been ruled out!

At this point the investigation hit a bottleneck: neither the environment nor the program seemed to be the cause.

Calming down and thinking again: the earlier investigation treated "environment" as just the configuration and vocabulary in use, and assumed that if those two are the same, the environment is the same. That view is one-sided. The environment also includes the compilation environment and the running environment, that is, the system and the hardware. This gave us new ideas for verification:

• Both the old and new versions of the program are built on the company's cloud compilation cluster, so there should be no problem. To avoid taking that for granted, we carefully checked the compilation parameters and recompiled both versions on the same local machine. The diff reproduced, which rules out compilation factors.

• We copied the old and new environments to the same machine and replayed the requests. The diff disappeared! The running environment was confirmed as the factor.

The running environment covers the operating system and the hardware. Striking while the iron was hot, we continued:

• The operating systems of the two machines where the diff appears are identical, both CentOS 4.3, which rules out the operating system.

• Differences in hard disk and memory models are unlikely to cause a diff, so we did not verify them at this stage.

• The machine hosting the new environment has a Xeon E5645 CPU, while the machine hosting the old environment has a Xeon E5-2620. Suspecting the difference in CPU model, we found another machine with the same CPU as the old environment, deployed the new environment on it and retested. The diff disappeared, which pinned the problem on the CPU.

3. Revealing the truth

While analyzing the two CPUs, and after quickly ruling out the number of cores, the maximum number of threads, and the L1, L2 and L3 caches, the differences in the instruction sets listed among the CPU features caught my attention.

Supplementary knowledge 1: The role of the CPU instruction set

An instruction set is the collection of operations built into the CPU that direct and optimize its work. With the right instruction set extensions, the CPU runs more efficiently. Explaining how these extensions optimize execution requires two terms: SISD (Single Instruction, Single Data) and SIMD (Single Instruction, Multiple Data).

Take an addition instruction as an example. On a SISD CPU, after the instruction is decoded, the execution unit accesses memory to fetch the first operand, then accesses memory again to fetch the second, and only then performs the addition. On a SIMD CPU, after the instruction is decoded, several execution units access memory at the same time and fetch all the operands at once. This makes SIMD particularly suitable for data-intensive computation.

The SSE family and AVX are the instruction set extensions used for floating-point operations, and AVX is one of the differences between the two CPUs, which makes it highly suspicious. The next step was to find evidence that the program is actually optimized with AVX.

However, the ASQ module contains no code that uses these instructions directly. The code that calculates the Q value calls an interface of the static library libA, but libA's own code does not use the instruction set either. libA is in turn built against the static library libB, so we traced the dependencies all the way down and found that the fourth layer of compile dependencies is libX, provided by IDL. Its code is confidential and could not be viewed.

So I had to ask the responsible RD. The RD said that libX does use SSE instruction optimization as well as MKL, the math library provided by Intel, but does not use AVX.

Was this another dead end? With a last bit of hope, I checked Intel's official introduction to MKL and made an unexpected discovery: MKL had introduced AVX optimization! [1]

One last step remained: confirming that AVX is the source of the diff. Further evidence soon turned up in Intel's product documentation [2].

The FMA instructions in AVX2 improve both the efficiency and the accuracy of floating-point operations in matrix multiplication, dot products, polynomial evaluation and similar workloads compared with earlier instruction sets, because FMA completes the multiplication and the accumulation in one operation with a single rounding. Posts by technical staff on the official forum support this as well [3].
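The effect of rounding once instead of twice can be reproduced in portable C. The sketch below is an illustration under assumptions, not Intel's implementation: `mul_then_add`, `fused_like` and `check_rounding_example` are hypothetical names, and the fused path is imitated by doing the arithmetic in double, which is exact for the hand-picked input used here (a product of two floats always fits in a double). The standard C99 function for a true fused multiply-add on floats is `fmaf` from `<math.h>`.

```c
#include <assert.h>

/* Multiply, round to float, then add and round again: two roundings,
 * as with a separate multiply instruction followed by an add. */
float mul_then_add(float a, float b, float c)
{
    /* volatile forces the product to be rounded to float and stops the
     * compiler from fusing the two operations. */
    volatile float product = a * b;
    return product + c;
}

/* Stand-in for a fused multiply-add: the product is kept unrounded in a
 * double, so for the input below only the final conversion rounds. */
float fused_like(float a, float b, float c)
{
    double wide = (double)a * (double)b + (double)c;
    return (float)wide;
}

/* Hand-picked input: a*a = 1 + 2^-11 + 2^-24 lies exactly halfway between
 * two floats, so rounding the product discards the 2^-24 term. */
void check_rounding_example(void)
{
    const float a = 1.0f + 0x1p-12f;
    const float c = -(1.0f + 0x1p-11f);
    assert(mul_then_add(a, a, c) == 0.0f);     /* low bit lost */
    assert(fused_like(a, a, c) == 0x1p-24f);   /* low bit kept */
}
```

The two results differ only in the last bit of the intermediate product, yet after the subtraction one path returns zero and the other a nonzero value. This is the kind of discrepancy that can grow when such results feed further calculations.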

Supplementary knowledge two: How floating-point numbers are stored in a computer

Both float and double follow the IEEE 754 specification for storage: float follows IEEE R32.24 and double follows R64.53.

Whether single or double precision, the stored value is divided into three parts:

1. Sign bit (Sign): 0 represents positive, 1 represents negative

2. Exponent bits (Exponent): store the exponent of the scientific notation, in biased (offset) form

3. Mantissa bits (Mantissa): store the significant digits of the number

The storage layout of each format is shown in the following table:

Format	Total length	Mantissa bits	Exponent bits	Sign bit
Single precision	32 bits	0-22	23-30	31
Double precision	64 bits	0-51	52-62	63
Extended double	80 bits	0-63	64-78	79
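The single-precision row of the table can be checked with a few lines of C. This is a minimal sketch assuming a 32-bit IEEE 754 `float`; the names `FloatFields` and `split_float` are illustrative only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three stored parts of a single-precision float. */
typedef struct {
    uint32_t sign;      /* bit 31 */
    uint32_t exponent;  /* bits 23-30, stored with a bias of 127 */
    uint32_t mantissa;  /* bits 0-22 */
} FloatFields;

FloatFields split_float(float value)
{
    uint32_t bits;
    FloatFields f;
    assert(sizeof bits == sizeof value);   /* assumes a 32-bit float */
    memcpy(&bits, &value, sizeof bits);    /* well-defined way to read the bit pattern */
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}
```

For example, -6.5 is -1.625 x 2^2, so `split_float(-6.5f)` yields sign 1, stored exponent 129 (2 plus the bias of 127) and mantissa 0x500000 (the fraction 0.625 scaled by 2^23).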

At the hardware level, the CPU's floating-point logic is implemented in the FPU (floating-point unit), whether the instructions are SSE or AVX. The FPU's default internal precision is 80 bits, while the float values that SSE and AVX output are narrower (32 bits in both cases). If the internal precision differs between two computations (even when both are wider than 32 bits), the result is still rounded to 32 bits before it is stored to memory, and that rounding inevitably produces a diff in the result.

Since Intel's underlying algorithms are confidential, we can only guess that the FPU precision configured in the AVX and SSE implementations of the optimized functions differs. The existence of a precision difference, however, is certain.

The truth has now emerged: AVX's FMA carries 1 bit more accuracy than SSE, and the difference accumulates across iterative calculations. The Q value is produced by complex matrix operations, so this tiny 1-bit difference is magnified to the ten-thousandths place. Meanwhile, to stay compatible across machines, Intel's MKL code falls back to SSE when it runs on a CPU that does not support AVX.

Supplementary knowledge three: Using SSE and AVX to optimize programs

Addition is again the example. Including the relevant header files and setting up the compiler options are not covered here; please refer to the related documentation.

Basic version:

A simple loop that accumulates the sum.
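A minimal sketch of such a loop is shown below. It is an illustration rather than the author's original listing, and the function name `sum_basic` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum n floats strictly left to right, one element per iteration. */
float sum_basic(const float *data, size_t n)
{
    float total = 0.0f;
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}
```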

SSE optimized version

An SSE register is 128 bits (16 bytes) wide and holds 4 single-precision floats at a time. The data is loaded into the register in groups of 4 and summed with the built-in addition intrinsic. The 4 lane sums are then added together, and finally the leftover items that do not fill a group are added to get the final result.
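The steps above can be sketched with the SSE intrinsics from `<xmmintrin.h>` (`_mm_setzero_ps`, `_mm_loadu_ps`, `_mm_add_ps`, `_mm_storeu_ps`). This is an illustrative sketch, not the author's original listing: the name `sum_sse` and the `HAVE_SSE_PATH` macro are hypothetical, and on compilers or targets without SSE the code falls back to a scalar loop that keeps the same 4-lane summation order.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>          /* SSE intrinsics */
#define HAVE_SSE_PATH 1
#endif

/* Sum n floats using four parallel partial sums (one per SSE lane). */
float sum_sse(const float *data, size_t n)
{
    float lanes[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t blocks = n / 4;      /* number of complete groups of 4 */
    size_t i;
    float total;
    assert(data != NULL || n == 0);

#ifdef HAVE_SSE_PATH
    __m128 acc = _mm_setzero_ps();
    for (i = 0; i < blocks; ++i)
        acc = _mm_add_ps(acc, _mm_loadu_ps(data + 4 * i)); /* unaligned load */
    _mm_storeu_ps(lanes, acc);
#else
    /* Scalar fallback with the same lane-wise order of additions. */
    for (i = 0; i < blocks; ++i)
        for (size_t k = 0; k < 4; ++k)
            lanes[k] += data[4 * i + k];
#endif

    /* Add the 4 lane sums, then the leftover items. */
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (i = 4 * blocks; i < n; ++i)
        total += data[i];
    return total;
}
```

Note that the lane-wise order of additions is itself a source of diffs. For the input {1e8, 1, 1, 1, -1e8, 1, 1, 1}, this function returns 6, while a strict left-to-right loop in single precision returns 3, because each 1 added to 1e8 is lost to rounding before the -1e8 cancels it.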

AVX optimized version

The AVX approach is similar to SSE, but an AVX register is 256 bits (32 bytes) wide and holds 8 single-precision floats, so the data is loaded into the register in groups of 8.
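An 8-lane version can be sketched with the AVX intrinsics from `<immintrin.h>` (`_mm256_setzero_ps`, `_mm256_loadu_ps`, `_mm256_add_ps`, `_mm256_storeu_ps`). Again this is an illustration, not the author's original listing, and `sum_avx` is a hypothetical name. The intrinsic path is compiled only when the compiler defines `__AVX__` (for example with `-mavx` on GCC or Clang); otherwise a scalar loop with the same 8-lane order is used.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>          /* AVX intrinsics */
#endif

/* Sum n floats using eight parallel partial sums (one per AVX lane). */
float sum_avx(const float *data, size_t n)
{
    float lanes[8] = {0.0f};
    size_t blocks = n / 8;      /* number of complete groups of 8 */
    size_t i, k;
    float total = 0.0f;
    assert(data != NULL || n == 0);

#if defined(__AVX__)
    __m256 acc = _mm256_setzero_ps();
    for (i = 0; i < blocks; ++i)
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(data + 8 * i)); /* unaligned load */
    _mm256_storeu_ps(lanes, acc);
#else
    /* Scalar fallback with the same lane-wise order of additions. */
    for (i = 0; i < blocks; ++i)
        for (k = 0; k < 8; ++k)
            lanes[k] += data[8 * i + k];
#endif

    /* Add the 8 lane sums, then the leftover items. */
    for (k = 0; k < 8; ++k)
        total += lanes[k];
    for (i = 8 * blocks; i < n; ++i)
        total += data[i];
    return total;
}
```

A binary built with the AVX path must only run on CPUs that support AVX, which is why libraries that target mixed hardware usually select the code path at run time instead.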

To verify the effect of the optimization, generate a random input array and write a simple test case. A performance comparison of the three algorithms, measured in floats accumulated per second, shows that SSE reaches 4 times the throughput of the basic version, while AVX reaches 8 times! [4]

4. Summary and takeaways

Problem summary:

• During the query-diff compatibility test, the Q values calculated by the old and new versions of module A were found to differ.

• The investigation determined that the precision diff comes from the difference in the floating-point instruction sets (AVX/SSE) supported by the CPUs of the machines the program runs on.

• In this case both the proportion and the absolute value of the diff are small. It does not affect online services at present, but if the algorithm becomes more complex and the diff accumulates to the hundredths place, it will cause the strategy to fail.

• If the floating-point operations of other modules use instruction set optimization, they also need to be checked for the same problem.

Solution:

• When allocating test resources, make sure the machines hosting the new and old environments have the same CPU.

• Add an environment check before executing query-diff to confirm once more that there are no hardware differences.

• When deploying services online, also make sure the machines support the AVX instruction set, to get the best performance and accuracy.

• Check whether other modules use instruction set optimization in a similar way, to avoid the risk in advance.
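One possible shape for such an environment check on Linux is to compare the CPU feature flags of the two hosts before a run. The sketch below is an assumption-laden illustration: `has_cpu_flag` and `host_has_cpu_flag` are hypothetical helper names, and the code relies on the `flags` line that Linux exposes in `/proc/cpuinfo` on x86 machines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` appears as a whole space-separated token in
 * `flags_line`, so that "avx" does not match "avx2". */
int has_cpu_flag(const char *flags_line, const char *flag)
{
    size_t len;
    const char *p;
    assert(flags_line != NULL && flag != NULL);
    len = strlen(flag);
    if (len == 0) return 0;
    p = flags_line;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';
        if (starts && ends) return 1;
        p += 1;                          /* keep searching after this match */
    }
    return 0;
}

/* Return 1 or 0 depending on whether the first "flags" line of
 * /proc/cpuinfo lists `flag`, or -1 if the file or line is unavailable. */
int host_has_cpu_flag(const char *flag)
{
    char line[8192];
    int result = -1;
    FILE *fp = fopen("/proc/cpuinfo", "r");
    if (fp == NULL) return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (strncmp(line, "flags", 5) == 0) {
            result = has_cpu_flag(line, flag);
            break;
        }
    }
    fclose(fp);
    return result;
}
```

A query-diff driver could call `host_has_cpu_flag("avx")` on both the baseline and the test machine and refuse to start when the answers differ.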

Takeaways and suggestions:

• Programs that are heavy on floating-point operations can consider using instruction set functions such as SSE/AVX to optimize performance, which usually improves efficiency significantly (SSE: 4 times, AVX: 8 times).

• When using the instruction set, keep the number of iterations under control (that is, cases where the output of an instruction set function is fed back in as the input of an instruction set function), so that precision diffs do not accumulate to a level that can no longer be ignored.

• Query-diff testing can be applied to more compatibility testing scenarios, such as comparing the impact on applications of differences in the underlying system and hardware: CPU, operating system, and basic libraries.

Software engineering is inseparable from hardware support. Differences in compilation and running environments may cause differences in service performance and in final calculation results. Such issues need special attention at every stage of development, testing and launch. It pays to be a programmer who understands both software and hardware!

Reference materials:

[1] https://software.intel.com/zh-cn/articles/whats-new-in-intel-mkl

[2] https://software.intel.com/zh-cn/articles/intel-xeon-processor-e7-88004800-v3-product-family-technical-overview

[3] https://software.intel.com/en-us/forums/topic/507004

[4] http://www.cnblogs.com/zyl910/archive/2012/10/22/simdsumfloat.html
