Among the built-in approaches, array_unique() performs best for deduplicating arrays. Among the custom functions, the hash table method performs best: each value is stored as an array key, and the associated value is only a placeholder. The loop method is simple to implement but inefficient. For deduplication, array_unique() or the hash table method is recommended. In the test, array_unique() took 0.02 seconds, array_reverse() combined with array_filter() took 0.04 seconds, the hash table method took 0.01 seconds, and the loop method took 0.39 seconds.
Introduction
Array deduplication means removing duplicate elements from an array and retaining only the unique values. PHP provides built-in functions for this, and custom functions can be written as well. This article compares the performance of these approaches and provides a practical example.
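As a quick illustration of what deduplication looks like in practice, the sketch below applies array_unique() to a small array. The sample values are made up for illustration only. array_unique() keeps the first occurrence of each value and preserves its original key, so array_values() is used here to reindex the result.

```php
<?php
// Hypothetical sample data, chosen only for illustration.
$input = [3, 1, 3, 2, 1];

// array_unique() keeps the first occurrence of each value and preserves keys.
$unique = array_unique($input);      // [0 => 3, 1 => 1, 3 => 2]

// array_values() reindexes the result from 0.
$reindexed = array_values($unique);  // [3, 1, 2]

print_r($reindexed);
```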
Built-in function
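array_unique(): a built-in function that uses a hash table to remove duplicates, which makes it efficient.
array_reverse() + array_filter(): reverse the array with array_reverse(), then combine it with array_filter() to remove duplicate elements.
Custom function
Hash table method: each value is used as an array key with a placeholder as its value, so a duplicate is detected with a single key lookup.
Loop method: each element is compared against the other elements in a loop. It is simple to implement but inefficient.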
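The two custom approaches compared in this article, the hash table method and the loop method, can be sketched as standalone functions. This is a minimal sketch rather than the article's benchmark code: the names dedupe_hash() and dedupe_loop() are hypothetical, and the hash version assumes the values are integers or strings, since those are the only types PHP accepts as array keys.

```php
<?php
// Hash table method (hypothetical helper): values become array keys, so each
// membership check is a single isset() lookup. Only int and string values are
// supported, and a numeric string such as "1" shares a key with the int 1.
function dedupe_hash(array $items): array
{
    $seen = [];
    $result = [];
    foreach ($items as $item) {
        if (!isset($seen[$item])) {
            $seen[$item] = true;   // the stored value is only a placeholder
            $result[] = $item;
        }
    }
    return $result;
}

// Loop method (hypothetical helper): each value is compared against every
// value kept so far, so the total cost grows quadratically with array size.
function dedupe_loop(array $items): array
{
    $result = [];
    foreach ($items as $item) {
        $found = false;
        foreach ($result as $kept) {
            if ($kept === $item) {
                $found = true;
                break;
            }
        }
        if (!$found) {
            $result[] = $item;
        }
    }
    return $result;
}

print_r(dedupe_hash([3, 1, 3, 2, 1])); // values 3, 1, 2
print_r(dedupe_loop([3, 1, 3, 2, 1])); // values 3, 1, 2
```

Both functions keep the first occurrence of each value and return a reindexed list. They differ only in how they check whether a value has been seen before.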
Practical case
Suppose we have an array $array containing 1 million integers.
$array = range(1, 1000000);
$iterations = 100;
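Note that range(1, 1000000) produces no repeated values, so every method scans the whole array without discarding anything. To exercise the duplicate-removal path as well, a test array with repeats could be built as sketched below. The variable names, the seed, and the sizes are assumptions for illustration and are not part of the original benchmark.

```php
<?php
// Fixed seed so that repeated runs produce the same data (assumed value).
mt_srand(42);

$size = 100000;     // number of elements (assumed; smaller than 1 million)
$distinct = 1000;   // upper bound on the number of distinct values (assumed)

$withDuplicates = [];
for ($i = 0; $i < $size; $i++) {
    // Each value is drawn from 1..$distinct, so repeats are guaranteed
    // because $size exceeds $distinct.
    $withDuplicates[] = mt_rand(1, $distinct);
}

$unique = array_unique($withDuplicates);

echo count($withDuplicates), " elements, ", count($unique), " unique\n";
```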
Performance test
// Each test function runs one method $iterations times and prints the
// average time per run in seconds.

function test_array_unique($array, $iterations)
{
    $total_time = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $start_time = microtime(true);
        $result = array_unique($array);
        $end_time = microtime(true);
        $total_time += $end_time - $start_time;
    }
    $avg_time = $total_time / $iterations;
    echo "array_unique: $avg_time seconds\n";
}

function test_array_reverse_array_filter($array, $iterations)
{
    $total_time = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $seen = [];
        $start_time = microtime(true);
        // array_filter() calls its callback once per element, so passing
        // 'array_unique' as the callback fails. A callback that records the
        // values already seen is assumed here instead.
        $result = array_filter(array_reverse($array), function ($value) use (&$seen) {
            if (isset($seen[$value])) {
                return false;
            }
            $seen[$value] = true;
            return true;
        });
        $end_time = microtime(true);
        $total_time += $end_time - $start_time;
    }
    $avg_time = $total_time / $iterations;
    echo "array_reverse + array_filter: $avg_time seconds\n";
}

function test_hash_table($array, $iterations)
{
    $total_time = 0;
    for ($i = 0; $i < $iterations; $i++) {
        // Start every run with an empty table; values are used as keys.
        $hash_table = [];
        $start_time = microtime(true);
        $result = array_values(array_filter($array, function ($value) use (&$hash_table) {
            if (isset($hash_table[$value])) {
                return false;
            }
            $hash_table[$value] = true;
            return true;
        }));
        $end_time = microtime(true);
        $total_time += $end_time - $start_time;
    }
    $avg_time = $total_time / $iterations;
    echo "hash table: $avg_time seconds\n";
}

function test_loop($array, $iterations)
{
    $total_time = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $start_time = microtime(true);
        // Quadratic: each element is compared with all earlier elements.
        // Assumes sequential integer keys, as produced by range().
        $result = array_values(array_filter($array, function ($value, $key) use ($array) {
            for ($j = 0; $j < $key; $j++) {
                if ($value == $array[$j]) {
                    return false;
                }
            }
            return true;
        }, ARRAY_FILTER_USE_BOTH));
        $end_time = microtime(true);
        $total_time += $end_time - $start_time;
    }
    $avg_time = $total_time / $iterations;
    echo "loop: $avg_time seconds\n";
}

test_array_unique($array, $iterations);
test_array_reverse_array_filter($array, $iterations);
test_hash_table($array, $iterations);
test_loop($array, $iterations);
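The four test functions above repeat the same timing loop. One way to factor that out, sketched here under the assumption of PHP 7.3 or later (for hrtime()), is a helper that accepts any callable. The name benchmark() and its parameters are hypothetical and not part of the original article.

```php
<?php
// Hypothetical helper: runs $fn on $input $iterations times and returns
// the average wall-clock time per run in seconds.
function benchmark(callable $fn, array $input, int $iterations): float
{
    $totalNs = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $start = hrtime(true);              // monotonic clock, nanoseconds
        $fn($input);
        $totalNs += hrtime(true) - $start;
    }
    return $totalNs / $iterations / 1e9;
}

// Usage with a small array so that the example finishes quickly.
$sample = range(1, 10000);
$avg = benchmark('array_unique', $sample, 10);
printf("array_unique: %.6f seconds\n", $avg);
```

hrtime() reads a monotonic clock, so the measurement is not affected by system clock adjustments, which can distort microtime() readings.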
Result
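The average running time of each method on an array of 1 million integers is as follows:
array_unique(): 0.02 seconds
array_reverse() + array_filter(): 0.04 seconds
Hash table method: 0.01 seconds
Loop method: 0.39 seconds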
Conclusion
According to the test results, array_unique() is the fastest built-in function for deduplicating arrays, while the hash table method is the best-performing custom function. Although the loop method is easy to implement, it is much less efficient. When dealing with large arrays, array_unique() or the hash table method is recommended for deduplication.