Performance comparison of using PHP built-in functions and custom functions to deduplicate arrays

WBOY
2024-04-26 21:09:01

Among PHP's built-in options, array_unique() gave the best deduplication performance in this test; among the custom implementations, the hash table method was fastest. The hash table method stores each array value as a key with a placeholder value. The nested loop method is simple to implement but inefficient. For large arrays, array_unique() or the hash table method is recommended. Reported average times: array_unique() 0.02 seconds, array_reverse() + array_filter() 0.04 seconds, the hash table method 0.01 seconds, and the loop method 0.39 seconds.

Introduction

Array deduplication means removing duplicate elements from an array so that only unique values remain. PHP offers built-in functions for this, and the same result can be achieved with custom functions. This article compares the performance of several approaches and provides a practical example.

Built-in functions

  • array_unique(): Removes duplicate values, keeping the first occurrence of each value and preserving its key. It uses a hash table internally (in recent PHP versions, with the default comparison flag), which makes it efficient.
  • array_reverse() + array_filter(): Reverse the array with array_reverse(), then use array_filter() with a callback that rejects values that have already been seen.
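
As a brief illustration of the built-in approach, the sketch below shows that array_unique() preserves the original keys, so array_values() is often applied afterwards to reindex the result. The sample data is arbitrary and chosen only for illustration.

```php
<?php
// Illustrative input; the values are arbitrary sample data.
$input = [3, 1, 3, 2, 1];

// array_unique() keeps the first occurrence of each value and preserves keys.
$unique = array_unique($input);

// Reindex to get a plain list with consecutive integer keys.
$reindexed = array_values($unique);

print_r($unique);     // keys 0, 1 and 3 remain
print_r($reindexed);  // [3, 1, 2]
```

Whether the reindexing step is needed depends on whether later code relies on consecutive keys.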

Custom functions

  • Hash table method: Create a hash table whose keys are the values of the array; the stored values are placeholders. Iterate over the array, adding each value as a key. The keys of the hash table form the deduplicated array.
  • Loop method: Traverse the array with two indices. Index 1 drives the outer loop and index 2 drives the inner loop. If the element at the outer index does not match any element visited by the inner loop, it is added to the result array.
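
The two custom methods described above can be sketched as standalone functions. This is a minimal sketch, not the benchmark code itself: the names dedupe_hash() and dedupe_loop() are hypothetical, and the hash table version assumes the values are integers or strings, since those are the only valid PHP array keys.

```php
<?php
// Hash table method: use each value as an array key, so duplicate values
// collapse onto the same key. Assumes integer or string values; numeric
// strings such as "1" come back as integers because PHP normalises keys.
function dedupe_hash(array $values): array
{
    $seen = [];
    foreach ($values as $value) {
        $seen[$value] = true; // the stored value is only a placeholder
    }
    return array_keys($seen);
}

// Loop method: for each element (outer index), scan all earlier elements
// (inner index) and keep it only if no earlier element is identical. O(n^2).
function dedupe_loop(array $values): array
{
    $values = array_values($values); // ensure consecutive integer keys
    $count = count($values);
    $result = [];
    for ($i = 0; $i < $count; $i++) {
        $is_duplicate = false;
        for ($j = 0; $j < $i; $j++) {
            if ($values[$i] === $values[$j]) {
                $is_duplicate = true;
                break;
            }
        }
        if (!$is_duplicate) {
            $result[] = $values[$i];
        }
    }
    return $result;
}

print_r(dedupe_hash([3, 1, 3, 2, 1])); // [3, 1, 2]
print_r(dedupe_loop([3, 1, 3, 2, 1])); // [3, 1, 2]
```

Both functions keep the first occurrence of each value; the difference is that the hash lookup is constant time on average, while the inner loop grows with the array size.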

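Practical example

Suppose we have an array $array containing 1 million integers. Because range() generates distinct values, this input contains no duplicates, so each method is measured on the no-duplicates case.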

$array = range(1, 1000000);
$iterations = 100;
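
Since range(1, 1000000) yields only distinct values, it may also be worth timing an input that actually contains duplicates. The helper below is a hypothetical addition, not part of the original benchmark: make_test_array(), its parameters, and the seed value are all assumptions made for illustration.

```php
<?php
// Hypothetical helper (not part of the original benchmark): builds an array
// of $size integers drawn from 1..$distinct, so duplicates are guaranteed
// whenever $distinct < $size.
function make_test_array(int $size, int $distinct, int $seed = 42): array
{
    mt_srand($seed); // fixed seed so repeated runs use the same data
    $data = [];
    for ($i = 0; $i < $size; $i++) {
        $data[] = mt_rand(1, $distinct);
    }
    return $data;
}

// Small demo sizes; scale them up to match the benchmark if desired.
$with_duplicates = make_test_array(1000, 100);
echo count($with_duplicates), " elements, ",
     count(array_unique($with_duplicates)), " distinct\n";
```

Passing such an array to the test functions in place of $array would show how each method behaves when duplicates are present.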

Performance test

function test_array_unique($array, $iterations) {
  $total_time = 0;
  for ($i = 0; $i < $iterations; $i++) {
    $start_time = microtime(true);
    $result = array_unique($array);
    $end_time = microtime(true);
    $total_time += $end_time - $start_time;
  }
  $avg_time = $total_time / $iterations;
  echo "array_unique: $avg_time seconds\n";
}

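function test_array_reverse_array_filter($array, $iterations) {
  $total_time = 0;
  for ($i = 0; $i < $iterations; $i++) {
    $start_time = microtime(true);
    // array_unique() cannot serve as the array_filter() callback: the callback
    // receives a single element, not an array. Track seen values explicitly.
    // The result is in reversed order and keeps the last occurrence of each value.
    $seen = [];
    $result = array_filter(array_reverse($array), function ($value) use (&$seen) {
      if (isset($seen[$value])) {
        return false;
      }
      $seen[$value] = true;
      return true;
    });
    $end_time = microtime(true);
    $total_time += $end_time - $start_time;
  }
  $avg_time = $total_time / $iterations;
  echo "array_reverse + array_filter: $avg_time seconds\n";
}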

function test_hash_table($array, $iterations) {
  $total_time = 0;
  for ($i = 0; $i < $iterations; $i++) {
    $start_time = microtime(true);
    // Start each run with an empty hash table, captured by reference,
    // instead of relying on a static variable inside the closure.
    $hash_table = [];
    $result = array_values(array_filter($array, function ($value) use (&$hash_table) {
      if (isset($hash_table[$value])) {
        return false;
      }
      $hash_table[$value] = true;
      return true;
    }));
    $end_time = microtime(true);
    $total_time += $end_time - $start_time;
  }
  $avg_time = $total_time / $iterations;
  echo "hash table: $avg_time seconds\n";
}

function test_loop($array, $iterations) {
  $total_time = 0;
  for ($i = 0; $i < $iterations; $i++) {
    $start_time = microtime(true);
    // Nested-loop check, O(n^2): keep a value only if no earlier element equals it.
    // ARRAY_FILTER_USE_BOTH also passes the element's key to the callback;
    // this assumes $array has consecutive integer keys starting at 0.
    $result = array_values(array_filter($array, function ($value, $key) use ($array) {
      for ($j = 0; $j < $key; $j++) {
        if ($value == $array[$j]) {
          return false;
        }
      }
      return true;
    }, ARRAY_FILTER_USE_BOTH));
    $end_time = microtime(true);
    $total_time += $end_time - $start_time;
  }
  $avg_time = $total_time / $iterations;
  echo "loop: $avg_time seconds\n";
}

test_array_unique($array, $iterations);
test_array_reverse_array_filter($array, $iterations);
test_hash_table($array, $iterations);
test_loop($array, $iterations);
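
The four test functions repeat the same timing loop. One possible way to factor that out is sketched below; benchmark() is a hypothetical helper, not part of the original code, and it uses hrtime(), which requires PHP 7.3 or later. The demo input is deliberately much smaller than the article's 1 million elements.

```php
<?php
// Hypothetical helper: times any deduplication callable and returns the
// average number of seconds per run.
function benchmark(callable $dedupe, array $input, int $iterations): float
{
    $total_ns = 0;
    for ($i = 0; $i < $iterations; $i++) {
        $start = hrtime(true);              // monotonic clock, in nanoseconds
        $dedupe($input);
        $total_ns += hrtime(true) - $start;
    }
    return $total_ns / $iterations / 1e9;   // nanoseconds to seconds
}

// Smaller than the article's input so the demo finishes quickly.
$input = range(1, 10000);

$candidates = [
    'array_unique' => 'array_unique',
    'hash table'   => function (array $values): array {
        $seen = [];
        foreach ($values as $value) {
            $seen[$value] = true;
        }
        return array_keys($seen);
    },
];

foreach ($candidates as $name => $fn) {
    printf("%s: %.6f seconds\n", $name, benchmark($fn, $input, 10));
}
```

Passing each method as a callable keeps the timing logic in one place, so every candidate is measured the same way.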

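Results

The reported average running time of each method on an array of 1 million integers is as follows:

  • array_unique(): 0.02 seconds
  • array_reverse() + array_filter(): 0.04 seconds
  • Hash table method: 0.01 seconds
  • Loop method: 0.39 seconds

Absolute figures depend on the PHP version and hardware. The loop method scales quadratically, so its running time grows far faster with array size than that of the other methods.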

Conclusion

According to the test results, array_unique() is the fastest of the built-in approaches for deduplicating arrays, while the hash table method is the fastest custom implementation. Although the loop method is easy to implement, it is much less efficient. When dealing with large arrays, it is recommended to use array_unique() or the hash table method for deduplication.
