Home  >  Article  >  Backend Development  >  Efficiency comparison of Python lists and sets

Efficiency comparison of Python lists and sets

WBOY
WBOYforward
2023-04-12 11:13:051899browse

Efficiency comparison of Python lists and sets

Program operating efficiency

The operating efficiency of the program is divided into two types: the first is time efficiency, and the second is space efficiency. Time efficiency is called time complexity, while space efficiency is called space complexity. Time complexity mainly measures the running speed of a program, while space complexity mainly measures the additional storage space required by a program.

Theoretically speaking, the time it takes to execute a program cannot be calculated. Only when you run the program on a machine can you know that the results obtained by different machines at different times may be different. . But do we need to test every program on the computer? It is obviously unrealistic, so the analysis method of time complexity is developed. In practice, when we calculate the time complexity, we do not necessarily need to calculate the exact number of executions, but only the approximate number of executions. We generally use the Big O asymptotic representation. If the number of executions is usually 1, we can say the time complexity. It is O(1). If it takes n times, it can be said that the time complexity is O(n).

Space complexity is a measure of the amount of storage space an algorithm temporarily occupies during operation. Space complexity is not how many bytes of space the program occupies, because it is difficult to calculate during actual operation, so space complexity counts the number of variables. The space complexity calculation rules are basically similar to the time complexity, and also use the big O asymptotic notation.

The most commonly used Python combined data types are tuples, lists, sets and dictionaries. For the time complexity of different operations of each data type, please refer to Python's official link. There are detailed instructions on the web page,

  • https://wiki.python.org/moin/TimeComplexity

Tuples and lists are sequence types, and their storage mechanisms are basically the same; sets and dictionaries are also basically the same , the only difference is that there is no corresponding value for each element of the collection. Next, we take sets and lists as examples to see their search efficiency and storage overhead.

Data search efficiency

How big is the gap between the efficiency of collection and list data search? Let’s look at a set of examples first:

import time
import random
nums = [random.randint(0, 2000000) for i in range(1000)]
list_test = list(range(1000000))
set_test = set(list_test)
count_list, count_set = 0, 0
t1 = time.time()# 测试在列表中进行查找
for num in nums:
 if num in list_test:
 count_list += 1
t2 = time.time()
for num in nums:# 测试在集合中进行查找
 if num in set_test:
 count_set += 1
t3 = time.time()# 测试在集合中进行查找
print('找到个数,列表:{},集合:{}'.format(count_list, count_set))
print('使用时间,列表:{:.4f}s'.format(t2 - t1))
print('使用时间,集合:{:.4f}s'.format(t3 - t2))

The output result is:

找到个数,列表:515,集合:515
使用时间,列表:7.7953s
使用时间,集合:0.0010s

It can be clearly seen from the above example that the search efficiency of sets is much higher than that of lists, so in different application scenarios , be sure to choose the appropriate data type. You can't see the performance difference under a small amount of data. Once you switch to a large amount of data, the difference will become very different.

Data storage overhead

The search efficiency of sets is much faster than that of lists. The main reason is that their storage principles are different. Sets need to consume more space to store additional information, and use space Overhead comes in exchange for time efficiency. Next, we use the getsizeof() function to see the difference in their storage overhead. The getiszeof() function is a function used in python's sys module to obtain the object memory size. The returned size is in bytes.

import sys
import random
list_test = list(range(1000000))
set_test = set(range(1000000))
print('列表占用大小:', sys.getsizeof(list_test))
print('集合占用大小:', sys.getsizeof(set_test))

The output result is:

列表占用大小:9000112
集合占用大小:33554656

It can be seen from the results that for the same data content, the cost of collection storage is several times that of the list.

The above is the detailed content of Efficiency comparison of Python lists and sets. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:51cto.com. If there is any infringement, please contact admin@php.cn delete