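On the other hand, poorly written code can lead to slower execution, higher energy consumption, and higher infrastructure costs. For example, in a web application, inefficient code can slow page loads, degrading the user experience and potentially driving users away. In data processing tasks, an inefficient algorithm can significantly increase the time needed to process large datasets, delaying critical insights and decisions.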


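Let's explore 10 Python optimization techniques that can help you write more efficient, better-performing code. These techniques are essential for building robust applications that meet performance requirements while remaining scalable and maintainable. By following best practices, they can also be applied to other programming languages.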

1. Variable Packing

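Variable packing minimizes memory usage by grouping multiple data items into a single structure. This technique is crucial in scenarios where memory access time significantly affects performance, such as large-scale data processing. When related data is packed together, the CPU cache can be used more effectively, which speeds up data retrieval.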


import struct

# Packing two integers into a binary format
packed_data = struct.pack('ii', 10, 20)

# Unpacking the packed binary data
a, b = struct.unpack('ii', packed_data)

In this example, the struct module packs the integers into a compact binary format, making the data more efficient to store and process.
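
To make the saving concrete, here is a minimal sketch that compares the packed size with the footprint of the same values held in a list. The `<ii` format (little-endian, standard sizes) is chosen for this illustration so the packed size does not depend on the platform; the `sys.getsizeof` figures are CPython-specific and vary between versions.

```python
import struct
import sys

values = (10, 20)

# '<ii' = two 4-byte little-endian signed integers with no padding
packed = struct.pack('<ii', *values)
packed_size = len(packed)  # raw payload size in bytes

# A list stores pointers to full Python int objects, each with its own header
list_size = sys.getsizeof(list(values)) + sum(sys.getsizeof(v) for v in values)

# Round trip: unpacking gives back the original values as a tuple
unpacked = struct.unpack('<ii', packed)

print(packed_size, list_size, unpacked)
```

The trade-off is that packed data has to be unpacked before it can be used in arithmetic, so this pays off mainly for bulk storage or transfer.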

2. Storage vs. Memory

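Understanding the difference between storage (disk) and memory (RAM) is crucial. Memory operations are faster but volatile, whereas storage is persistent but slower. In performance-critical applications, keeping frequently accessed data in memory and minimizing storage I/O is essential for speed.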


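import mmap

# Memory-mapping a file (data.txt must already exist and be non-empty)
with open("data.txt", "r+b") as f:
    # Length 0 maps the whole file; the inner `with` closes the mapping
    with mmap.mmap(f.fileno(), 0) as mmapped_file:
        first_bytes = mmapped_file[:10]  # Read through the mapping like bytes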
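
A self-contained variant of the memory-mapping idea, sketched under the assumption that a throwaway temporary file is acceptable for the demonstration. The file contents and variable names are made up for the example.

```python
import mmap
import os
import tempfile

# Create a small throwaway file so the example runs anywhere
# (an empty file cannot be memory-mapped)
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"hello memory-mapped world")

with open(path, "r+b") as f:
    # Length 0 maps the whole file; the context manager unmaps it on exit
    with mmap.mmap(f.fileno(), 0) as mm:
        first_word = mm[:5]   # slicing reads bytes without explicit read() calls
        mm[:5] = b"HELLO"     # in-place write; the length must stay the same
        mm.flush()            # push the change back to the file

with open(path, "rb") as f:
    contents = f.read()

os.remove(path)
```

The operating system pages the file in on demand, so only the parts that are actually touched need to occupy RAM.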


3. Fixed-Length vs. Variable-Length Variables



import array

# Typed array: every element is a fixed-size C int (typecode 'i')
fixed_array = array.array('i', [1, 2, 3, 4, 5])

# Dynamic list (variable-length)
dynamic_list = [1, 2, 3, 4, 5]

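Here, array.array stores its elements as fixed-size C integers in one contiguous buffer, which gives more compact storage and more predictable performance than a dynamic list of arbitrary Python objects.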
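
The difference can be measured with a rough sketch like the one below. The element count is arbitrary, and the sizes reported by `sys.getsizeof` are CPython-specific; the comparison assumes a typical 64-bit build where a C int is smaller than a pointer.

```python
import array
import sys

n = 100_000
typed = array.array('i', range(n))  # contiguous buffer of C ints
boxed = list(range(n))              # buffer of pointers to int objects

bytes_per_item = typed.itemsize     # size of one element in bytes
array_bytes = sys.getsizeof(typed)
list_bytes = sys.getsizeof(boxed)   # does not count the int objects themselves

# The array can still grow; what is fixed is the element type and size
typed.append(n)

# A value of the wrong type is rejected instead of being stored
try:
    typed.append(1.5)
    rejected = False
except TypeError:
    rejected = True

print(bytes_per_item, array_bytes, list_bytes, rejected)
```

Note that the list figure understates its real cost, since each element is a separate Python object stored elsewhere on the heap.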

4. Internal vs. Public Functions



def _private_function(data):
    # Optimized for internal use, with minimal error handling
    return data ** 2

def public_function(data):
    # Includes additional checks for external use
    if isinstance(data, int):
        return _private_function(data)
    raise ValueError("Input must be an integer")


5. Function Decorators



from functools import lru_cache

@lru_cache(maxsize=None)  # Cache every distinct argument's result
def compute_heavy_function(x):
    # A computationally expensive operation
    return x ** x

Using lru_cache as a decorator caches the results of expensive function calls, improving performance by avoiding redundant computation.
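
The effect of caching is easiest to see on a recursive function. The following is a minimal sketch using a hypothetical `fib` function and a call counter added purely for illustration.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs

@lru_cache(maxsize=None)
def fib(n):
    global call_count
    call_count += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

value = fib(30)
info = fib.cache_info()  # named tuple with hits, misses, maxsize, currsize

print(value, call_count, info)
```

Without the decorator, the same call would execute the function body over a million times; with it, each distinct argument is computed once and later requests are served from the cache. Caching is only safe for functions whose result depends solely on their (hashable) arguments.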

6. Use Libraries

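Using libraries saves you from reinventing the wheel. Libraries like NumPy are written largely in C and built for performance, which makes them far more efficient than pure Python implementations for heavy numerical computation.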


import numpy as np

# Efficient matrix multiplication using NumPy
matrix_a = np.random.rand(1000, 1000)
matrix_b = np.random.rand(1000, 1000)
result = np.dot(matrix_a, matrix_b)

Here, NumPy's dot function is optimized for matrix operations, far outperforming nested loops in pure Python.
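
For comparison, here is a sketch that runs a naive triple-loop multiplication next to NumPy on the same inputs. The helper `matmul_pure` and the 60x60 size are choices made for this illustration; actual timings depend on the machine and on the BLAS library NumPy is linked against.

```python
import timeit
import numpy as np

def matmul_pure(a, b):
    """Naive triple-loop matrix product on lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for k in range(inner):
                total += a[i][k] * b[k][j]
            out[i][j] = total
    return out

rng = np.random.default_rng(0)
a = rng.random((60, 60))
b = rng.random((60, 60))

# Both approaches should agree up to floating-point rounding
pure_result = matmul_pure(a.tolist(), b.tolist())
numpy_result = a @ b
results_match = np.allclose(pure_result, numpy_result)

pure_time = timeit.timeit(lambda: matmul_pure(a.tolist(), b.tolist()), number=3)
numpy_time = timeit.timeit(lambda: a @ b, number=3)
print(f"pure Python: {pure_time:.4f}s, NumPy: {numpy_time:.6f}s")
```

The gap widens quickly as the matrices grow, because the pure Python version performs every multiply-add through the interpreter.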

7. Short-Circuiting Conditionals

Short-circuiting reduces unnecessary evaluations, which is particularly valuable in complex condition checks or when involving resource-intensive operations. It prevents execution of conditions that don't need to be checked, saving both time and computational power.
Because evaluation stops as soon as the outcome is known, put the operand most likely to decide the result first. In an OR condition (or), put the operand most likely to be true first; in an AND condition (and), put the operand most likely to be false first. Once that operand settles the result, the remaining operands are never evaluated. When the operands differ in cost, it also pays to put the cheapest check first.


def complex_condition(x, y):
    return x != 0 and y / x > 2  # Stops evaluation if x is 0

In this example, Python’s logical operators ensure that the division is only executed if x is non-zero, preventing potential runtime errors and unnecessary computation.
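
The ordering advice can be checked directly by recording which checks actually run. This is a small sketch; `cheap_check` and `expensive_check` are hypothetical stand-ins for a fast test and a costly one.

```python
calls = []  # records which checks were evaluated

def cheap_check(n):
    calls.append("cheap")
    return n % 2 == 0

def expensive_check(n):
    calls.append("expensive")
    return sum(i * i for i in range(n)) % 7 == 0  # stand-in for costly work

def is_candidate(n):
    # `and` stops at the first falsy operand, so the cheap test goes first
    return cheap_check(n) and expensive_check(n)

odd_result = is_candidate(10_001)   # cheap_check fails, expensive_check is skipped
calls_for_odd = list(calls)

calls.clear()
even_result = is_candidate(10_000)  # cheap_check passes, so both checks run
calls_for_even = list(calls)

print(calls_for_odd, calls_for_even)
```

Swapping the two operands would give the same truth value but would run the expensive check on every call.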

8. Free Up Memory

In long-running applications, especially those dealing with large datasets, it’s essential to free up memory once it’s no longer needed. This can be done using del, gc.collect(), or by allowing objects to go out of scope.


import gc

# Manual garbage collection to free up memory
large_data = [i for i in range(1000000)]
del large_data
gc.collect()  # Forces garbage collection

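In CPython, del removes the name and the object is freed as soon as its last reference disappears. Calling gc.collect() additionally forces a collection pass that reclaims objects caught in reference cycles, which can be critical in memory-constrained environments.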
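
When an explicit gc.collect() call makes a difference can be sketched with weak references. The example below assumes CPython's reference-counting behaviour; the `Node` class is hypothetical and exists only for this demonstration.

```python
import gc
import weakref

class Node:
    """Hypothetical container used only for this illustration."""
    def __init__(self):
        self.other = None

gc.disable()  # keep the automatic collector out of the demonstration

# Case 1: no cycle. The object is freed as soon as its last reference goes.
single = Node()
single_ref = weakref.ref(single)
del single
freed_without_gc = single_ref() is None

# Case 2: a reference cycle keeps both objects alive after `del`.
a, b = Node(), Node()
a.other, b.other = b, a
cycle_ref = weakref.ref(a)
del a, b
alive_before_collect = cycle_ref() is not None

gc.collect()  # the cycle detector reclaims the pair
freed_after_collect = cycle_ref() is None

gc.enable()
print(freed_without_gc, alive_before_collect, freed_after_collect)
```

In normal operation the collector runs automatically from time to time, so a manual call is mostly useful right after dropping a large cyclic structure.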

9. Short Error Messages

In systems where memory or bandwidth is limited, such as embedded systems or logging in distributed applications, short error messages can reduce overhead. This practice also applies to scenarios where large-scale error logging is necessary.


try:
    result = 10 / 0
except ZeroDivisionError:
    print("Err: Div/0")  # Short, concise error message

Short error messages are useful in environments where resource efficiency is crucial, such as IoT devices or high-frequency trading systems.
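
One way to keep messages short without losing information is to emit compact codes and keep the full descriptions in a single lookup table. The sketch below is only one possible arrangement; the `ErrCode` values, the `E<number>` format, and the `safe_divide` helper are all hypothetical.

```python
from enum import IntEnum

class ErrCode(IntEnum):
    """Hypothetical error codes; a real system would define its own."""
    DIV_ZERO = 1
    BAD_INPUT = 2

# Full descriptions live in one place instead of in every log line
DESCRIPTIONS = {
    ErrCode.DIV_ZERO: "Division by zero",
    ErrCode.BAD_INPUT: "Operands do not support division",
}

def safe_divide(a, b):
    """Return (result, None) on success or (None, short_code) on failure."""
    try:
        return a / b, None
    except ZeroDivisionError:
        return None, f"E{ErrCode.DIV_ZERO.value}"
    except TypeError:
        return None, f"E{ErrCode.BAD_INPUT.value}"

result, err = safe_divide(10, 0)
long_form = DESCRIPTIONS[ErrCode.DIV_ZERO]

print(err, "->", long_form)
```

The short code is what travels over the wire or into the log; the description is looked up only when a person needs to read it.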

10. Optimize Loops

Loops are a common source of inefficiency, especially when processing large datasets. Optimising loops by reducing iterations, simplifying the logic, or using vectorised operations can significantly improve performance.


import numpy as np

# Vectorised operation with NumPy
array = np.array([1, 2, 3, 4, 5])

# Instead of looping through elements
result = array * 2  # Efficient, vectorised operation

Vectorisation eliminates the need for explicit loops, leveraging low-level optimisations for faster execution.
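
Loops that cannot be vectorised can often still be simplified. The following sketch, using only the standard library, hoists a loop-invariant computation out of the loop and replaces the manual append pattern with a list comprehension. The function names and data are made up for the example, and the timings will vary by machine.

```python
import math
import timeit

data = list(range(1, 50_001))
scale = 3

def slow_loop(values):
    out = []
    for v in values:
        factor = math.sqrt(scale)  # recomputed on every iteration
        out.append(v * factor)
    return out

def fast_loop(values):
    factor = math.sqrt(scale)            # hoisted: computed once
    return [v * factor for v in values]  # comprehension avoids repeated append lookups

# Both versions perform the same arithmetic, so the outputs are identical
same_output = slow_loop(data) == fast_loop(data)

slow_time = timeit.timeit(lambda: slow_loop(data), number=20)
fast_time = timeit.timeit(lambda: fast_loop(data), number=20)
print(f"slow: {slow_time:.3f}s, fast: {fast_time:.3f}s")
```

Measuring both versions with timeit before and after a change is a good habit, since the benefit of each rewrite depends on the workload.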

By applying these techniques, you can make your programs, in Python or any other language, run faster, use less memory, and scale better, which is especially important for applications in data science, web development, and systems programming.

PS: You can use https://perfpy.com/#/ to check Python code efficiency.

