
20 tips to make your Python fly

高洛峰 (Original) · 2017-02-24 15:34:32

Today's article is light on text and heavy on code, but absolutely informative and straightforward: it shares 20 tips for improving Python performance and teaches you how to say goodbye to slow Python. The original author, Kaiyuan, is a full-stack programmer who works with Python, Java, PHP and C++.

1. Optimize algorithm time complexity

The time complexity of an algorithm has the greatest impact on a program's execution efficiency. In Python, you can choose an appropriate data structure to optimize time complexity; for example, looking up an element in a list takes O(n) time, while looking it up in a set takes O(1). Different scenarios call for different optimizations; in general, the common approaches are divide and conquer, branch and bound, greedy algorithms, and dynamic programming.
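For instance (an illustrative addition, not from the original article), the gap between a list scan and a set lookup is easy to measure with timeit:

a = range(100000)
s = set(a)
%timeit -n 100 99999 in a  # O(n): scans the list
%timeit -n 100 99999 in s  # O(1): hash lookup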

2. Reduce redundant data

For example, store only the upper or lower triangle of a large symmetric matrix, and use a sparse representation for matrices in which zero elements make up the majority.
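As an illustration (a minimal sketch added here, not from the original article), a symmetric matrix can be stored in a dict keyed on (row, column) pairs with row <= column, so each value is kept only once:

# Minimal sketch: store only the upper triangle of a symmetric matrix.
class SymmetricMatrix(object):
  def __init__(self):
    self.data = {}  # keys are (i, j) with i <= j
  def set(self, i, j, value):
    self.data[(min(i, j), max(i, j))] = value
  def get(self, i, j):
    return self.data.get((min(i, j), max(i, j)), 0)

m = SymmetricMatrix()
m.set(3, 1, 2.5)
print(m.get(1, 3))  # 2.5, the same entry as (3, 1)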

3. Reasonable use of copy and deepcopy

For objects of data structures such as dict and list, direct assignment only copies a reference. In some cases you need to copy the entire object; for that, use copy and deepcopy from the copy module. The difference between these two functions is that the latter copies recursively. Their efficiency also differs: (the following programs are run in IPython)

import copy
a = range(100000)
%timeit -n 10 copy.copy(a) # run copy.copy(a) 10 times

%timeit -n 10 copy.deepcopy(a)
10 loops, best of 3: 1.55 ms per loop
10 loops, best of 3: 151 ms per loop

The -n after timeit indicates the number of runs, and the last two lines are the output of the two timeit calls (the same applies below). As you can see, the latter is roughly 100 times slower.

4. Use dict or set to find elements

Python's dict and set are implemented with hash tables (similar to unordered_map in the C++11 standard library), so looking up an element has O(1) time complexity.

a = range(1000)
s = set(a)
d = dict((i,1) for i in a)
%timeit -n 10000 100 in d
%timeit -n 10000 100 in s
10000 loops, best of 3: 43.5 ns per loop
10000 loops, best of 3: 49.6 ns per loop

dict is slightly more efficient (and takes up more space).

5. Proper use of generators and yield

%timeit -n 100 a = (i for i in range(100000))
%timeit -n 100 b = [i for i in range(100000)]
100 loops, best of 3: 1.54 ms per loop
100 loops, best of 3: 4.56 ms per loop

Using () produces a generator object, whose memory footprint is independent of the size of the list, so it is more efficient. In concrete applications, for example, set(i for i in range(100000)) is faster than set([i for i in range(100000)]).

But for situations where loop traversal is required:

%timeit -n 10 for x in (i for i in range(100000)): pass
%timeit -n 10 for x in [i for i in range(100000)]: pass

10 loops, best of 3: 6.51 ms per loop
10 loops, best of 3: 5.54 ms per loop

The latter is more efficient here, but if the loop may break early, the benefit of the generator is obvious. yield is also used to create generators:

def yield_func(ls):
  for i in ls:
    yield i+1

def not_yield_func(ls):
  return [i+1 for i in ls]

ls = range(1000000)
%timeit -n 10 for i in yield_func(ls):pass

%timeit -n 10 for i in not_yield_func(ls):pass

10 loops, best of 3: 63.8 ms per loop
10 loops, best of 3: 62.9 ms per loop

For lists that are not very large, you can simply return a list, but yield is often more readable (a personal preference).
Python 2.x's built-in generator facilities include the xrange function, the itertools package, and so on.
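For instance (an illustrative sketch added here, not from the original article), itertools provides lazy building blocks that avoid materializing intermediate lists:

import itertools

ls = range(1000000)
# islice lazily takes the first 10 squares without building the full list of squares
first_ten = list(itertools.islice((i * i for i in ls), 10))
print(first_ten)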

6. Optimize loops

Do not do work inside a loop that could be done outside it. For example, the following optimization is about twice as fast:

a = range(10000)
size_a = len(a)
%timeit -n 1000 for i in a: k = len(a)
%timeit -n 1000 for i in a: k = size_a
1000 loops, best of 3: 569 µs per loop
1000 loops, best of 3: 256 µs per loop

7. Optimize the order of conditions in compound expressions

For and, put the condition that is satisfied least often first; for or, put the condition that is satisfied most often first. For example:

a = range(2000) 
%timeit -n 100 [i for i in a if 10 < i < 20 or 1000 < i < 2000]
%timeit -n 100 [i for i in a if 1000 < i < 2000 or 10 < i < 20]
%timeit -n 100 [i for i in a if i % 2 == 0 and i > 1900]
%timeit -n 100 [i for i in a if i > 1900 and i % 2 == 0]
100 loops, best of 3: 287 µs per loop
100 loops, best of 3: 214 µs per loop
100 loops, best of 3: 128 µs per loop
100 loops, best of 3: 56.1 µs per loop

8. Use join to merge the strings in the iterator

In [1]: %%timeit
   ...: # a is assumed to be a list of short strings, e.g. a = ['x'] * 1000
   ...: s = ''
   ...: for i in a:
   ...:     s += i
   ...:
10000 loops, best of 3: 59.8 µs per loop

In [2]: %%timeit
   ...: s = ''.join(a)
   ...:
100000 loops, best of 3: 11.8 µs per loop

join gives roughly a 5x improvement over the accumulation method.

9. Choose the appropriate string formatting method

s1, s2 = 'ax', 'bx'

%timeit -n 100000 'abc%s%s' % (s1, s2)
%timeit -n 100000 'abc{0}{1}'.format(s1, s2)
%timeit -n 100000 'abc' + s1 + s2
100000 loops, best of 3: 183 ns per loop
100000 loops, best of 3: 169 ns per loop
100000 loops, best of 3: 103 ns per loop

Of the three approaches, the % method is the slowest, but the differences are small (all are very fast). (Personally, I find % the most readable.)

10. Exchange the values of two variables without using an intermediate variable

In [3]: %%timeit -n 10000
   ....: a, b = 1, 2
   ....: c = a; a = b; b = c
   ....:
10000 loops, best of 3: 172 ns per loop

In [4]: %%timeit -n 10000
   ....: a, b = 1, 2
   ....: a, b = b, a
   ....:
10000 loops, best of 3: 86 ns per loop

Using a, b = b, a instead of c = a; a = b; b = c to exchange the values of a and b is about twice as fast.

11. Use if is

a = range(10000)
%timeit -n 100 [i for i in a if i == True]
%timeit -n 100 [i for i in a if i is True]
100 loops, best of 3: 531 µs per loop
100 loops, best of 3: 362 µs per loop

Using if is True is about 1.5 times as fast as if == True.

12. Use chained comparison x < y < z

x, y, z = 1,2,3

%timeit -n 1000000 if x < y < z:pass
%timeit -n 1000000 if x < y and y < z:pass

1000000 loops, best of 3: 101 ns per loop
1000000 loops, best of 3: 121 ns per loop

x < y < z is slightly more efficient and more readable.

13. while 1 is faster than while True

def while_1():
  n = 100000
  while 1:
    n -= 1
    if n <= 0: break

def while_true():
  n = 100000
  while True:
    n -= 1
    if n <= 0: break

m, n = 1000000, 1000000

%timeit -n 100 while_1()
%timeit -n 100 while_true()
100 loops, best of 3: 3.69 ms per loop
100 loops, best of 3: 5.61 ms per loop

while 1 is much faster than while True; the reason is that in Python 2.x, True is a global variable rather than a keyword.

14. Use ** instead of pow

%timeit -n 10000 c = pow(2,20)
%timeit -n 10000 c = 2**20

10000 loops, best of 3: 284 ns per loop
10000 loops, best of 3: 16.9 ns per loop

** is more than 10 times faster!

15. Use cProfile, cStringIO and cPickle, the C implementations of the corresponding packages (profile, StringIO, pickle)

import cPickle
import pickle
a = range(10000)
%timeit -n 100 x = cPickle.dumps(a)
%timeit -n 100 x = pickle.dumps(a)
100 loops, best of 3: 1.58 ms per loop
100 loops, best of 3: 17 ms per loop

The C-implemented package is more than 10 times faster!

16. Use the best deserialization method

The following compares the efficiency of eval, cPickle, and json methods for deserializing corresponding strings:

import json
import cPickle
a = range(10000)
s1 = str(a)
s2 = cPickle.dumps(a)
s3 = json.dumps(a)
%timeit -n 100 x = eval(s1)
%timeit -n 100 x = cPickle.loads(s2)
%timeit -n 100 x = json.loads(s3)
100 loops, best of 3: 16.8 ms per loop
100 loops, best of 3: 2.02 ms per loop
100 loops, best of 3: 798 µs per loop

It can be seen that json is nearly 3 times faster than cPickle and more than 20 times faster than eval.

17. Use C extension (Extension)

Currently there are four main approaches: the native API of CPython (the most common Python implementation), ctypes, Cython, and cffi. They all allow Python programs to call dynamic link libraries compiled from C. Their characteristics are:

CPython native API: by including the Python.h header file, Python data structures can be used directly in the corresponding C program. The implementation process is relatively cumbersome, but the scope of application is quite broad.
ctypes: usually used to wrap C libraries, allowing a pure Python program to call functions in dynamic link libraries (dll files on Windows, so files on Unix). If you want to use an existing C library from Python, ctypes is a good choice; according to some benchmarks, python2+ctypes is the best-performing approach. (A minimal sketch follows this list.)
Cython: Cython is a superset of the Python language that simplifies the process of writing C extensions. Its advantages are a concise syntax and good compatibility with libraries such as numpy that contain a large number of C extensions. Cython is typically used to optimize a particular algorithm or processing step in a project; in some tests, the performance improvement can be hundreds of times.
cffi: cffi is the implementation of ctypes in PyPy (see below) and is also compatible with CPython. cffi provides a way to use C libraries from Python: you can write C code directly in your Python code and link against existing C libraries.
These optimization methods are generally used to optimize the performance-bottleneck modules of an existing project; with only small changes to the original project, they can greatly improve the running efficiency of the whole program.
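As an illustration of the ctypes approach mentioned above (a minimal sketch, assuming the standard C math library can be located on your platform), you can call functions from an existing shared library without writing any C:

import ctypes
import ctypes.util

# Locate and load the C math library (the exact name is platform dependent).
libm = ctypes.CDLL(ctypes.util.find_library('m'))

# Declare the signature of sqrt so ctypes converts arguments and the result correctly.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(2.0))  # 1.4142135623730951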

18. Parallel Programming

Because of the GIL, it is difficult for Python to take full advantage of multi-core CPUs. However, the following parallel modes can be achieved through the built-in multiprocessing module:

Multiple processes: for CPU-intensive programs, the Process, Pool and related classes in multiprocessing can be used to run computations in parallel across multiple processes. However, because inter-process communication is relatively expensive, programs that require a large amount of data exchange between processes may not see much improvement.
Multi-threading: for IO-intensive programs, the multiprocessing.dummy module wraps threading with the multiprocessing interface, making multi-threaded programming very easy (for example, you can use the Pool map interface: simple and efficient). See the sketch after this list.
Distributed: the Managers class in multiprocessing provides a way to share data between different processes, on top of which distributed programs can be developed.
In different business scenarios, you can choose one of these, or a combination of several, to optimize program performance.
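For example (an illustrative sketch added here, not from the original article; square is a made-up worker function), a CPU-bound function can be mapped over a process pool, and the same Pool interface from multiprocessing.dummy gives a thread pool for IO-bound work:

from multiprocessing import Pool                       # process pool for CPU-bound work
from multiprocessing.dummy import Pool as ThreadPool   # thread pool for IO-bound work

def square(x):
  return x * x

if __name__ == '__main__':
  pool = Pool(4)                        # 4 worker processes
  print(pool.map(square, range(10)))
  pool.close(); pool.join()

  tpool = ThreadPool(4)                 # 4 worker threads, same map interface
  print(tpool.map(square, range(10)))
  tpool.close(); tpool.join()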

19. The ultimate killer: PyPy

PyPy is a Python implementation written in RPython (a restricted subset of Python). According to the benchmark data on its official website, it runs Python code more than 6 times faster than CPython. The speed comes from its Just-in-Time (JIT) compiler, i.e. a dynamic compiler: unlike static compilers (such as gcc or javac), it uses data gathered while the program runs to optimize it. For historical reasons, PyPy still retains the GIL, but the ongoing STM project is attempting to turn PyPy into a Python without the GIL.

If a Python program contains C extensions (not using the cffi approach), the JIT's optimization effect is greatly reduced, and the program may even be slower than under CPython (for example, with Numpy). So in PyPy it is better to use pure Python or cffi extensions.

With the improvement of STM, Numpy and other projects, I believe that PyPy will replace CPython.

20. Use performance analysis tools

In addition to the timeit module used in IPython above, there is also cProfile. Using cProfile is very simple: run python -m cProfile filename.py, where filename.py is the program to be profiled. The standard output shows how many times each function is called and how long it runs, letting you find the program's performance bottlenecks and then optimize them in a targeted way.
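cProfile can also be invoked programmatically (a small illustrative example; slow_sum is a made-up function name):

import cProfile

def slow_sum(n):
  total = 0
  for i in range(n):
    total += i
  return total

# Profile a single call and sort the report by cumulative time.
cProfile.run('slow_sum(1000000)', sort='cumulative')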

The above is the entire content of this article. I hope it will be helpful to everyone's learning.

