
Efficient Python code

巴扎黑
2017-04-05 15:45:55

In my view, the Python community is split into three camps: Python 2.x, Python 3.x, and PyPy. The split basically comes down to library compatibility and speed. This article focuses on some common code optimization techniques and on the significant performance gains from compiling to C. I also give running times for all three camps. My goal is not to prove that one is better than another, only to give you an idea of how they compare on these specific examples in different contexts.

Using generators

A commonly overlooked memory optimization is the use of generators. A generator lets us write a function that yields one record at a time instead of returning all records at once. If you are on Python 2.x, this is why you use xrange instead of range, or ifilter instead of filter. A good example is building a large list of numbers and adding them together.

import timeit
import random

def generate(num):
    while num:
        yield random.randrange(10)
        num -= 1

def create_list(num):
    numbers = []
    while num:
        numbers.append(random.randrange(10))
        num -= 1
    return numbers

print(timeit.timeit("sum(generate(999))", setup="from __main__ import generate", number=1000))
>>> 0.88098192215 #Python 2.7
>>> 1.416813850402832 #Python 3.2
print(timeit.timeit("sum(create_list(999))", setup="from __main__ import create_list", number=1000))
>>> 0.924163103104 #Python 2.7
>>> 1.5026731491088867 #Python 3.2

Not only is this a little faster, it also prevents you from storing the entire list in memory!
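To make the memory point concrete, here is a minimal sketch. The helper names squares_gen and squares_list are made up for illustration, and the size comparison assumes CPython, where sys.getsizeof is available. It only measures the container objects themselves, not the integers they refer to.

```python
import sys

def squares_gen(n):
    # Yields one value at a time; nothing is stored between iterations.
    for i in range(n):
        yield i * i

def squares_list(n):
    # Builds and returns the whole list before anything can be summed.
    return [i * i for i in range(n)]

gen = squares_gen(100000)
lst = squares_list(100000)

# Size of the generator object vs. size of the list's internal pointer array.
gen_size = sys.getsizeof(gen)
list_size = sys.getsizeof(lst)

# Both approaches produce the same total.
total_from_gen = sum(squares_gen(100000))
total_from_list = sum(lst)

print("generator object: %d bytes, list object: %d bytes" % (gen_size, list_size))
```

The generator object stays the same small size no matter how large n is, while the list grows with the number of elements.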

Introduction to ctypes

For performance-critical code, Python itself gives us an API for calling C functions, mainly through ctypes. You can use ctypes without writing any C code of your own. A precompiled standard C library is available by default, so we can load it directly. Let's go back to the generator example and see how long it takes when implemented with ctypes.

import timeit
from ctypes import cdll

def generate_c(num):
    #Load standard C library
    libc = cdll.LoadLibrary("libc.so.6") #Linux
    #libc = cdll.msvcrt #Windows
    while num:
        yield libc.rand() % 10
        num -= 1

print(timeit.timeit("sum(generate_c(999))", setup="from __main__ import generate_c", number=1000))
>>> 0.434374809265 #Python 2.7
>>> 0.7084300518035889 #Python 3.2

Just by switching to C's random function, the running time was cut roughly in half! Now, if I told you we can do even better, would you believe me?
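A possible variation on the ctypes example, shown only as a sketch: load the C library once at module level and declare the signature of rand explicitly. It assumes a POSIX system where ctypes.util.find_library("c") can locate libc. The explicit argtypes and restype declarations are my addition and are not part of the timings above.

```python
import ctypes
import ctypes.util

# Assumption: POSIX system. find_library("c") returns a name such as "libc.so.6".
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature "int rand(void)" so ctypes converts values correctly.
libc.rand.argtypes = []
libc.rand.restype = ctypes.c_int

def generate_c(num):
    # Same generator as before, but the library is loaded only once, not per call.
    while num:
        yield libc.rand() % 10
        num -= 1

digits = list(generate_c(999))
print(sum(digits))
```

Declaring argtypes and restype is optional for a function like rand that takes no arguments and returns an int, but it becomes important for functions with pointer or floating-point parameters.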

Introduction to Cython

Cython is a superset of Python that lets us call C functions and declare variable types to improve performance. We need to install Cython before we can use it.

sudo pip install cython

Cython is essentially a fork of Pyrex, a similar library that is no longer under development. It compiles our Python-like code into a C library that we can call from a Python file. Use the .pyx suffix instead of .py for the files you want Cython to compile. Let's see how to run our generator code with Cython.

#cython_generator.pyx
import random

def generate(num):
    while num:
        yield random.randrange(10)
        num -= 1

We need to create a setup.py so that we can get Cython to compile our function.

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [Extension("generator", ["cython_generator.pyx"])]
)

Compile with:

python setup.py build_ext --inplace

You should now see two files: cython_generator.c and generator.so. We can test our program like this:

import timeit
print(timeit.timeit("sum(generator.generate(999))", setup="import generator", number=1000))
>>> 0.835658073425

Not bad. Let's see if there is anything we can improve. First we can declare "num" as an integer, and then we can import the standard C library to handle our random function.

#cython_generator.pyx
cdef extern from "stdlib.h":
    int c_libc_rand "rand"()

def generate(int num):
    while num:
        yield c_libc_rand() % 10
        num -= 1

If we compile and run again we will see this amazing number.

>>> 0.033586025238

Just a few changes brought an impressive result. However, making changes like these can be tedious, so let's see what we can do with regular Python.

Introduction to PyPy

PyPy is a just-in-time compiler for Python 2.7.3. In layman's terms, this means it makes your code run faster. Quora uses PyPy in production. PyPy has installation instructions on its download page, and if you are using Ubuntu you can install it via apt-get. It works out of the box: no crazy bash or build scripts, just download and run. Let's see how our original generator code performs under PyPy.

import timeit
import random

def generate(num):
    while num:
        yield random.randrange(10)
        num -= 1

def create_list(num):
    numbers = []
    while num:
        numbers.append(random.randrange(10))
        num -= 1
    return numbers

print(timeit.timeit("sum(generate(999))", setup="from __main__ import generate", number=1000))
>>> 0.115154981613 #PyPy 1.9
>>> 0.118431091309 #PyPy 2.0b1
print(timeit.timeit("sum(create_list(999))", setup="from __main__ import create_list", number=1000))
>>> 0.140175104141 #PyPy 1.9
>>> 0.140514850616 #PyPy 2.0b1

Wow! Without changing a single line of code, it runs roughly 8 times faster than the pure Python 2.7 run.
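When the same script is run under several interpreters, it helps to have each run label its own result. The following is a small sketch using the standard platform module. The helper name best_time and the repeat counts are arbitrary choices of mine, not part of the measurements above.

```python
import platform
import random
import timeit

def generate(num):
    while num:
        yield random.randrange(10)
        num -= 1

def best_time(repeat=3, number=100):
    # Run the benchmark several times and keep the fastest run,
    # which is less sensitive to background noise than a single run.
    return min(timeit.repeat(lambda: sum(generate(999)), repeat=repeat, number=number))

# For example "CPython 3.x.y" or "PyPy 3.x.y", depending on what runs the script.
label = "%s %s" % (platform.python_implementation(), platform.python_version())
best = best_time()
print("%s: %.6f s" % (label, best))
```

Running this file once with python and once with pypy prints one labeled line per interpreter, which makes the results easy to collect side by side.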

Further testing

Why investigate further? Isn't PyPy the champ? Not entirely. Although most programs run on PyPy, some libraries are not fully supported. Moreover, it is easier to write C extensions for your project than to switch compilers. Let's dig a little deeper and see how ctypes lets us write libraries in C. We will test the speed of merge sort and of computing the Fibonacci sequence. The following is the C code (functions.c) we will use:

/* functions.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* http://rosettacode.org/wiki/Sorting_algorithms/Merge_sort#C */
/* static: a plain "inline" definition may leave merge undefined at link time in C99 and later */
static inline void
merge (int *left, int l_len, int *right, int r_len, int *out)
{
    int i, j, k;
    for (i = j = k = 0; i < l_len && j < r_len;)
        out[k++] = left[i] < right[j] ? left[i++] : right[j++];
    while (i < l_len)
        out[k++] = left[i++];
    while (j < r_len)
        out[k++] = right[j++];
}

/* inner recursion of merge sort */
void
recur (int *buf, int *tmp, int len)
{
    int l = len / 2;
    if (len <= 1)
        return;
    /* note that buf and tmp are swapped */
    recur (tmp, buf, l);
    recur (tmp + l, buf + l, len - l);
    merge (tmp, l, tmp + l, len - l, buf);
}

/* preparation work before recursion */
void
merge_sort (int *buf, int len)
{
    /* call alloc, copy and free only once */
    int *tmp = malloc (sizeof (int) * len);
    memcpy (tmp, buf, sizeof (int) * len);
    recur (buf, tmp, len);
    free (tmp);
}

int
fibRec (int n)
{
    if (n < 2)
        return n;
    else
        return fibRec (n - 1) + fibRec (n - 2);
}

On the Linux platform, we can compile it into a shared library using the following method:

gcc -Wall -fPIC -c functions.c
gcc -shared -o libfunctions.so functions.o

With ctypes, you can use this library by loading the "libfunctions.so" shared library, just as we did with the standard C library earlier. Here we will compare the Python implementation with the C implementation. We start with the Fibonacci sequence:

# functions.py

from ctypes import *
import time

libfunctions = cdll.LoadLibrary("./libfunctions.so")

def fibRec(n):
    if n < 2:
        return n
    else:
        return fibRec(n-1) + fibRec(n-2)

start = time.time()
fibRec(32)
finish = time.time()
print("Python: " + str(finish - start))

# C Fibonacci
start = time.time()
x = libfunctions.fibRec(32)
finish = time.time()
print("C: " + str(finish - start))

As we expected, C is faster than Python and PyPy. We can also compare merge sorts in the same way.
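The start/finish timing pattern used above can be wrapped in a small helper. This is only a sketch: the names timed and fib_rec are mine, and it uses time.perf_counter, which assumes Python 3.3 or later and is documented as the highest-resolution clock for measuring short durations. The timings in this article were taken with time.time().

```python
import time

def fib_rec(n):
    # Naive recursive Fibonacci, the same algorithm as fibRec above.
    if n < 2:
        return n
    return fib_rec(n - 1) + fib_rec(n - 2)

def timed(func, *args):
    # Call func(*args) once and return (result, elapsed seconds).
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

value, seconds = timed(fib_rec, 20)
print("Python: fib(20) = %d in %.6f s" % (value, seconds))
```

The same helper could be passed a function loaded through ctypes, which keeps the Python and C measurements on exactly the same footing.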

We haven't dug deep into the ctypes library yet, so these examples don't show its more powerful side. ctypes has only a handful of standard types, such as int, char array, float, and bytes. By default there is no integer array type, but by multiplying c_int (the ctypes type for int) by a number we can obtain one indirectly. This is what line 7 of the code below does (the c_numbers assignment): we create a c_int array the same length as our list of numbers and unpack the numbers into it.

The important point is that C cannot return an array like this, and you would not want it to anyway. Instead, we pass pointers so that the function can modify the data in place. To hand our c_numbers array to the merge_sort function, we must pass it by reference. After merge_sort runs, the c_numbers array is sorted. I have added the code below to my functions.py file.

#Python Merge Sort
from random import shuffle, sample

#Generate 9999 random numbers between 0 and 100000
numbers = sample(range(100000), 9999)
shuffle(numbers)
c_numbers = (c_int * len(numbers))(*numbers)

from heapq import merge
def merge_sort(m):
    if len(m) <= 1:
        return m
    middle = len(m) // 2
    left = m[:middle]
    right = m[middle:]
    left = merge_sort(left)
    right = merge_sort(right)
    return list(merge(left, right))

start = time.time()
numbers = merge_sort(numbers)
finish = time.time()
print("Python: " + str(finish - start))

#C Merge Sort
start = time.time()
libfunctions.merge_sort(byref(c_numbers), len(numbers))
finish = time.time()
print("C: " + str(finish - start))

Python: 0.190635919571 #Python 2.7
Python: 0.11785483360290527 #Python 3.2
Python: 0.266992092133 #PyPy 1.9
Python: 0.265724897385 #PyPy 2.0b1
C: 0.00201296806335 #Python 2.7 + ctypes
C: 0.0019741058349609375 #Python 3.2 + ctypes
C: 0.0029308795929 #PyPy 1.9 + ctypes
C: 0.00287103652954 #PyPy 2.0b1 + ctypes
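The array-passing pattern can also be tried without compiling anything, by handing a c_int array to the C library's own qsort. This is a sketch, assuming a POSIX system where ctypes.util.find_library("c") finds libc. The names IntArray, CMPFUNC, and compare are illustrative, and qsort stands in for merge_sort here purely so that the example is self-contained.

```python
import ctypes
import ctypes.util

# Assumption: POSIX system with a discoverable libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

numbers = [5, 1, 4, 2, 3]

# Multiplying c_int by a length creates an array type; calling it fills the array.
IntArray = ctypes.c_int * len(numbers)
c_numbers = IntArray(*numbers)

# C signature of the comparison callback: int (*)(const int *, const int *)
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

def compare(a, b):
    # a and b point at single ints; return negative, zero, or positive.
    return (a[0] > b[0]) - (a[0] < b[0])

cmp_callback = CMPFUNC(compare)  # keep a reference alive during the call

libc.qsort.argtypes = [ctypes.POINTER(ctypes.c_int), ctypes.c_size_t,
                       ctypes.c_size_t, CMPFUNC]
libc.qsort.restype = None

# The array is passed as a pointer, so C sorts the Python-owned buffer in place.
libc.qsort(c_numbers, len(c_numbers), ctypes.sizeof(ctypes.c_int), cmp_callback)

sorted_numbers = list(c_numbers)
print(sorted_numbers)
```

Note that the original Python list is untouched: only the c_int array, whose memory C was given a pointer to, ends up sorted.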

The table below compares the different results (times in seconds).

                      Merge Sort   Fibonacci
Python 2.7            0.191        1.187
Python 2.7 + ctypes   0.002        0.044
Python 3.2            0.118        1.272
Python 3.2 + ctypes   0.002        0.046
PyPy 1.9              0.267        0.564
PyPy 1.9 + ctypes     0.003        0.048
PyPy 2.0b1            0.266        0.567
PyPy 2.0b1 + ctypes   0.003        0.046
