search
HomeBackend DevelopmentPython TutorialThe execution process of a Python program includes converting source code into bytecode (i.e. compilation) and executing the bytecode

Question:

We have to write some Python programs every day, either to process some text, or to do some system management work. After the program is written, you only need to type the python command to start the program and start executing it:

$ python some-program.py

So, how is a .py file in text form converted step by step into something that can be executed by the CPU? What about machine instructions? In addition, .pyc files may be generated during program execution. What are the functions of these files?

1. Execution process

Although Python looks more like an interpreted language like Shell script in terms of behavior, in fact, the execution principle of Python program is essentially the same as that of Java or C# and can be summarized For virtual machine and bytecode. Python executes the program in two steps: first compile the program code into bytecode, and then start the virtual machine to execute the bytecode:

The execution process of a Python program includes converting source code into bytecode (i.e. compilation) and executing the bytecode

Although the Python command is also called the Python interpreter , but it is fundamentally different from other scripting language interpreters. In fact, the Python interpreter consists of compiler and virtual machine. When the Python interpreter is started, it mainly performs the following two steps:

The compiler compiles the Python source code in the .py file into bytecode. The virtual machine executes the bytecode generated by the compiler line by line.

Therefore, the Python statements in the .py file are not directly converted into machine instructions, but into Python bytecode.

2. Bytecode

The compiled result of the Python program is bytecode, which contains a lot of content related to the operation of Python. Therefore, whether it is to have a deeper understanding of the operating mechanism of the Python virtual machine or to optimize the operating efficiency of the Python program, bytecode is the key content. So, what does Python bytecode look like? How can we obtain the bytecode of a Python program? Python provides a built-in function compile for instant compilation of source code. We only need to call the compile function with the source code to be compiled as a parameter to obtain the compilation result of the source code.

3. Source code compilation

Below, we compile a program through the compile function:

The source code is saved in the demo.py file:

PI = 3.14

def circle_area(r):
    return PI * r ** 2

class Person(object):
    def __init__(self, name):
        self.name = name

    def say(self):
        print('i am', self.name)

Compile Previously, the source code needed to be read from the file:

>>> text = open('D:\myspace\code\pythonCode\mix\demo.py').read()
>>> print(text)
PI = 3.14

def circle_area(r):
    return PI * r ** 2

class Person(object):
    def __init__(self, name):
        self.name = name

    def say(self):
        print('i am', self.name)

Then call the compile function to compile the source code:

>>> result = compile(text,'D:\myspace\code\pythonCode\mix\demo.py', 'exec')

There are 3 required parameters for the compile function:

source : Source code to be compiled

filename: file name where the source code is located

mode: compilation mode, exec means compiling the source code as a module

Three compilation modes:

exec: used to compile module source code

single: used to compile a single Python statement (interactively)

eval: used to compile an eval expression

4. PyCodeObject

Through the compile function, we get the final source code compilation result result:

>>> result
<code object <module> at 0x000001DEC2FCF680, file "D:\myspace\code\pythonCode\mix\demo.py", line 1>
>>> result.__class__
<class &#39;code&#39;>

Finally we get a code type object, and its corresponding underlying structure is PyCodeObject

The source code of PyCodeObject is as follows:

/* Bytecode object */
struct PyCodeObject {
    PyObject_HEAD
    int co_argcount;            /* #arguments, except *args */
    int co_posonlyargcount;     /* #positional only arguments */
    int co_kwonlyargcount;      /* #keyword only arguments */
    int co_nlocals;             /* #local variables */
    int co_stacksize;           /* #entries needed for evaluation stack */
    int co_flags;               /* CO_..., see below */
    int co_firstlineno;         /* first source line number */
    PyObject *co_code;          /* instruction opcodes */
    PyObject *co_consts;        /* list (constants used) */
    PyObject *co_names;         /* list of strings (names used) */
    PyObject *co_varnames;      /* tuple of strings (local variable names) */
    PyObject *co_freevars;      /* tuple of strings (free variable names) */
    PyObject *co_cellvars;      /* tuple of strings (cell variable names) */
    /* The rest aren&#39;t used in either hash or comparisons, except for co_name,
       used in both. This is done to preserve the name and line number
       for tracebacks and debuggers; otherwise, constant de-duplication
       would collapse identical functions/lambdas defined on different lines.
    */
    Py_ssize_t *co_cell2arg;    /* Maps cell vars which are arguments. */
    PyObject *co_filename;      /* unicode (where it was loaded from) */
    PyObject *co_name;          /* unicode (name, for reference) */
    PyObject *co_linetable;     /* string (encoding addr<->lineno mapping) See
                                   Objects/lnotab_notes.txt for details. */
    void *co_zombieframe;       /* for optimization only (see frameobject.c) */
    PyObject *co_weakreflist;   /* to support weakrefs to code objects */
    /* Scratch space for extra data relating to the code object.
       Type is a void* to keep the format private in codeobject.c to force
       people to go through the proper APIs. */
    void *co_extra;

    /* Per opcodes just-in-time cache
     *
     * To reduce cache size, we use indirect mapping from opcode index to
     * cache object:
     *   cache = co_opcache[co_opcache_map[next_instr - first_instr] - 1]
     */

    // co_opcache_map is indexed by (next_instr - first_instr).
    //  * 0 means there is no cache for this opcode.
    //  * n > 0 means there is cache in co_opcache[n-1].
    unsigned char *co_opcache_map;
    _PyOpcache *co_opcache;
    int co_opcache_flag;  // used to determine when create a cache.
    unsigned char co_opcache_size;  // length of co_opcache.
};

The code object PyCodeObject is used to store the compilation results, including bytecodes and constants, names, etc. involved in the code. Key fields include:

##co_argcountNumber of parametersco_kwonlyargcountNumber of keyword parametersco_nlocalsPartial Number of variablesco_stacksizeStack space required to execute the codeco_flagsIdentificationco_firstlinenoThe first line number of the code blockco_codeInstruction operation code, that is, bytecode co_constsConstant listco_namesName listco_varnamesLocal variable name list

下面打印看一下这些字段对应的数据:

通过co_code字段获得字节码:

>>> result.co_code
b&#39;d\x00Z\x00d\x01d\x02\x84\x00Z\x01G\x00d\x03d\x04\x84\x00d\x04e\x02\x83\x03Z\x03d\x05S\x00&#39;

通过co_names字段获得代码对象涉及的所有名字:

>>> result.co_names
(&#39;PI&#39;, &#39;circle_area&#39;, &#39;object&#39;, &#39;Person&#39;)

通过co_consts字段获得代码对象涉及的所有常量:

>>> result.co_consts
(3.14, <code object circle_area at 0x0000023D04D3F310, file "D:\myspace\code\pythonCode\mix\demo.py", line 3>, &#39;circle_area&#39;, <code object Person at 0x0000023D04D3F5D0, file "D:\myspace\code\pythonCode\mix\demo.py", line 6>, &#39;Person&#39;, None)

可以看到,常量列表中还有两个代码对象,其中一个是circle_area函数体,另一个是Person类定义体。对应Python中作用域的划分方式,可以自然联想到:每个作用域对应一个代码对象。如果这个假设成立,那么Person代码对象的常量列表中应该还包括两个代码对象:init函数体和say函数体。下面取出Person类代码对象来看一下:

>>> person_code = result.co_consts[3]
>>> person_code
<code object Person at 0x0000023D04D3F5D0, file "D:\myspace\code\pythonCode\mix\demo.py", line 6>
>>> person_code.co_consts
(&#39;Person&#39;, <code object __init__ at 0x0000023D04D3F470, file "D:\myspace\code\pythonCode\mix\demo.py", line 7>, &#39;Person.__init__&#39;, <code object say at 0x0000023D04D3F520, file "D:\myspace\code\pythonCode\mix\demo.py", line 10>, &#39;Person.say&#39;, None)

因此,我们得出结论:Python源码编译后,每个作用域都对应着一个代码对象,子作用域代码对象位于父作用域代码对象的常量列表里,层级一一对应。

The execution process of a Python program includes converting source code into bytecode (i.e. compilation) and executing the bytecode

至此,我们对Python源码的编译结果——代码对象PyCodeObject有了最基本的认识,后续会在虚拟机、函数机制、类机制中进一步学习。

5. 反编译

字节码是一串不可读的字节序列,跟二进制机器码一样。如果想读懂机器码,可以将其反汇编,那么字节码可以反编译吗?

通过dis模块可以将字节码反编译:

>>> import dis
>>> dis.dis(result.co_code)
 0 LOAD_CONST               0 (0)
 2 STORE_NAME               0 (0)
 4 LOAD_CONST               1 (1)
 6 LOAD_CONST               2 (2)
 8 MAKE_FUNCTION            0
10 STORE_NAME               1 (1)
12 LOAD_BUILD_CLASS
14 LOAD_CONST               3 (3)
16 LOAD_CONST               4 (4)
18 MAKE_FUNCTION            0
20 LOAD_CONST               4 (4)
22 LOAD_NAME                2 (2)
24 CALL_FUNCTION            3
26 STORE_NAME               3 (3)
28 LOAD_CONST               5 (5)
30 RETURN_VALUE

字节码反编译后的结果和汇编语言很类似。其中,第一列是字节码的偏移量,第二列是指令,第三列是操作数。以第一条字节码为例,LOAD_CONST指令将常量加载进栈,常量下标由操作数给出,而下标为0的常量是:

>>> result.co_consts[0]3.14

这样,第一条字节码的意义就明确了:将常量3.14加载到栈。

由于代码对象保存了字节码、常量、名字等上下文信息,因此直接对代码对象进行反编译可以得到更清晰的结果:

>>>dis.dis(result)
  1           0 LOAD_CONST               0 (3.14)
              2 STORE_NAME               0 (PI)

  3           4 LOAD_CONST               1 (<code object circle_area at 0x0000023D04D3F310, file "D:\myspace\code\pythonCode\mix\demo.py", line 3>)
              6 LOAD_CONST               2 (&#39;circle_area&#39;)
              8 MAKE_FUNCTION            0
             10 STORE_NAME               1 (circle_area)

  6          12 LOAD_BUILD_CLASS
             14 LOAD_CONST               3 (<code object Person at 0x0000023D04D3F5D0, file "D:\myspace\code\pythonCode\mix\demo.py", line 6>)
             16 LOAD_CONST               4 (&#39;Person&#39;)
             18 MAKE_FUNCTION            0
             20 LOAD_CONST               4 (&#39;Person&#39;)
             22 LOAD_NAME                2 (object)
             24 CALL_FUNCTION            3
             26 STORE_NAME               3 (Person)
             28 LOAD_CONST               5 (None)
             30 RETURN_VALUE

Disassembly of <code object circle_area at 0x0000023D04D3F310, file "D:\myspace\code\pythonCode\mix\demo.py", line 3>:
  4           0 LOAD_GLOBAL              0 (PI)
              2 LOAD_FAST                0 (r)
              4 LOAD_CONST               1 (2)
              6 BINARY_POWER
              8 BINARY_MULTIPLY
             10 RETURN_VALUE

Disassembly of <code object Person at 0x0000023D04D3F5D0, file "D:\myspace\code\pythonCode\mix\demo.py", line 6>:
  6           0 LOAD_NAME                0 (__name__)
              2 STORE_NAME               1 (__module__)
              4 LOAD_CONST               0 (&#39;Person&#39;)
              6 STORE_NAME               2 (__qualname__)

  7           8 LOAD_CONST               1 (<code object __init__ at 0x0000023D04D3F470, file "D:\myspace\code\pythonCode\mix\demo.py", line 7>)
             10 LOAD_CONST               2 (&#39;Person.__init__&#39;)
             12 MAKE_FUNCTION            0
             14 STORE_NAME               3 (__init__)

 10          16 LOAD_CONST               3 (<code object say at 0x0000023D04D3F520, file "D:\myspace\code\pythonCode\mix\demo.py", line 10>)
             18 LOAD_CONST               4 (&#39;Person.say&#39;)
             20 MAKE_FUNCTION            0
             22 STORE_NAME               4 (say)
             24 LOAD_CONST               5 (None)
             26 RETURN_VALUE

Disassembly of <code object __init__ at 0x0000023D04D3F470, file "D:\myspace\code\pythonCode\mix\demo.py", line 7>:
  8           0 LOAD_FAST                1 (name)
              2 LOAD_FAST                0 (self)
              4 STORE_ATTR               0 (name)
              6 LOAD_CONST               0 (None)
              8 RETURN_VALUE

Disassembly of <code object say at 0x0000023D04D3F520, file "D:\myspace\code\pythonCode\mix\demo.py", line 10>:
 11           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 (&#39;i am&#39;)
              4 LOAD_FAST                0 (self)
              6 LOAD_ATTR                1 (name)
              8 CALL_FUNCTION            2
             10 POP_TOP
             12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

操作数指定的常量或名字的实际值在旁边的括号内列出,此外,字节码以语句为单位进行了分组,中间以空行隔开,语句的行号在字节码前面给出。例如PI = 3.14这个语句就被会变成了两条字节码:

  1           0 LOAD_CONST               0 (3.14)
              2 STORE_NAME               0 (PI)

6. pyc

如果将demo作为模块导入,Python将在demo.py文件所在目录下生成.pyc文件:

>>> import demo

The execution process of a Python program includes converting source code into bytecode (i.e. compilation) and executing the bytecode

pyc文件会保存经过序列化处理的代码对象PyCodeObject。这样一来,Python后续导入demo模块时,直接读取pyc文件并反序列化即可得到代码对象,避免了重复编译导致的开销。只有demo.py有新修改(时间戳比.pyc文件新),Python才会重新编译。

因此,对比Java而言:Python中的.py文件可以类比Java中的.java文件,都是源码文件;而.pyc文件可以类比.class文件,都是编译结果。只不过Java程序需要先用编译器javac命令来编译,再用虚拟机java命令来执行;而Python解释器把这两个过程都完成了。

Field Purpose

The above is the detailed content of The execution process of a Python program includes converting source code into bytecode (i.e. compilation) and executing the bytecode. For more information, please follow other related articles on the PHP Chinese website!

Statement
This article is reproduced at:亿速云. If there is any infringement, please contact admin@php.cn delete
Python vs. C  : Learning Curves and Ease of UsePython vs. C : Learning Curves and Ease of UseApr 19, 2025 am 12:20 AM

Python is easier to learn and use, while C is more powerful but complex. 1. Python syntax is concise and suitable for beginners. Dynamic typing and automatic memory management make it easy to use, but may cause runtime errors. 2.C provides low-level control and advanced features, suitable for high-performance applications, but has a high learning threshold and requires manual memory and type safety management.

Python vs. C  : Memory Management and ControlPython vs. C : Memory Management and ControlApr 19, 2025 am 12:17 AM

Python and C have significant differences in memory management and control. 1. Python uses automatic memory management, based on reference counting and garbage collection, simplifying the work of programmers. 2.C requires manual management of memory, providing more control but increasing complexity and error risk. Which language to choose should be based on project requirements and team technology stack.

Python for Scientific Computing: A Detailed LookPython for Scientific Computing: A Detailed LookApr 19, 2025 am 12:15 AM

Python's applications in scientific computing include data analysis, machine learning, numerical simulation and visualization. 1.Numpy provides efficient multi-dimensional arrays and mathematical functions. 2. SciPy extends Numpy functionality and provides optimization and linear algebra tools. 3. Pandas is used for data processing and analysis. 4.Matplotlib is used to generate various graphs and visual results.

Python and C  : Finding the Right ToolPython and C : Finding the Right ToolApr 19, 2025 am 12:04 AM

Whether to choose Python or C depends on project requirements: 1) Python is suitable for rapid development, data science, and scripting because of its concise syntax and rich libraries; 2) C is suitable for scenarios that require high performance and underlying control, such as system programming and game development, because of its compilation and manual memory management.

Python for Data Science and Machine LearningPython for Data Science and Machine LearningApr 19, 2025 am 12:02 AM

Python is widely used in data science and machine learning, mainly relying on its simplicity and a powerful library ecosystem. 1) Pandas is used for data processing and analysis, 2) Numpy provides efficient numerical calculations, and 3) Scikit-learn is used for machine learning model construction and optimization, these libraries make Python an ideal tool for data science and machine learning.

Learning Python: Is 2 Hours of Daily Study Sufficient?Learning Python: Is 2 Hours of Daily Study Sufficient?Apr 18, 2025 am 12:22 AM

Is it enough to learn Python for two hours a day? It depends on your goals and learning methods. 1) Develop a clear learning plan, 2) Select appropriate learning resources and methods, 3) Practice and review and consolidate hands-on practice and review and consolidate, and you can gradually master the basic knowledge and advanced functions of Python during this period.

Python for Web Development: Key ApplicationsPython for Web Development: Key ApplicationsApr 18, 2025 am 12:20 AM

Key applications of Python in web development include the use of Django and Flask frameworks, API development, data analysis and visualization, machine learning and AI, and performance optimization. 1. Django and Flask framework: Django is suitable for rapid development of complex applications, and Flask is suitable for small or highly customized projects. 2. API development: Use Flask or DjangoRESTFramework to build RESTfulAPI. 3. Data analysis and visualization: Use Python to process data and display it through the web interface. 4. Machine Learning and AI: Python is used to build intelligent web applications. 5. Performance optimization: optimized through asynchronous programming, caching and code

Python vs. C  : Exploring Performance and EfficiencyPython vs. C : Exploring Performance and EfficiencyApr 18, 2025 am 12:20 AM

Python is better than C in development efficiency, but C is higher in execution performance. 1. Python's concise syntax and rich libraries improve development efficiency. 2.C's compilation-type characteristics and hardware control improve execution performance. When making a choice, you need to weigh the development speed and execution efficiency based on project needs.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Atom editor mac version download

Atom editor mac version download

The most popular open source editor

SublimeText3 Linux new version

SublimeText3 Linux new version

SublimeText3 Linux latest version

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

SublimeText3 English version

SublimeText3 English version

Recommended: Win version, supports code prompts!

SAP NetWeaver Server Adapter for Eclipse

SAP NetWeaver Server Adapter for Eclipse

Integrate Eclipse with SAP NetWeaver application server.