PYC file
The pyc file is a bytecode file generated by Python when interpreting and executing the source code. It contains the compilation results of the source code. and related metadata information so that Python can load and execute code faster.
Different from compiled languages, Python is an interpreted language and does not directly compile source code into machine code and execute it. Before running the code, the Python interpreter will first compile the source code into bytecode, and then interpret the bytecode for execution. The .pyc file is the bytecode file generated during this process.
When the Python interpreter executes a .py file for the first time, it will generate a corresponding .pyc file in the same directory so that it can be executed faster the next time the file is loaded. When the source file is modified and reloaded, the interpreter regenerates the .pyc file to update the cached bytecode.
Generate PYC file
Normal python files need to be turned into bytecode through the compiler, and then the bytecode is handed over to the python virtual machine, and then the python virtual machine executes the bytecode. The overall process is as follows:
#We can directly use the compile all module to generate the pyc file of the corresponding file.
➜ pvm ls demo.py hello.py ➜ pvm python -m compileall . Listing '.'... Listing './.idea'... Listing './.idea/inspectionProfiles'... Compiling './demo.py'... Compiling './hello.py'... ➜ pvm ls __pycache__ demo.py hello.py ➜ pvm ls __pycache__ demo.cpython-310.pyc hello.cpython-310.pyc
python -m compileall .
The command will recursively scan the py files in the current directory and generate the pyc file of the corresponding file.
PYC file layout
def f(): x = 1 return 2The hexadecimal form of the pyc file is as follows:
➜ __pycache__ hexdump -C hello.cpython-310.pyc 00000000 6f 0d 0d 0a 00 00 00 00 b9 48 21 64 20 00 00 00 |o........H!d ...| 00000010 e3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 00000020 00 02 00 00 00 40 00 00 00 73 0c 00 00 00 64 00 |.....@...s....d.| 00000030 64 01 84 00 5a 00 64 02 53 00 29 03 63 00 00 00 |d...Z.d.S.).c...| 00000040 00 00 00 00 00 00 00 00 00 01 00 00 00 01 00 00 |................| 00000050 00 43 00 00 00 73 08 00 00 00 64 01 7d 00 64 02 |.C...s....d.}.d.| 00000060 53 00 29 03 4e e9 01 00 00 00 e9 02 00 00 00 a9 |S.).N...........| 00000070 00 29 01 da 01 78 72 03 00 00 00 72 03 00 00 00 |.)...xr....r....| 00000080 fa 0a 2e 2f 68 65 6c 6c 6f 2e 70 79 da 01 66 01 |.../hello.py..f.| 00000090 00 00 00 73 04 00 00 00 04 01 04 01 72 06 00 00 |...s........r...| 000000a0 00 4e 29 01 72 06 00 00 00 72 03 00 00 00 72 03 |.N).r....r....r.| 000000b0 00 00 00 72 03 00 00 00 72 05 00 00 00 da 08 3c |...r....r......<| 000000c0 6d 6f 64 75 6c 65 3e 01 00 00 00 73 02 00 00 00 |module>....s....| 000000d0 0c 00 |..| 000000d2Because of data usage Little endian representation, so for the above data:
- The first part of the magic number is: 0xa0d0d6f.
- The second part Bit Field is: 0x0.
- The last modification date of the third part is: 0x642148b9.
- The file size of the fourth part is: 0x20 bytes, which means the size of the hello.py file is 32 bytes.
import struct import time import binascii fname = "./__pycache__/hello.cpython-310.pyc" f = open(fname, "rb") magic = struct.unpack('<l', f.read(4))[0] bit_filed = f.read(4) print(f"bit field = {binascii.hexlify(bit_filed)}") moddate = f.read(4) filesz = f.read(4) modtime = time.asctime(time.localtime(struct.unpack('<l', moddate)[0])) filesz = struct.unpack('<L', filesz) print("magic %s" % (hex(magic))) print("moddate (%s)" % (modtime)) print("File Size %d" % filesz) f.close()The output of the above code is as follows:
bit field = b'00000000'About pyc For detailed file operations, please view the source code of the python standard library importlib/_bootstrap_external.py file. CODEOBJECTIn CPython,magic 0xa0d0d6f
moddate (Mon Mar 27 15:41:45 2023)
File Size 32
CodeObject is an object, which contains the bytecode, constants, variables, positional parameters, keyword parameters and other information of the Python code , and some metadata used to run the code, such as file name, code line number, etc.
CodeObject and then execute it. During compilation, the interpreter converts Python code into bytecode and saves it in a
CodeObject object. Thereafter, whenever we call that module or function, the interpreter will use the bytecode in
CodeObject to execute the code.
CodeObject Objects are immutable and cannot be modified once created. This is because the bytecode of Python code is immutable, and the
CodeObject object contains these bytecodes, so it is also immutable.
现在举一个例子来分析一下 pycdemo.py 的 pyc 文件,pycdemo.py 的源程序如下所示:
if __name__ == '__main__': a = 100 print(a)
下面的代码是一个用于加载 pycdemo01.cpython-39.pyc 文件(也就是 hello.py 对应的 pyc 文件)的代码,使用 marshal 读取 pyc 文件里面的 code object 。
import marshal import dis import struct import time import types import binascii def print_metadata(fp): magic = struct.unpack('<l', fp.read(4))[0] print(f"magic number = {hex(magic)}") bit_field = struct.unpack('<l', fp.read(4))[0] print(f"bit filed = {bit_field}") t = struct.unpack('<l', fp.read(4))[0] print(f"time = {time.asctime(time.localtime(t))}") file_size = struct.unpack('<l', fp.read(4))[0] print(f"file size = {file_size}") def show_code(code, indent=''): print ("%scode" % indent) indent += ' ' print ("%sargcount %d" % (indent, code.co_argcount)) print ("%snlocals %d" % (indent, code.co_nlocals)) print ("%sstacksize %d" % (indent, code.co_stacksize)) print ("%sflags %04x" % (indent, code.co_flags)) show_hex("code", code.co_code, indent=indent) dis.disassemble(code) print ("%sconsts" % indent) for const in code.co_consts: if type(const) == types.CodeType: show_code(const, indent+' ') else: print(" %s%r" % (indent, const)) print("%snames %r" % (indent, code.co_names)) print("%svarnames %r" % (indent, code.co_varnames)) print("%sfreevars %r" % (indent, code.co_freevars)) print("%scellvars %r" % (indent, code.co_cellvars)) print("%sfilename %r" % (indent, code.co_filename)) print("%sname %r" % (indent, code.co_name)) print("%sfirstlineno %d" % (indent, code.co_firstlineno)) show_hex("lnotab", code.co_lnotab, indent=indent) def show_hex(label, h, indent): h = binascii.hexlify(h) if len(h) < 60: print("%s%s %s" % (indent, label, h)) else: print("%s%s" % (indent, label)) for i in range(0, len(h), 60): print("%s %s" % (indent, h[i:i+60])) if __name__ == '__main__': filename = "./__pycache__/pycdemo01.cpython-39.pyc" with open(filename, "rb") as fp: print_metadata(fp) code_object = marshal.load(fp) show_code(code_object)
执行上面的程序输出结果如下所示:
magic number = 0xa0d0d61 bit filed = 0 time = Tue Mar 28 02:40:20 2023 file size = 54 code argcount 0 nlocals 0 stacksize 2 flags 0040 code b'650064006b02721464015a01650265018301010064025300' 3 0 LOAD_NAME 0 (__name__) 2 LOAD_CONST 0 ('__main__') 4 COMPARE_OP 2 (==) 6 POP_JUMP_IF_FALSE 20 4 8 LOAD_CONST 1 (100) 10 STORE_NAME 1 (a) 5 12 LOAD_NAME 2 (print) 14 LOAD_NAME 1 (a) 16 CALL_FUNCTION 1 18 POP_TOP >> 20 LOAD_CONST 2 (None) 22 RETURN_VALUE consts '__main__' 100 None names ('__name__', 'a', 'print') varnames () freevars () cellvars () filename './pycdemo01.py' name '<module>' firstlineno 3 lnotab b'08010401'
下面是 code object 当中各个字段的作用:
首先需要了解一下代码块这个概念,所谓代码块就是一个小的 python 代码,被当做一个小的单元整体执行。在 Python 中常见的代码块包括函数体、类的定义和模块。
argcount,这个表示一个代码块的参数个数,这个参数只对函数体代码块有用,因为函数可能会有参数,比如上面的 pycdemo.py 是一个模块而不是一个函数,因此这个参数对应的值为 0 。
co_code,这个对象的具体内容就是一个字节序列,存储真实的 python 字节码,主要是用于 python 虚拟机执行的,在本篇文章当中暂时不详细分析。
co_consts,这个字段是一个列表类型的字段,主要是包含一些字符串常量和数值常量,比如上面的 ";main" 和 100 。
co_filename,这个字段的含义就是对应的源文件的文件名。
co_firstlineno,这个字段的含义为在 python 源文件当中第一行代码出现的行数,这个字段在进行调试的时候非常重要。
主要含义是标识该 code object 的类型的字段是 co_flags。0x0080 表示这个 block 是一个协程,0x0010 表示这个 code object 是嵌套的等等。
co_lnotab,这个字段的含义主要是用于计算每个字节码指令对应的源代码行数。
The main purpose of the field "co_varnames" is to indicate a name defined locally in a code object.。
co_names,和 co_varnames 相反,表示非本地定义但是在 code object 当中使用的名字。
co_nlocals,这个字段表示在一个 code object 当中本地使用的变量个数。
co_stackszie,因为 python 虚拟机是一个栈式计算机,这个参数的值表示这个栈需要的最大的值。
co_cellvars,co_freevars,这两个字段主要和嵌套函数和函数闭包有关。
The above is the detailed content of What is the pyc file structure of the python virtual machine?. For more information, please follow other related articles on the PHP Chinese website!

Python and C each have their own advantages, and the choice should be based on project requirements. 1) Python is suitable for rapid development and data processing due to its concise syntax and dynamic typing. 2)C is suitable for high performance and system programming due to its static typing and manual memory management.

Choosing Python or C depends on project requirements: 1) If you need rapid development, data processing and prototype design, choose Python; 2) If you need high performance, low latency and close hardware control, choose C.

By investing 2 hours of Python learning every day, you can effectively improve your programming skills. 1. Learn new knowledge: read documents or watch tutorials. 2. Practice: Write code and complete exercises. 3. Review: Consolidate the content you have learned. 4. Project practice: Apply what you have learned in actual projects. Such a structured learning plan can help you systematically master Python and achieve career goals.

Methods to learn Python efficiently within two hours include: 1. Review the basic knowledge and ensure that you are familiar with Python installation and basic syntax; 2. Understand the core concepts of Python, such as variables, lists, functions, etc.; 3. Master basic and advanced usage by using examples; 4. Learn common errors and debugging techniques; 5. Apply performance optimization and best practices, such as using list comprehensions and following the PEP8 style guide.

Python is suitable for beginners and data science, and C is suitable for system programming and game development. 1. Python is simple and easy to use, suitable for data science and web development. 2.C provides high performance and control, suitable for game development and system programming. The choice should be based on project needs and personal interests.

Python is more suitable for data science and rapid development, while C is more suitable for high performance and system programming. 1. Python syntax is concise and easy to learn, suitable for data processing and scientific computing. 2.C has complex syntax but excellent performance and is often used in game development and system programming.

It is feasible to invest two hours a day to learn Python. 1. Learn new knowledge: Learn new concepts in one hour, such as lists and dictionaries. 2. Practice and exercises: Use one hour to perform programming exercises, such as writing small programs. Through reasonable planning and perseverance, you can master the core concepts of Python in a short time.

Python is easier to learn and use, while C is more powerful but complex. 1. Python syntax is concise and suitable for beginners. Dynamic typing and automatic memory management make it easy to use, but may cause runtime errors. 2.C provides low-level control and advanced features, suitable for high-performance applications, but has a high learning threshold and requires manual memory and type safety management.


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

PhpStorm Mac version
The latest (2018.2.1) professional PHP integrated development tool

ZendStudio 13.5.1 Mac
Powerful PHP integrated development environment

WebStorm Mac version
Useful JavaScript development tools

Safe Exam Browser
Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

Notepad++7.3.1
Easy-to-use and free code editor