Detailed description of the usage of struct.pack() and struct.unpack() in Python
The struct module in Python is mainly used to process C structure data. When reading, the data first arrives as a Python string (a bytes object in Python 3) and is then converted to a structured Python type such as a tuple. The input usually comes from files or network binary streams.
1. struct.pack() and struct.unpack()
The conversion is driven by a format string, which specifies how the values are converted and how they are laid out.
Let’s talk about the main methods:
1.1 struct.pack(fmt,v1,v2,...)
Packs the values v1, v2, ... according to the format string fmt. The values must match fmt exactly. The packed result is returned as a string (a bytes object in Python 3).
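As a small illustration of the strictness described above, the following Python 3 sketch packs three integers and then shows what happens when the arguments do not match the format. The format "<ihb" and the values are illustrative choices only, not taken from the original article.

```python
import struct

# Pack an int, a short and a signed char (little-endian, standard sizes).
packed = struct.pack("<ihb", 1, 2, 3)
print(packed)        # b'\x01\x00\x00\x00\x02\x00\x03'
print(len(packed))   # 7

# Too few values for the format: struct.error is raised.
try:
    struct.pack("<ihb", 1, 2)
    wrong_count_ok = True
except struct.error:
    wrong_count_ok = False

# A value that does not fit the format character ('b' holds -128..127).
try:
    struct.pack("<b", 300)
    out_of_range_ok = True
except struct.error:
    out_of_range_ok = False

print(wrong_count_ok, out_of_range_ok)  # False False
```

Both failure cases raise struct.error, so one except clause is enough to catch a mismatch between fmt and the arguments.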
1.2 struct.unpack(fmt,string)
As the name suggests, this unpacks data: whatever pack() has packed, unpack() can unpack. It returns a tuple of the values unpacked from the data (string); even a single value comes back as a tuple. Note that len(string) must equal calcsize(fmt), which brings in the calcsize function: struct.calcsize(fmt) returns the size in bytes of the structure described by the format fmt.
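The relationship between unpack() and calcsize() can be sketched as follows in Python 3. The format "<ih" and the values 10, 20 and 7 are hypothetical examples chosen for illustration.

```python
import struct

fmt = "<ih"                      # int followed by short, 6 bytes in total
size = struct.calcsize(fmt)
data = struct.pack(fmt, 10, 20)

# The buffer length must match calcsize(fmt) exactly.
assert len(data) == size

pair = struct.unpack(fmt, data)
print(pair)                      # (10, 20)

# A single value still comes back as a one-element tuple.
(single,) = struct.unpack("<h", struct.pack("<h", 7))
print(single)                    # 7

# A buffer of the wrong length raises struct.error.
try:
    struct.unpack(fmt, data + b"\x00")
    length_error = False
except struct.error:
    length_error = True
print(length_error)              # True
```

Unpacking with a trailing comma, as in (single,) = ..., is a common way to pull the one value out of the tuple.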
The format string consists of one or more format characters. For a description of these format characters, refer to the struct section of the Python manual.
2. Code example
import struct

# native byte order
buffer = struct.pack("ihb", 1, 2, 3)
print(repr(buffer))
print(struct.unpack("ihb", buffer))

# data from a sequence, network byte order
data = [1, 2, 3]
buffer = struct.pack("!ihb", *data)
print(repr(buffer))
print(struct.unpack("!ihb", buffer))

Output (Python 2 on a little-endian machine; Python 3 shows the byte strings with a b prefix):

'\x01\x00\x00\x00\x02\x00\x03'
(1, 2, 3)
'\x00\x00\x00\x01\x00\x02\x03'
(1, 2, 3)
First, the values 1, 2 and 3 are packed. Before packing, they are ordinary Python integers; after packing, they become a binary string laid out like a C struct, which Python displays as '\x01\x00\x00\x00\x02\x00\x03'. This machine is little-endian, so the low-order byte of each value is stored at the lowest address. i represents the int type in a C struct, which occupies 4 bytes on this machine, so 1 is represented as 01000000; h represents the short type in a C struct, which occupies 2 bytes, so 2 is represented as 0200; similarly, b represents the signed char type in a C struct, which occupies 1 byte, so 3 is represented as 03.
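The byte layout described above can be checked with a short Python 3 sketch. It uses the "<" prefix (little-endian, standard sizes) so that the result does not depend on the machine running it; that prefix is an assumption added here, since the original example uses the native mode.

```python
import struct

# Standard sizes (the "<" prefix) are the same on every platform.
sizes = {code: struct.calcsize("<" + code) for code in "ihb"}
print(sizes)   # {'i': 4, 'h': 2, 'b': 1}

packed = struct.pack("<ihb", 1, 2, 3)
# Split the buffer back into its three fields to see each one in hex.
fields = [packed[0:4].hex(), packed[4:6].hex(), packed[6:7].hex()]
print(fields)  # ['01000000', '0200', '03']
```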
The conversion of other structures is similar. For special cases, refer to the official Python documentation.
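The format string may begin with an optional character that selects the byte order (big-endian or little-endian), size and alignment. The options listed in the Python manual are: @ (native byte order, native size and alignment), = (native byte order, standard size, no alignment), < (little-endian, standard size, no alignment), > (big-endian, standard size, no alignment) and ! (network byte order, which is big-endian, standard size, no alignment).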
If no prefix is given, the default is @, which uses the native byte order (big-endian or little-endian) together with the machine's native C type sizes and memory alignment. For example, on some machines an int is 2 bytes and on others it is 4 bytes; some machines align data on 4-byte boundaries and others on n-byte boundaries, where n depends on the platform.
There is also a standard mode, selected by =, <, > or !. In standard mode each type has a fixed size and no padding is inserted for alignment.
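The difference between native and standard mode shows up most clearly in calcsize(). The Python 3 sketch below uses the format "bi" (a char followed by an int) as an illustrative case; the native result depends on the platform, so no fixed value is claimed for it.

```python
import struct

# A char followed by an int: native mode may insert padding before the int.
native_size = struct.calcsize("@bi")    # platform dependent, commonly 8
standard_size = struct.calcsize("=bi")  # always 1 + 4 = 5, no padding

print(native_size, standard_size)

# Omitting the prefix is the same as using "@".
assert struct.calcsize("bi") == native_size
```

When the data has to be exchanged between machines, a standard-mode prefix is usually the safer choice, because the layout no longer depends on the compiler or CPU.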
For example, in the second half of the example program above, the first character of the format string is !, which selects network (big-endian) byte order with standard sizes and no alignment. The output is therefore '\x00\x00\x00\x01\x00\x02\x03', with the high-order byte of each value stored at the lowest address.
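To make the effect of the byte-order prefix concrete, here is a minimal Python 3 sketch that packs the same integer with three different prefixes. The variable names are illustrative only.

```python
import struct

value = 1
little = struct.pack("<i", value)   # least significant byte first
big = struct.pack(">i", value)      # most significant byte first
network = struct.pack("!i", value)  # network order, same as big-endian

print(little)          # b'\x01\x00\x00\x00'
print(big)             # b'\x00\x00\x00\x01'
print(network == big)  # True

# Reading big-endian bytes as little-endian gives a different number.
(misread,) = struct.unpack("<i", big)
print(misread)         # 16777216
```

The last step shows why both sides of a file format or network protocol need to agree on the prefix: the bytes are identical, but the decoded value is not.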