Introduction to a Python implementation of the Linux xxd -i command

高洛峰
2017-03-07 15:58:16

1. Linux xxd -i function

The Linux xxd command displays file contents in binary or hexadecimal form. If no outfile parameter is given, the result is printed to the terminal; otherwise it is written to outfile. For detailed usage, see the Linux xxd command reference.

This article focuses on the -i option of xxd. With this option, xxd outputs a C array definition named after the input file. For example, after running echo 12345 > test and then xxd -i test, the output is:

unsigned char test[] = {
  0x31, 0x32, 0x33, 0x34, 0x35, 0x0a
};
unsigned int test_len = 6;

As shown, the array name is the input file name (if the name has a suffix, the period is replaced by an underscore). Note that 0x0a is the newline character LF, i.e. '\n'.
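The output format described above can be sketched in a few lines of Python 3. This is only an illustration of the layout shown in the example (12 values per row, two-digit hex, a length variable at the end); the function name xxd_i and its parameters are made up for this sketch and are not part of xxd or of the article's script.

```python
def xxd_i(data, file_name, cols=12):
    """Render bytes in the layout of the `xxd -i` example above (sketch)."""
    # As described in the article: '.' in the file name becomes '_'.
    name = file_name.replace('.', '_')
    rows = []
    for i in range(0, len(data), cols):
        # Iterating over bytes in Python 3 yields integers.
        rows.append('  ' + ', '.join('0x%02x' % b for b in data[i:i + cols]))
    lines = ['unsigned char %s[] = {' % name]
    lines.append(',\n'.join(rows))
    lines.append('};')
    lines.append('unsigned int %s_len = %d;' % (name, len(data)))
    return '\n'.join(lines)


# The bytes written by `echo 12345 > test`.
print(xxd_i(b'12345\n', 'test'))
```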

2. Common uses of xxd -i

When a device has no file system or does not support dynamic memory management, the contents of binary files (such as a bootloader or firmware image) are sometimes stored in static arrays in C code. In that case, the array can be generated automatically with the xxd command. For example:

1) Use the Linux command xxd to convert the binary file VdslBooter.bin into the hexadecimal text file DslBooter.txt:

xxd -i < VdslBooter.bin > DslBooter.txt

Here, the -i option selects C include-file style output (array mode). The redirection symbol '<' passes the contents of VdslBooter.bin to xxd on standard input; when xxd reads from standard input it omits the array declaration and the length variable, so the output contains only the hexadecimal values.

[...] python xxdi.py at the Windows cmd command prompt, and the execution result is True.

The code snippets below omit the script and encoding declarations at the top of the file and the 'main' section at the end.

Before generating a C array, make sure the array name is legal. A C identifier may consist only of letters, digits, and underscores, and must not begin with a digit. In addition, keywords cannot be used as identifiers. All illegal characters therefore need to be handled. The rules are described in the code comments:

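import re

def GenerateCArrayName(inFile):
    # Convert every character other than letters, digits and underscores to an underscore.
    # 'int $=5;' compiles under GCC 4.1.2, but '$' is still treated as illegal here.
    inFile = re.sub('[^0-9a-zA-Z_]', '_', inFile)  # change '_' to '' to drop illegal characters instead
    # Prefix a double underscore if the name starts with a digit.
    if inFile[0].isdigit():
        inFile = '__' + inFile
    # If the file name is a C keyword, upper-case it and append an underscore to form the array name.
    # Merely upper-casing it or adding an underscore prefix could easily clash with user-defined names.
    if IsCKeywords(inFile):
        inFile = '%s_' % inFile.upper()
    return inFile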

Running print GenerateCArrayName('1a$if1#1_4.txt') converts the input string to __1a_if1_1_4_txt. Similarly, _Bool is converted to _BOOL_.
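The naming rules above can be tried out in isolation with the following Python 3 sketch. The keyword set C_KEYWORDS is an assumption standing in for the article's IsCKeywords helper (it lists the C99 keywords), and the name c_array_name is invented for this example.

```python
import re

# C99 keywords; an assumed stand-in for the article's IsCKeywords helper.
C_KEYWORDS = frozenset('''
auto break case char const continue default do double else enum extern
float for goto if inline int long register restrict return short signed
sizeof static struct switch typedef union unsigned void volatile while
_Bool _Complex _Imaginary
'''.split())


def c_array_name(file_name):
    """Turn a file name into a legal C identifier, following the rules above."""
    # Anything other than letters, digits and underscores becomes '_'.
    name = re.sub(r'[^0-9a-zA-Z_]', '_', file_name)
    # A leading digit gets a double-underscore prefix.
    if name[:1].isdigit():
        name = '__' + name
    # A keyword is upper-cased and given an underscore suffix.
    if name in C_KEYWORDS:
        name = name.upper() + '_'
    return name


print(c_array_name('1a$if1#1_4.txt'))
print(c_array_name('_Bool'))
```

Keeping the keywords in a frozenset makes the membership test a single lookup and keeps the list easy to extend for other C standards.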

To mimic the Linux command style as closely as possible, the script needs to accept command line options and arguments. Parsing is done with OptionParser from the optparse module. For details on its usage, see python command line parsing. The command line handling for the xxd -i-like function is implemented as follows:

#def ParseOption(base, cols, strip, inFile, outFile):
def ParseOption(base = 16, cols = 12, strip = False, inFile = '', outFile = None):
    from optparse import OptionParser
    custUsage = '\n xxdi(.py) [options] inFile [outFile]'
    parser = OptionParser(usage=custUsage)
    parser.add_option('-b', '--base', dest='base',
                      help='represent values according to BASE(default:16)')
    parser.add_option('-c', '--column', dest='col',
                      help='COL octets per line(default:12)')
    parser.add_option('-s', '--strip', action='store_true', dest='strip',
                      help='only output C array elements')
    (options, args) = parser.parse_args()
    if options.base is not None:
        base = int(options.base)
    if options.col is not None:
        cols = int(options.col)
    if options.strip is not None:
        strip = True
    if len(args) == 0:
        print 'No argument, at least one(inFile)!\nUsage:%s' %custUsage
    if len(args) >= 1:
        inFile = args[0]
    if len(args) >= 2:
        outFile = args[1]
    return ([base, cols, strip], [inFile, outFile])

The commented-out def ParseOption(...) was originally called like this:

base = 16; cols = 12; strip = False; inFile = ''; outFile = ''
([base, cols, strip], [inFile, outFile]) = ParseOption(base,
    cols, strip, inFile, outFile)

The intention was to update base, cols, strip and the other values in a single call, but that form is awkward. The definition with default parameters is used instead, so the caller only needs to write ParseOption(). If readers know a better way to write this, suggestions are welcome.
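One possible alternative, offered only as a sketch, is to let the parser own the defaults and return a single result object. The version below uses argparse (available from Python 2.7 and in Python 3) rather than the article's optparse; the function name parse_options and the attribute name cols are choices made for this example. Note one behavioural difference: with choices, an unsupported base is rejected by the parser itself instead of by the later base check.

```python
import argparse


def parse_options(argv=None):
    """Parse xxdi-style options; argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser(prog='xxdi.py')
    parser.add_argument('-b', '--base', type=int, default=16,
                        choices=(8, 10, 16),
                        help='represent values according to BASE (default: 16)')
    parser.add_argument('-c', '--column', dest='cols', type=int, default=12,
                        help='COL octets per line (default: 12)')
    parser.add_argument('-s', '--strip', action='store_true',
                        help='only output C array elements')
    parser.add_argument('inFile')                # required positional
    parser.add_argument('outFile', nargs='?')    # optional, None if absent
    return parser.parse_args(argv)


opts = parse_options(['-c', '5', '-b', '10', '-s', '123456789ABCDEF.txt'])
print(opts.base, opts.cols, opts.strip, opts.inFile, opts.outFile)
```

Because the defaults, the type conversion, and the missing-argument message all live in the parser, the caller no longer needs to pass in or unpack five separate values.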

The -h option displays the help message, which is very close to the Linux style:

E:\PyTest>python xxdi.py -h
Usage:
 xxdi(.py) [options] inFile [outFile]

Options:
  -h, --help            show this help message and exit
  -b BASE, --base=BASE  represent values according to BASE(default:16)
  -c COL, --column=COL  COL octets per line(default:12)
  -s, --strip           only output C array elements

With these pieces in place, we can complete the main part of this article:

def Xxdi():
    # Parse command line options and arguments.
    ([base, cols, strip], [inFile, outFile]) = ParseOption()

    import os
    if not os.path.isfile(inFile):
        print ''''%s' is not a file!''' % inFile
        return
    with open(inFile, 'rb') as file:  # binary files must be opened in 'b' mode
    #file = open(inFile, 'rb')  # the with...as syntax is not supported below Python 2.5
    #if True:
        # Avoid 'for line in file' and readline(s), which split the data at every 0x0a byte.
        content = file.read()

    # "Break up" the file content into a list of byte values.
    if base == 16:    # hexadecimal
        content = map(lambda x: hex(ord(x)), content)
    elif base == 10:  # decimal
        content = map(lambda x: str(ord(x)), content)
    elif base == 8:   # octal
        content = map(lambda x: oct(ord(x)), content)
    else:
        print '[%s]: Invalid base or radix for C language!' % base
        return

    # Build the array definition header and the length variable.
    cArrayName = GenerateCArrayName(inFile)
    if strip:
        cArrayHeader = ''
        cArrayTailer = ''
    else:
        cArrayHeader = 'unsigned char %s[] = {' % cArrayName
        cArrayTailer = '};\nunsigned int %s_len = %d;' % (cArrayName, len(content))

    # print appends a newline after each line automatically.
    if outFile is None:
        print cArrayHeader
        for i in range(0, len(content), cols):
            line = ', '.join(content[i:i+cols])
            print ' ' + line + ','
        print cArrayTailer
        return

    with open(outFile, 'w') as file:
    #file = open(outFile, 'w')  # the with...as syntax is not supported below Python 2.5
    #if True:
        file.write(cArrayHeader + '\n')
        for i in range(0, len(content), cols):
            line = ', '.join(content[i:i+cols])
            file.write(' %s,\n' % line)
        file.write(cArrayTailer)
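The byte conversion step above relies on Python 2 behaviour: file.read() returns a str, and oct() yields C-style values such as 061. For readers on Python 3, the same step might be sketched as follows. The names FORMATS and format_rows are invented for this example. Iterating over bytes in Python 3 yields integers, so ord() is not needed, and Python 3's oct() returns strings such as '0o61', which is not the traditional C octal form, so explicit format strings are used instead.

```python
# printf-style format for each supported base; '0%o' keeps the leading 0
# that C uses for octal literals.
FORMATS = {16: '0x%x', 10: '%d', 8: '0%o'}


def format_rows(data, base=16, cols=12):
    """Return the array elements of `data` as a list of comma-joined rows."""
    if base not in FORMATS:
        raise ValueError('[%s]: Invalid base or radix for C language!' % base)
    fmt = FORMATS[base]
    tokens = [fmt % byte for byte in data]  # each byte is already an int
    return [', '.join(tokens[i:i + cols])
            for i in range(0, len(tokens), cols)]


for row in format_rows(b'123456789ABCDEF', base=10, cols=5):
    print(' ' + row + ',')
```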

Python versions below 2.5 do not support the with...as syntax, and the Linux system the author used for debugging only had Python 2.4.3 installed. To run xxdi.py on that system, the file has to be opened with file = open(...), which means closing the file and handling exceptions explicitly. For details, see Understanding the with...as... syntax in Python. Note that using the with...as syntax in Python 2.5 requires the declaration from __future__ import with_statement.
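For interpreters without with...as, the usual substitute is a try...finally block that closes the file whether or not reading succeeds. The sketch below shows that pattern; the helper name read_binary and the temporary file in the usage lines are only for illustration.

```python
import os
import tempfile


def read_binary(path):
    """Read a whole file in binary mode without using with...as."""
    f = open(path, 'rb')
    try:
        return f.read()
    finally:
        # Runs on normal return and on exceptions alike.
        f.close()


# Usage: round-trip a few bytes through a temporary file.
fd, tmp_path = tempfile.mkstemp()
os.write(fd, b'12345\n')
os.close(fd)
data = read_binary(tmp_path)
os.remove(tmp_path)
print(len(data))
```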

You can get the Python version number through platform.python_version(). For example:

import platform

# Check whether the running Python is version major.minor or later.
def IsForwardPyVersion(major, minor):
    # python_version() returns 'major.minor.patchlevel', e.g. '2.7.11'
    ver = platform.python_version().split('.')
    # Compare as a tuple so that, for example, 3.0 counts as later than 2.5.
    return (int(ver[0]), int(ver[1])) >= (major, minor)
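As an alternative sketch, sys.version_info already exposes the version as integers, so no string parsing is needed and a tuple comparison orders major and minor correctly. The function name is_forward_py_version and its optional version parameter (added here so that the check can be exercised with arbitrary versions) are assumptions of this example.

```python
import sys


def is_forward_py_version(major, minor, version=None):
    """Return True if `version` (default: the running Python) is >= major.minor."""
    if version is None:
        version = sys.version_info
    # Tuples compare element by element: (3, 0) >= (2, 5) is True.
    return tuple(version[:2]) >= (major, minor)


print(is_forward_py_version(2, 5))             # checks the running interpreter
print(is_forward_py_version(2, 5, (2, 4, 3)))  # the author's Linux Python
```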

After testing on both Windows and Linux, Xxdi() works largely as expected. Taking the file 123456789ABCDEF.txt (whose content is '123456789ABCDEF') as an example, the test results are as follows:

E:\PyTest>python xxdi.py -c 5 -b 2 -s 123456789ABCDEF.txt
[2]: Invalid base or radix for C language!
E:\Pytest>python xxdi.py -c 5 -b 10 -s 123456789ABCDEF.txt

49, 50, 51, 52, 53,
54, 55, 56, 57, 65,
66, 67, 68, 69, 70,
E:\PyTest>python xxdi.py -c 5 -b 10 123456789ABCDEF.txt
unsigned char __123456789ABCDEF_txt[] = {
49, 50, 51, 52, 53,
54, 55, 56, 57, 65,
66, 67, 68, 69, 70,
};
unsigned int __123456789ABCDEF_txt_len = 15;
E:\PyTest>python xxdi.py -c 5 -b 8 123456789ABCDEF.txt
unsigned char __123456789ABCDEF_txt[] = {
061, 062, 063, 064, 065,
066, 067, 070, 071, 0101,
0102, 0103, 0104, 0105, 0106,
};
unsigned int __123456789ABCDEF_txt_len = 15;
E:\PyTest>python xxdi.py 123456789ABCDEF.txt
unsigned char __123456789ABCDEF_txt[] = {
0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39, 0x41, 0x42, 0x43,
0x44, 0x45, 0x46,
};
unsigned int __123456789ABCDEF_txt_len = 15;

Now take a slightly larger binary file as an example. After running python xxdi.py VdslBooter.bin booter.c, the booter.c file contains the following (only the beginning and the end are shown):

unsigned char VdslBooter_bin[] = {
0xff, 0x31, 0x0, 0xb, 0xff, 0x3, 0x1f, 0x5a, 0x0, 0x0, 0x0, 0x0,
//... ... ... ...
0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
};
unsigned int VdslBooter_bin_len = 53588;

As shown above, the xxdi module implemented by the author comes very close to Linux xxd -i in function, and each has its own strengths and weaknesses. The advantages of xxdi are a more thorough validity check on array names (the keyword check) and richer representations of the array content (octal and decimal). Its disadvantages are that it does not support redirection and that the value width is not fixed (for example 0xb and 0xff). These shortcomings are not hard to eliminate; for example, using '0x%02x' % val instead of hex(val) fixes the output width. However, further refinements inevitably make the code more complex and may cost more effort than the result is worth.
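The two shortcomings mentioned above could be addressed roughly as follows in Python 3. This is a sketch under assumptions, not part of the author's xxdi.py: read_input falls back to standard input when no file name is given (so that input redirection would work), and fixed_width_hex applies the '0x%02x' format. Both function names and the stdin parameter are invented for this example.

```python
import io
import sys


def read_input(path=None, stdin=None):
    """Return the input bytes from `path`, or from standard input if no path."""
    if path:
        with open(path, 'rb') as f:
            return f.read()
    # sys.stdin.buffer is the binary view of standard input in Python 3.
    stream = stdin if stdin is not None else sys.stdin.buffer
    return stream.read()


def fixed_width_hex(data):
    """Format every byte as a two-digit hex literal, e.g. 0x0b instead of 0xb."""
    return ['0x%02x' % byte for byte in data]


# Usage: a BytesIO object stands in for redirected standard input.
sample = read_input(stdin=io.BytesIO(b'\xff\x0b\x00'))
print(', '.join(fixed_width_hex(sample)))
```

Output redirection needs no extra code: when no outFile is given the script already prints to standard output, which the shell can redirect.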

That concludes this introduction to implementing the Linux xxd -i function in Python. I hope it is helpful to everyone!
