Home  >  Article  >  Backend Development  >  How to Optimize Fixed Width File Parsing in Python?

How to Optimize Fixed Width File Parsing in Python?

DDD
DDDOriginal
2024-10-31 05:26:30519browse

How to Optimize Fixed Width File Parsing in Python?

Optimizing Fixed Width File Parsing

To efficiently parse fixed-width files, one may consider leveraging Python's struct module. This approach leverages C for improved speed, as demonstrated in the following example:

<code class="python">import struct

fieldwidths = (2, -10, 24)
fmtstring = ' '.join('{}{}'.format(abs(fw), 'x' if fw < 0 else 's') for fw in fieldwidths)

unpack = struct.Struct(fmtstring).unpack_from  # Alias.
parse = lambda line: tuple(s.decode() for s in unpack(line.encode()))

line = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789\n'
fields = parse(line)
print('fields: {}'.format(fields))</code>

Alternatively, string slicing can be employed. To enhance efficiency, consider defining a lambda function that compiles slices at runtime, as seen in the below optimized version:

<code class="python">def make_parser(fieldwidths):
    cuts = tuple(cut for cut in accumulate(abs(fw) for fw in fieldwidths))
    pads = tuple(fw < 0 for fw in fieldwidths)  # bool flags for padding fields
    flds = tuple(zip_longest(pads, (0,) + cuts, cuts))[:-1]  # ignore final one
    slcs = ', '.join('line[{}:{}]'.format(i, j) for pad, i, j in flds if not pad)
    parse = eval('lambda line: ({})\n'.format(slcs))  # Create and compile source code.
    # Optional informational function attributes.
    parse.size = sum(abs(fw) for fw in fieldwidths)
    parse.fmtstring = ' '.join('{}{}'.format(abs(fw), 'x' if fw < 0 else 's')
                                                for fw in fieldwidths)
    return parse</code>

The above is the detailed content of How to Optimize Fixed Width File Parsing in Python?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn