Home  >  Article  >  Backend Development  >  How to efficiently get the number of lines in a file

How to efficiently get the number of lines in a file

anonymity
anonymityOriginal
2019-05-25 10:46:022690browse

Simple method:

You need to get the number of lines of a large file (hundreds of thousands of lines) in python.

def file_len(fname):
    with open(fname) as f:
        for i, l in enumerate(f):
            pass    return i + 1

How to efficiently get the number of lines in a file

Effective method (buffer reading strategy):

First look at the running Result:

mapcount : 0.471799945831
simplecount : 0.634400033951
bufcount : 0.468800067902
opcount : 0.602999973297

So the buffer read strategy seems to be the fastest for Windows/Python2.6.

The following is the code:

from __future__ import with_statement
import time
import mmap
import random
from collections import defaultdict
def mapcount(filename):
    f = open(filename, "r+")
    buf = mmap.mmap(f.fileno(), 0)
    lines = 0
    readline = buf.readline
    while readline():
        lines += 1
    return lines
def simplecount(filename):
    lines = 0
    for line in open(filename):
        lines += 1
    return lines
def bufcount(filename):
    f = open(filename)                  
    lines = 0
    buf_size = 1024 * 1024
    read_f = f.read # loop optimization
    buf = read_f(buf_size)
    while buf:
        lines += buf.count('\n')
        buf = read_f(buf_size)
    return lines
def opcount(fname):
    with open(fname) as f:
        for i, l in enumerate(f):
            pass
    return i + 1
counts = defaultdict(list)
for i in range(5):
    for func in [mapcount, simplecount, bufcount, opcount]:
        start_time = time.time()
        assert func("big_file.txt") == 1209138
        counts[func].append(time.time() - start_time)
for key, vals in counts.items():
    print key.__name__, ":", sum(vals) / float(len(vals))

The above is the detailed content of How to efficiently get the number of lines in a file. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn