A comparison of four different ways to read files in Python
Text processing is a task that comes up often in Python. This article compares several different ways of reading files in Python, with sample code for each method.
Preface
Python offers many ways to read a file, but when the file is large, the choice of reading method makes a noticeable difference in run time and memory use. The comparison below shows how large that difference is.
Scenario
Read a 2.9 GB file line by line
CPU: i7 6820HQ
RAM: 32 GB
Method
Perform one string split operation on each line that is read.
The following methods all use the with...as method to open the file.
The with statement is suited to working with resources. It guarantees that the necessary cleanup runs whether or not an exception occurs during use, for example closing a file automatically after use, or automatically acquiring and releasing a lock in threaded code.
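To make the cleanup guarantee concrete, here is a minimal sketch that reads the same line twice: once with an explicit try/finally, and once with with...as. The temporary file and the variable names are assumptions made for illustration and are not part of the original test.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a|b|c\n")
    path = tmp.name

# Manual version: the finally clause closes the file even if reading fails.
fh = open(path, "r")
try:
    manual_fields = fh.readline().rstrip("\n").split("|")
finally:
    fh.close()

# with version: the file is closed automatically when the block exits.
with open(path, "r") as fh2:
    with_fields = fh2.readline().rstrip("\n").split("|")

manual_closed = fh.closed
with_closed = fh2.closed
os.remove(path)

print(manual_fields, with_fields, manual_closed, with_closed)
```

Both versions leave the file closed; the with form simply removes the need to write the try/finally by hand.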
Method 1 The most common way to read files
with open(file, 'r') as fh:
    for line in fh.readlines():
        line.split("|")
Running result: It took 15.4346568584 seconds
The system monitor showed memory usage jump suddenly from 4.8 GB to 8.4 GB. fh.readlines() reads every line of the file into a list in memory before the loop starts, so this method is only suitable for small files.
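The memory difference can be observed on a much smaller scale with the standard tracemalloc module. The sketch below is not the article's measurement: it assumes a generated 50,000-line sample file, and the helper names (split_with_readlines, split_with_iteration, peak_bytes) are made up for illustration. tracemalloc only reports allocations made by Python itself, so the numbers will not match a system monitor.

```python
import os
import tempfile
import tracemalloc

def split_with_readlines(path):
    with open(path, "r") as fh:
        for line in fh.readlines():  # builds a list of every line first
            line.split("|")

def split_with_iteration(path):
    with open(path, "r") as fh:
        for line in fh:  # yields one line at a time
            line.split("|")

def peak_bytes(func, path):
    """Return the peak traced allocation, in bytes, while func(path) runs."""
    tracemalloc.start()
    try:
        func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

# Hypothetical sample data: 50,000 short pipe-delimited lines.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(50000):
        tmp.write(f"{i}|field_a|field_b\n")
    sample_path = tmp.name

peak_readlines = peak_bytes(split_with_readlines, sample_path)
peak_iteration = peak_bytes(split_with_iteration, sample_path)
os.remove(sample_path)

print("readlines peak bytes:", peak_readlines)
print("iteration peak bytes:", peak_iteration)
```

The readlines() peak should grow with the size of the file, while the iteration peak should stay roughly constant.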
Method 2
with open(file, 'r') as fh:
    line = fh.readline()
    while line:
        line.split("|")
        line = fh.readline()  # advance to the next line
Running result: It took 22.3531990051 seconds
Memory usage barely changes, because only one line is held in memory at a time. However, the run time is clearly longer than with Method 1, which makes this approach inefficient when the data needs further processing.
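The readline() loop can also be written with the two-argument form of the built-in iter(), which keeps calling fh.readline until it returns the empty string that marks end of file. This is an alternative spelling of the same technique, not something the original test timed; the sample file and the rows list are assumptions for illustration.

```python
import os
import tempfile

# Small throwaway file standing in for the large input.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1|a\n2|b\n3|c\n")
    path = tmp.name

rows = []
with open(path, "r") as fh:
    # iter(callable, sentinel) calls fh.readline() until it returns "".
    for line in iter(fh.readline, ""):
        rows.append(line.rstrip("\n").split("|"))

os.remove(path)
print(rows)
```

A blank line in the file is read as "\n" rather than "", so it does not end the loop early.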
Method 3
with open(file) as fh:
    for line in fh:
        line.split("|")
Running result: It took 13.9956979752 seconds
Memory usage barely changes, and this method is also faster than Method 2.
for line in fh treats the file object fh as an iterable. The file object reads through an internal buffer and yields one line at a time, so memory use stays low even for large files. This is a very Pythonic approach.
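A short sketch of what "the file object is an iterable" means in practice: a file object is its own iterator, so next() pulls a single line and a later for loop continues from where reading stopped. The sample file and variable names are assumptions for illustration.

```python
import os
import tempfile

# Small throwaway file standing in for the large input.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1|a\n2|b\n3|c\n")
    path = tmp.name

with open(path) as fh:
    is_own_iterator = iter(fh) is fh  # file objects are their own iterators
    first = next(fh)                  # pulls exactly one line
    # The loop resumes after the line already consumed by next().
    remaining = [line.rstrip("\n").split("|") for line in fh]

os.remove(path)
print(is_own_iterator, repr(first), remaining)
```

Because lines are produced on demand, nothing forces the whole file into memory unless the caller collects the lines, as the list comprehension here does for the sake of the demonstration.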
Method 4 fileinput module
import fileinput

for line in fileinput.input(file):
    line.split("|")
Running result: It took 26.1103110313 seconds
Memory usage increased by 200 to 300 MB, and this was the slowest of the four methods.
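For context, fileinput is aimed at looping over several files (or standard input) as one stream, and it tracks position information for every line. That bookkeeping may contribute to the overhead, although the article does not measure why the module is slower. The sketch below assumes two small temporary files and uses the FileInput methods lineno() and filelineno().

```python
import fileinput
import os
import tempfile

# Two throwaway files to show that fileinput chains them into one stream.
paths = []
for text in ("1|a\n2|b\n", "3|c\n"):
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(text)
        paths.append(tmp.name)

records = []
with fileinput.input(files=paths) as stream:
    for line in stream:
        records.append((
            stream.lineno(),      # cumulative line number across all files
            stream.filelineno(),  # line number within the current file
            line.rstrip("\n").split("|"),
        ))

for p in paths:
    os.remove(p)

print(records)
```

When only a single large file has to be read, iterating over the file object directly avoids this extra layer.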
Summary
The results above are for reference only. Method 3, the generally recommended way to read large files, is still the best choice here. In practice, the results also depend on the performance of the machine and the complexity of the data processing.
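Since the timings depend on the machine, the comparison is easy to repeat locally. The following is a minimal benchmark sketch, not the script used for the numbers in this article: the function names, the generated 20,000-line sample file, and the use of time.perf_counter() are all assumptions. Point sample_path at a real large file to get meaningful timings.

```python
import fileinput
import os
import tempfile
import time

def method1(path):
    count = 0
    with open(path, "r") as fh:
        for line in fh.readlines():  # whole file loaded as a list
            line.split("|")
            count += 1
    return count

def method2(path):
    count = 0
    with open(path, "r") as fh:
        line = fh.readline()
        while line:  # explicit readline() loop
            line.split("|")
            count += 1
            line = fh.readline()
    return count

def method3(path):
    count = 0
    with open(path) as fh:
        for line in fh:  # iterate over the file object
            line.split("|")
            count += 1
    return count

def method4(path):
    count = 0
    with fileinput.input(files=[path]) as stream:
        for line in stream:  # fileinput module
            line.split("|")
            count += 1
    return count

def benchmark(path):
    """Return {name: (line_count, seconds)} for each reading method."""
    results = {}
    for func in (method1, method2, method3, method4):
        start = time.perf_counter()
        count = func(path)
        results[func.__name__] = (count, time.perf_counter() - start)
    return results

# Hypothetical sample data; replace with a real large file for real timings.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    for i in range(20000):
        tmp.write(f"{i}|field_a|field_b\n")
    sample_path = tmp.name

results = benchmark(sample_path)
os.remove(sample_path)

for name, (count, seconds) in results.items():
    print(f"{name}: {count} lines in {seconds:.4f} s")
```

On a small sample file the differences will be tiny and noisy, so the ranking is only meaningful on inputs closer in size to the 2.9 GB file used above.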