How Can I Efficiently Check for a String's Presence in Large Text Files in Python?

DDD
2024-12-12 12:45:11

Checking Whether a Text File Contains a String

Suppose you want to check whether a specific string occurs in a text file. If it does, one action (X) should run; if it does not, an alternative action (Y) should run. A common problem is a snippet written for this purpose that always takes the True branch, whatever the file contains.

The cause is that the if statement does not actually test whether the string is present, so its condition is always true. The test should check each line with the in operator:

if 'blabla' in line:
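
To show that one-line test in context, here is a minimal sketch of a complete line-by-line check. The function name file_contains, the UTF-8 default, and the throwaway demo file are assumptions made for illustration; they are not part of the original snippet.

```python
import os
import tempfile


def file_contains(path, needle, encoding="utf-8"):
    """Return True if any line of the file at `path` contains `needle`."""
    with open(path, encoding=encoding) as f:
        # Iterating over the file object yields one line at a time,
        # so only the current line is held in memory.
        for line in f:
            if needle in line:
                return True
    return False


# Demo on a throwaway file (hypothetical content) so the sketch runs anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("first line\nsome blabla here\nlast line\n")

try:
    if file_contains(tmp.name, "blabla"):
        print("true")    # action X
    else:
        print("false")   # action Y
finally:
    os.remove(tmp.name)
```

Because each line is tested separately, a search string that spans a line break will not be found this way.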

If the file is small enough to fit comfortably in memory, it is simpler, and often faster, to read it into a single string and search that. For example:

with open('example.txt') as f:
    if 'blabla' in f.read():
        print("true")

For files too large to load comfortably, mmap.mmap() creates a "string-like" object backed by the underlying file, so the whole contents need not be read into memory at once. The version below is written for Python 2, where string literals are byte strings:

import mmap

with open('example.txt') as f:
    s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    if s.find('blabla') != -1:
        print('true')

In Python 3, mmap objects behave like bytearray objects rather than strings, so the search term must be a bytes object and the file should be opened in binary mode:

import mmap

with open('example.txt', 'rb', 0) as file, \
     mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
    if s.find(b'blabla') != -1:
        print('true')
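
The Python 3 snippet can be wrapped in a small reusable helper. The sketch below is one possible shape, not a definitive implementation: the name mmap_contains and the demo file are hypothetical. It also guards against one practical pitfall, which is that mmap.mmap() raises ValueError when asked to map an empty file.

```python
import mmap
import os
import tempfile


def mmap_contains(path, needle):
    """Return True if the bytes `needle` occur anywhere in the file at `path`."""
    # An empty file cannot be memory-mapped (ValueError), so handle it first.
    # It can only "contain" the empty byte string.
    if os.path.getsize(path) == 0:
        return needle == b""
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as s:
        return s.find(needle) != -1


# Demo on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write("first line\nsome blabla here\n".encode("utf-8"))

try:
    # A str search term must be encoded with the file's encoding
    # (UTF-8 is assumed here).
    print(mmap_contains(tmp.name, "blabla".encode("utf-8")))  # True
    print(mmap_contains(tmp.name, b"missing"))                # False
finally:
    os.remove(tmp.name)
```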

You can also run regular expressions against an mmap for more advanced searches, such as case-insensitive matching. Because the mmap holds bytes, the pattern must be a bytes pattern:

import mmap
import re

with open('example.txt', 'rb', 0) as file, \
     mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
    if re.search(br'(?i)blabla', s):
        print('true')
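
As a variation on the regular-expression search, the following sketch passes re.IGNORECASE as a flag instead of the inline (?i) marker and reports where the first match was found. The function name search_file and the demo content are assumptions for illustration only.

```python
import mmap
import os
import re
import tempfile


def search_file(path, pattern, flags=0):
    """Return (start, end) byte offsets of the first match, or None.

    `pattern` must be a bytes pattern because the mmap exposes bytes.
    """
    # mmap.mmap() raises ValueError on an empty file, so search b"" instead.
    if os.path.getsize(path) == 0:
        m = re.search(pattern, b"", flags)
        return m.span() if m else None
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as s:
        m = re.search(pattern, s, flags)
        # Extract the offsets while the mapping is still open.
        return m.span() if m else None


# Demo on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"first line\nSome BlaBla here\n")

try:
    print(search_file(tmp.name, rb"blabla", re.IGNORECASE))  # (16, 22)
    print(search_file(tmp.name, rb"blabla"))                 # None
finally:
    os.remove(tmp.name)
```

The offsets are byte positions in the file, which differ from character positions when the text contains multi-byte characters.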
