How Can I Efficiently Retrieve the Last 10 Lines from a Very Large Text File?

Barbara Streisand
2024-12-29 10:20:11

Efficient Retrieval of Last 10 Lines from Massive Text Files

Extracting the last 10 lines from a very large text file (over 10GB) should not involve reading the file from the beginning. The goal is to do work proportional to the size of those last lines, not to the size of the file.

Utilizing File Positioning and Reverse Seek

The recommended approach is to seek to the end of the file with Seek() and step backward, one character at a time, counting newline characters. Once 10 newlines have been counted, the position just after the tenth one is where the last 10 lines begin, so the rest of the file can be read forward from there in a single pass. A newline that terminates the file closes the last line rather than starting a new one, so it should not be counted. If the start of the file is reached before 10 newlines are found, the file has fewer than 10 lines and is returned whole.

Example Implementation in C#

The following C# code implements this approach in a generalized form: it returns the last numberOfTokens tokens of a file, where the file is encoded with encoding and tokens are delimited by tokenSeparator. For the last 10 lines of a UTF-8 file, pass 10, Encoding.UTF8 and "\n":

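using System;
using System.IO;
using System.Text;

public static class TailReader {
    // Returns the text that follows the numberOfTokens-th separator, counted from the end of the file.
    // Assumes every character of the separator has the same byte width as "\n" (ASCII, UTF-8, UTF-16, UTF-32).
    public static string ReadEndTokens(string path, Int64 numberOfTokens, Encoding encoding, string tokenSeparator) {
        int sizeOfChar = encoding.GetByteCount("\n");
        byte[] buffer = encoding.GetBytes(tokenSeparator);

        // FileAccess.Read avoids requesting write access, so read-only files can be opened too.
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read)) {
            Int64 tokenCount = 0;

            // position is a byte offset measured back from the end of the file.
            for (Int64 position = buffer.Length; position <= fs.Length; position += sizeOfChar) {
                fs.Seek(-position, SeekOrigin.End);
                if (fs.Read(buffer, 0, buffer.Length) != buffer.Length) continue;
                if (encoding.GetString(buffer) != tokenSeparator) continue;

                // A separator at the very end of the file terminates the last token; do not count it.
                if (fs.Position == fs.Length) continue;

                tokenCount++;
                if (tokenCount == numberOfTokens) {
                    return ReadToEnd(fs, encoding);
                }
            }

            // Handle the case where the file holds fewer than numberOfTokens tokens.
            // This path loads the whole file into memory, so it only suits files that fit in one array.
            fs.Seek(0, SeekOrigin.Begin);
            return ReadToEnd(fs, encoding);
        }
    }

    // Reads from the current position to the end; Read may return fewer bytes than requested.
    private static string ReadToEnd(FileStream fs, Encoding encoding) {
        byte[] bytes = new byte[fs.Length - fs.Position];
        int filled = 0;
        while (filled < bytes.Length) {
            int read = fs.Read(bytes, filled, bytes.Length - filled);
            if (read == 0) break;
            filled += read;
        }
        return encoding.GetString(bytes, 0, filled);
    }
}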
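
The step size in the loop above comes from encoding.GetByteCount("\n"), so it depends on the encoding passed in. The short check below is not part of the original article; it is only a sketch, with a made-up class name (NewlineWidthDemo), showing the values that call returns for the common built-in encodings.

```csharp
using System;
using System.Text;

// Hypothetical demo class, used only to print newline widths per encoding.
public static class NewlineWidthDemo {
    public static void Main() {
        Console.WriteLine(Encoding.ASCII.GetByteCount("\n"));   // 1
        Console.WriteLine(Encoding.UTF8.GetByteCount("\n"));    // 1
        Console.WriteLine(Encoding.Unicode.GetByteCount("\n")); // 2 (UTF-16, little endian)
        Console.WriteLine(Encoding.UTF32.GetByteCount("\n"));   // 4
        Console.WriteLine(Encoding.UTF8.GetByteCount("\r\n"));  // 2 (two one-byte characters)
    }
}
```

With UTF-8 the scan therefore moves one byte at a time. That is safe because the byte 0x0A never occurs inside a multi-byte UTF-8 sequence. With UTF-16 the scan moves two bytes at a time and relies on the file length being a multiple of two.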

With this technique, only the tail of the file is ever read. As long as the file contains at least the requested number of lines, the time spent and the memory used depend on the length of the last 10 lines, not on the total size of the file.
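
The loop in the article performs one seek and one read per character. One possible refinement, sketched here rather than taken from the article, is to read the tail in fixed-size blocks and scan each block backward in memory. The class name ChunkedTail, the method ReadLastLines and the 64 KB default block size are all assumptions made for illustration, and the sketch assumes a UTF-8 or ASCII file with "\n" line endings.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper; names and block size are illustrative assumptions.
public static class ChunkedTail {
    // Returns the last lineCount lines of a UTF-8 or ASCII file.
    public static string ReadLastLines(string path, int lineCount, int blockSize = 64 * 1024) {
        if (lineCount <= 0) return string.Empty;

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read)) {
            long length = fs.Length;
            long start = 0;          // byte offset where the tail begins
            long blockEnd = length;  // exclusive end of the block being scanned
            int newlines = 0;
            bool found = false;
            var block = new byte[blockSize];

            while (blockEnd > 0 && !found) {
                int size = (int)Math.Min(blockSize, blockEnd);
                long blockStart = blockEnd - size;
                fs.Seek(blockStart, SeekOrigin.Begin);
                ReadFully(fs, block, size);

                // Scan the block from its last byte to its first.
                for (int i = size - 1; i >= 0; i--) {
                    if (block[i] != (byte)'\n') continue;
                    long offset = blockStart + i;
                    if (offset == length - 1) continue; // newline that terminates the file
                    if (++newlines == lineCount) {
                        start = offset + 1;
                        found = true;
                        break;
                    }
                }
                blockEnd = blockStart;
            }

            // If fewer than lineCount lines exist, start is 0 and the whole file is returned,
            // which only works for files small enough to fit in a single array.
            fs.Seek(start, SeekOrigin.Begin);
            var tail = new byte[length - start];
            ReadFully(fs, tail, tail.Length);
            return Encoding.UTF8.GetString(tail);
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until count bytes arrive.
    private static void ReadFully(Stream stream, byte[] buffer, int count) {
        int filled = 0;
        while (filled < count) {
            int read = stream.Read(buffer, filled, count - filled);
            if (read == 0) throw new EndOfStreamException();
            filled += read;
        }
    }
}
```

A call such as ChunkedTail.ReadLastLines(path, 10) would then return the last 10 lines. The block size trades a little memory for far fewer read calls; files that use "\r\n" line endings still work, but each returned line keeps its "\r".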
