How to read binary data in Python

PHPz

May 08, 2023 pm 06:58 PM

bytes

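bytes: a type that holds a sequence of bytes (not characters). Comparing dir(str) with dir(bytes) shows that the two share most of their attributes and methods, with only a few differences. So bytes supports many of the same operations on a byte sequence that a string supports on text, such as searching (find), taking the length (len), splitting (split), and slicing.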
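As a quick illustration of that overlap, here is a minimal sketch using only the standard library. The sample value and the variable names are made up for this example.

```python
# Sample bytes object; the content is arbitrary illustration data.
data = b"Start:alpha,beta;End"

first = data.find(b"alpha")        # byte offset of the first match
length = len(data)                 # number of bytes
parts = data[6:16].split(b",")     # slicing and splitting work as on str
head = data[:5]                    # a slice of bytes is again bytes
one_byte = data[0]                 # indexing a single position gives an int

# Names that exist on bytes but not on str, and the other way round.
only_bytes = sorted(set(dir(bytes)) - set(dir(str)))
only_str = sorted(set(dir(str)) - set(dir(bytes)))

print(first, length, parts, head, one_byte)
print(only_bytes)
```

One difference worth remembering is that indexing a bytes object returns an integer, while slicing returns bytes.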

The advantage of bytes is that it is built into Python, so no third-party module has to be installed.

The drawback is just as clear: find() returns only one match per call, so you cannot collect every match in a single call.

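First, open the file in rb mode so that its contents are read as bytes. To search for a specific byte string there is the find() method, but it only returns the index of the first match, and that index counts bytes (8 bits each), not individual bits. When several matches are needed, bytes has no built-in findall() method. Finding them all is tedious: locate the first match, search again from just past that index to get the second match, and so on until the search finds nothing more.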

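# path: location of the binary file to read (defined elsewhere)
with open(path, 'rb') as f:
    datas = f.read()  # the whole file as a bytes object
    start_char = datas.find(b'Start')  # byte index of the first match, -1 if none
    # start_char2 = datas.find(b'Start', start_char + 1)  # next match: search past the previous one
    end_char = datas.find(b'End', start_char)
    # end_char2 = datas.find(b'End', start_char2)
    data = datas[start_char:end_char]  # note: includes the b'Start' marker itself
    print(data)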

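Note that in the code above, the Start and End markers can each occur many times, and not necessarily the same number of times. We need the content between each pair of indices, but find() cannot return all of them at once and gives us nothing to iterate over. Every further match means running another call like the commented-out lines, passing in the previous index. Since we do not know how many start markers the file contains, we do not know how many calls are needed. This calls for a loop, but there is no ready-made sequence to loop over, so the loop has to carry the previous index forward by hand. That makes the problem more complicated.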

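Second, because we want the content between two markers, the whole process has to be done twice, once for each marker, which makes it even more cumbersome.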
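For comparison, the manual procedure described above can be sketched as a while loop that feeds each match position back into find(). This is only an illustration: the helper names find_all and extract_between and the sample bytes are assumptions, not part of the original code. Unlike the snippet above, this sketch drops the start marker from each extracted chunk.

```python
def find_all(data, marker):
    """Return the byte offset of every occurrence of marker in data."""
    positions = []
    pos = data.find(marker)
    while pos != -1:                      # find() returns -1 when nothing is left
        positions.append(pos)
        pos = data.find(marker, pos + 1)  # resume just past the previous match
    return positions


def extract_between(data, start_marker, end_marker):
    """Return the bytes between each start marker and the next end marker."""
    chunks = []
    pos = data.find(start_marker)
    while pos != -1:
        body_start = pos + len(start_marker)
        end = data.find(end_marker, body_start)
        if end == -1:                     # a start marker without a matching end
            break
        chunks.append(data[body_start:end])
        pos = data.find(start_marker, end + len(end_marker))
    return chunks


sample = b"xxStartAAAEndyyStartBBBEndzzStart"
print(find_all(sample, b"Start"))
print(extract_between(sample, b"Start", b"End"))
```

It works, but both loops and the index bookkeeping have to be written by hand, which is the overhead the text is pointing at.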

It is therefore well worth looking for another approach.

bitstring

bitstring is a third-party package that reads binary files as a stream of bits.

The first sentence of the bitstring.py file is: This package defines classes that simplify bit-wise creation, manipulation and interpretation of data.

Put simply, it lets you work on binary data directly, down to the individual bit.

There are four main classes:

Bits -- An immutable container for binary data.
BitArray -- A mutable container for binary data.
ConstBitStream -- An immutable container with streaming methods.
BitStream -- A mutable container with streaming methods.

As with bytes, first read the file contents, then find the marker indices, then slice to get the data.

from bitstring import ConstBitStream

# path: location of the binary file to read (defined elsewhere)
hex_datas = ConstBitStream(filename=path)  # read the file contents
start_marker = b'Start'
# findall() finds every match at once and returns a generator of bit positions
start_indexs = list(hex_datas.findall(start_marker, bytealigned=True))

end_marker = b'End'
end_indexs = []
for start_index in start_indexs:
    # find() returns a tuple: the first match at or after start, or () if there is none
    end_pos = hex_datas.find(end_marker, start=start_index, bytealigned=True)
    end_indexs.extend(end_pos)

result = []
for i in range(min(len(start_indexs), len(end_indexs))):
    hex_data = hex_datas[start_indexs[i]:end_indexs[i]]  # slice by bit position
    str_data = hex_data.tobytes().decode('utf-8')
    result.append(str_data)

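Walking through the code: it first imports what it needs from bitstring and loads the file contents. findall() finds the position of every match, and find() finds the first match at or after a given start; both report positions in bits, not bytes. The loop runs over the shorter of the start and end lists and slices out the data, which has type 'bitstring.ConstBitStream'. The tobytes() method converts it to bytes. At that point non-ASCII text such as Chinese still shows up as escaped bytes, so decode() is applied to get the string we want.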

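Overall the process stays concise and linear. The code uses the findall(), find(), and tobytes() methods. There are also several details to watch: for example, if start_indexs is empty the rest of the code should not run, and the same goes for an empty end_indexs.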

So bitstring turns out to be fairly convenient. This task needs only a few of its methods; the package offers many more, which can be picked up as needed.

Statement

This article is reposted from 亿速云. In case of infringement, contact admin@php.cn for removal.
