Python 中的 astype() 函數是什麼-Python教學-PHP中文網

首頁

後端開發

Python教學

Python 中的 astype() 函數是什麼

Mary-Kate Olsen

Jan 09, 2025 am 06:51 AM

What is astype() function in Python

理解 Python 中的 astype()

astype() 函數是 Python 中的一個強大方法，主要用於 pandas 函式庫，用於將 DataFrame 或 Series 中的列或資料集轉換為特定資料型別。它也可在 NumPy 中用於將陣列元素轉換為不同類型。

astype() 的基本用法

astype() 函數用於將 pandas 物件（如 Series 或 DataFrame）或 NumPy 陣列的資料類型轉換為另一種類型。

pandas 的語法：

DataFrame.astype(dtype, copy=True, errors='raise')

NumPy 語法：

ndarray.astype(dtype, order='K', casting='unsafe', subok=True, copy=True)

關鍵參數

1.資料型別

要將資料轉換為的目標資料型態。可以使用以下方式指定：

單一型別（例如 float、int、str）。
將列名對應到類型的字典（對於 pandas DataFrames）。

2.複製（pandas 和 NumPy）

預設：True
用途：是否傳回原始資料的副本（如果為True）或就地修改（如果為False）。

3.錯誤（僅限熊貓）

選項：
- 'raise'（預設）：如果轉換失敗則引發錯誤。
- 'ignore'：默默地忽略錯誤。

4.順序（僅限 NumPy）

控制輸出數組的記憶體佈局。選項：
- 'C'：C-連續順序。
- 'F'：Fortran 連續順序。
- 'A'：如果輸入是 Fortran 連續的，則使用 Fortran 順序，否則使用 C 順序。
- 'K'：符合輸入數組的佈局。

5.鑄造（僅限 NumPy）

控制投射行為：
- 'no'：不允許轉換。
- 'equiv'：僅允許位元組順序更改。
- 「安全」：只允許保留值的強制轉換。
- 'same_kind'：僅允許安全強制轉換或某種類型內的強制轉換（例如，float -> int）。
- '不安全'：允許任何資料轉換。

6. subok（僅限 NumPy）

如果為 True，則傳遞子類別；如果為 False，則傳回的陣列將是基底類別數組。

範例

1. pandas 的基本轉換

import pandas as pd

# Example DataFrame
df = pd.DataFrame({'A': ['1', '2', '3'], 'B': [1.5, 2.5, 3.5]})

# Convert column 'A' to integer
df['A'] = df['A'].astype(int)
print(df.dtypes)

輸出：

A     int64
B    float64
dtype: object

2.多列的字典映射

# Convert multiple columns
df = df.astype({'A': float, 'B': int})
print(df.dtypes)

輸出：

DataFrame.astype(dtype, copy=True, errors='raise')

3.使用錯誤='忽略'

ndarray.astype(dtype, order='K', casting='unsafe', subok=True, copy=True)

輸出：

import pandas as pd

# Example DataFrame
df = pd.DataFrame({'A': ['1', '2', '3'], 'B': [1.5, 2.5, 3.5]})

# Convert column 'A' to integer
df['A'] = df['A'].astype(int)
print(df.dtypes)

「two」的轉換失敗，但不會引發錯誤。

4.在 NumPy 使用 astype()

A     int64
B    float64
dtype: object

輸出：

# Convert multiple columns
df = df.astype({'A': float, 'B': int})
print(df.dtypes)

5.在 NumPy 中使用casting='safe'進行鑄造

A    float64
B      int64
dtype: object

輸出：

df = pd.DataFrame({'A': ['1', 'two', '3'], 'B': [1.5, 2.5, 3.5]})

# Attempt conversion with errors='ignore'
df['A'] = df['A'].astype(int, errors='ignore')
print(df)

6.處理 pandas 中的非數字類型

      A    B
0     1  1.5
1   two  2.5
2     3  3.5

輸出：

import numpy as np

# Example array
arr = np.array([1.1, 2.2, 3.3])

# Convert to integer
arr_int = arr.astype(int)
print(arr_int)

7.使用 astype() 進行記憶體最佳化

代碼：

[1 2 3]

輸出：

最佳化前（原始記憶體使用情況）：

arr = np.array([1.1, 2.2, 3.3])

# Attempt an unsafe conversion
try:
    arr_str = arr.astype(str, casting='safe')
except TypeError as e:
    print(e)

最佳化後（最佳化記憶體使用）：

Cannot cast array data from dtype('float64') to dtype('<u32 according to the rule>




<hr>

<h3>
  
  
  <strong>說明：</strong>
</h3>

<ul>
<li>
<p><strong>原始記憶體使用：</strong></p>

<ul>
<li>A 欄位作為 int64 使用 24 個位元組（每個元素 8 個位元組 × 3 個元素）。 </li>
<li>B 欄位作為 float64 使用 24 個位元組（每個元素 8 個位元組 × 3 個元素）。 </li>
</ul>


</li>

<li>

<p><strong>最佳化記憶體使用：</strong></p>

<ul>
<li>A 欄位作為 int8 使用 3 個位元組（每個元素 1 個位元組 × 3 個元素）。 </li>
<li>B 欄位作為 float32 使用 12 個位元組（每個元素 4 個位元組 × 3 個元素）。 </li>
</ul>


</li>

</ul>

<h2>
  
  
  使用較小的資料類型可以顯著減少記憶體使用量，尤其是在處理大型資料集時。
</h2>

<h3>
  
  
  <strong>常見陷阱</strong>
</h3>

<ol>
<li>
<strong>無效轉換</strong>：轉換不相容的類型（例如，當存在非數字值時，將字串轉換為數字類型）。
</li>
</ol>

<pre class="brush:php;toolbar:false">df = pd.DataFrame({'A': ['2022-01-01', '2023-01-01'], 'B': ['True', 'False']})

# Convert to datetime and boolean
df['A'] = pd.to_datetime(df['A'])
df['B'] = df['B'].astype(bool)
print(df.dtypes)

Errors='ignore'靜默錯誤
：謹慎使用，因為它可能會靜默地無法轉換。
精度損失
：從高精度類型（例如 float64）轉換為低精度類型（例如 float32）。

進階範例

1.複雜資料型別轉換

A    datetime64[ns]
B             bool
dtype: object

輸出：

import pandas as pd

# Original DataFrame
df = pd.DataFrame({'A': [1, 2, 3], 'B': [1.1, 2.2, 3.3]})
print("Original memory usage:")
print(df.memory_usage())

# Downcast numerical types
df['A'] = df['A'].astype('int8')
df['B'] = df['B'].astype('float32')

print("Optimized memory usage:")
print(df.memory_usage())

2.在 NumPy 中使用 astype() 來處理結構化陣列

Index    128
A         24
B         24
dtype: int64

輸出：

DataFrame.astype(dtype, copy=True, errors='raise')

總結

astype() 函數是 pandas 和 NumPy 中資料型別轉換的多功能工具。它允許對轉換行為、記憶體最佳化和錯誤處理進行細粒度控制。正確使用其參數（例如 pandas 中的錯誤和 NumPy 中的轉換）可確保穩健且高效的資料類型轉換。

以上是Python 中的 astype() 函數是什麼的詳細內容。更多資訊請關注PHP中文網其他相關文章！

陳述

本文內容由網友自願投稿，版權歸原作者所有。本站不承擔相應的法律責任。如發現涉嫌抄襲或侵權的內容，請聯絡admin@php.cn

您如何將元素附加到Python列表中？May 04, 2025 am 12:17 AM

toAppendElementStoApythonList，usetheappend（）方法forsingleements，Extend（）formultiplelements，andinsert（）forspecificpositions.1）useeAppend（）foraddingoneOnelementAttheend.2）useextendTheEnd.2）useextendexendExendEnd（

您如何創建Python列表？舉一個例子。May 04, 2025 am 12:16 AM

TocreateaPythonlist,usesquarebrackets[]andseparateitemswithcommas.1)Listsaredynamicandcanholdmixeddatatypes.2)Useappend(),remove(),andslicingformanipulation.3)Listcomprehensionsareefficientforcreatinglists.4)Becautiouswithlistreferences;usecopy()orsl

討論有效存儲和數值數據的處理至關重要的實際用例。May 04, 2025 am 12:11 AM

金融、科研、医疗和AI等领域中，高效存储和处理数值数据至关重要。1)在金融中，使用内存映射文件和NumPy库可显著提升数据处理速度。2)科研领域，HDF5文件优化数据存储和检索。3)医疗中，数据库优化技术如索引和分区提高数据查询性能。4)AI中，数据分片和分布式训练加速模型训练。通过选择适当的工具和技术，并权衡存储与处理速度之间的trade-off，可以显著提升系统性能和可扩展性。

您如何創建Python數組？舉一個例子。May 04, 2025 am 12:10 AM

pythonarraysarecreatedusiseThearrayModule，notbuilt-Inlikelists.1）importThearrayModule.2）指定tefifythetypecode，例如，'i'forineizewithvalues.arreaysofferbettermemoremorefferbettermemoryfforhomogeNogeNogeNogeNogeNogeNogeNATATABUTESFELLESSFRESSIFERSTEMIFICETISTHANANLISTS。

使用Shebang系列指定Python解釋器有哪些替代方法？May 04, 2025 am 12:07 AM

除了shebang線，還有多種方法可以指定Python解釋器：1.直接使用命令行中的python命令；2.使用批處理文件或shell腳本；3.使用構建工具如Make或CMake；4.使用任務運行器如Invoke。每個方法都有其優缺點，選擇適合項目需求的方法很重要。

列表和陣列之間的選擇如何影響涉及大型數據集的Python應用程序的整體性能？May 03, 2025 am 12:11 AM

ForhandlinglargedatasetsinPython,useNumPyarraysforbetterperformance.1)NumPyarraysarememory-efficientandfasterfornumericaloperations.2)Avoidunnecessarytypeconversions.3)Leveragevectorizationforreducedtimecomplexity.4)Managememoryusagewithefficientdata

說明如何將內存分配給Python中的列表與數組。May 03, 2025 am 12:10 AM

Inpython，ListSusedynamicMemoryAllocationWithOver-Asalose，而alenumpyArraySallaySallocateFixedMemory.1）listssallocatemoremoremoremorythanneededinentientary上，respizeTized.2）numpyarsallaysallaysallocateAllocateAllocateAlcocateExactMemoryForements，OfferingPrediCtableSageButlessemageButlesseflextlessibility。