Pandas df['column'] = expression
This syntax is used to create, modify, or assign columns in a Pandas DataFrame (df). Let's break it down step by step, from basic to advanced.
When a column does not exist in the DataFrame, assigning a value to df['column'] creates a new column.
Example:
<code class="language-python">import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})
print(df)
# Output:
#    A
# 0  1
# 1  2
# 2  3

# Create a new column 'B' with every value set to 0
df['B'] = 0
print(df)
# Output:
#    A  B
# 0  1  0
# 1  2  0
# 2  3  0</code>
If the column already exists, assignment replaces its contents.
Example:
<code class="language-python">df['B'] = [4, 5, 6]  # Replace the values in column 'B'
print(df)
# Output:
#    A  B
# 0  1  4
# 1  2  5
# 2  3  6</code>
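Two details of plain assignment are easy to miss: a list must have exactly as many elements as the DataFrame has rows, and a Series is aligned on its index rather than on position. The following is a minimal sketch of both behaviors; the Series `s` and its index labels are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# A list must match the number of rows; otherwise pandas raises ValueError.
try:
    df['B'] = [4, 5]
    length_error = False
except ValueError:
    length_error = True
print(length_error)

# A Series is aligned on the index, not on position.
# Label 5 does not exist in df, so it is dropped; row 2 has no match and gets NaN.
s = pd.Series([10, 20, 30], index=[1, 0, 5])
df['B'] = s
print(df)  # column B holds 20.0, 10.0, NaN
```

If positional assignment is what you want, passing `s.to_numpy()` or a list instead of the Series avoids the index alignment.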
You can assign values to a column based on calculations or transformations of other columns.
Example:
<code class="language-python">df['C'] = df['A'] + df['B']  # Create column 'C' as the sum of 'A' and 'B'
print(df)
# Output:
#    A  B  C
# 0  1  4  5
# 1  2  5  7
# 2  3  6  9</code>
You can assign values conditionally, for example by applying a function to each element of a column with apply().
Example:
<code class="language-python">df['D'] = df['A'].apply(lambda x: 'Even' if x % 2 == 0 else 'Odd')
print(df)
# Output:
#    A  B  C     D
# 0  1  4  5   Odd
# 1  2  5  7  Even
# 2  3  6  9   Odd</code>
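The same result can also be produced with boolean indexing, which avoids calling a Python function for every element. This is a small sketch on a fresh DataFrame: a default label is assigned first, then `.loc` with a boolean mask overwrites only the matching rows.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# Start with a default label for every row
df['D'] = 'Odd'

# Boolean mask: True where 'A' is even
is_even = df['A'] % 2 == 0

# Overwrite 'D' only in the rows where the mask is True
df.loc[is_even, 'D'] = 'Even'
print(df)
```

Using `df.loc[mask, 'D'] = value` rather than `df[mask]['D'] = value` matters: the chained form may write to a temporary copy and leave `df` unchanged.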
You can use multiple columns in one expression for more complex calculations.
Example:
<code class="language-python">df['E'] = (df['A'] + df['B']) * df['C']
print(df)
# Output:
#    A  B  C     D   E
# 0  1  4  5   Odd  25
# 1  2  5  7  Even  49
# 2  3  6  9   Odd  81</code>
Numeric assignments can use vectorized operations, which work on whole columns at once, to improve performance.
Example:
<code class="language-python">df['F'] = df['A'] ** 2 + df['B'] ** 2  # Fast vectorized calculation
print(df)
# Output:
#    A  B  C     D   E   F
# 0  1  4  5   Odd  25  17
# 1  2  5  7  Even  49  29
# 2  3  6  9   Odd  81  45</code>
Use np.where for conditional assignment: NumPy can pick between two values based on a condition.
Example:
<code class="language-python">import numpy as np

df['G'] = np.where(df['A'] > 2, 'High', 'Low')
print(df)
# Output:
#    A  B  C     D   E   F     G
# 0  1  4  5   Odd  25  17   Low
# 1  2  5  7  Even  49  29   Low
# 2  3  6  9   Odd  81  45  High</code>
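`np.where` chooses between two outcomes. When there are more than two, one option is `np.select`, which takes a list of conditions and a matching list of choices. The sketch below is an extension beyond the original example; the column name `G3` and the 'Medium' tier are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# Conditions are checked in order; the first one that is True wins.
conditions = [df['A'] > 2, df['A'] > 1]
choices = ['High', 'Medium']

# Rows matching no condition receive the default value.
df['G3'] = np.select(conditions, choices, default='Low')
print(df)
```

The default is given as a string here so that it has the same type as the choices.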
Assign values to columns based on a custom function applied to the row or column.
Example:
<code class="language-python">def custom_function(row):
    return row['A'] * row['B']

df['H'] = df.apply(custom_function, axis=1)
print(df)
# Output:
#    A  B  C     D   E   F     G   H
# 0  1  4  5   Odd  25  17   Low   4
# 1  2  5  7  Even  49  29   Low  10
# 2  3  6  9   Odd  81  45  High  18</code>
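Row-wise `apply` calls the Python function once per row, so it is best kept for logic that has no column-level equivalent. For a simple product like the one above, a vectorized expression gives the same values. A minimal sketch on a fresh DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

def custom_function(row):
    # Called once for every row when axis=1
    return row['A'] * row['B']

row_wise = df.apply(custom_function, axis=1)

# Same computation expressed on whole columns
vectorized = df['A'] * df['B']

print(row_wise.equals(vectorized))
```

On small DataFrames the difference is negligible; on large ones the vectorized form is usually considerably faster.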
Multiple operations can be chained together to make the code more concise.
Example:
<code class="language-python">df['I'] = df['A'].add(df['B']).mul(df['C'])
print(df)
# Output:
#    A  B  C     D   E   F     G   H   I
# 0  1  4  5   Odd  25  17   Low   4  25
# 1  2  5  7  Even  49  29   Low  10  49
# 2  3  6  9   Odd  81  45  High  18  81</code>
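Besides chaining, the method forms such as `add()` accept a `fill_value` argument that the `+` operator has no way to express. The sketch below, using made-up data with one missing value, shows the difference.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1.0, np.nan, 3.0], 'B': [4.0, 5.0, 6.0]})

# With the operator, a missing value makes the result missing too
plain = df['A'] + df['B']

# With add(), a value missing on one side is replaced by fill_value first
filled = df['A'].add(df['B'], fill_value=0)

df['sum_plain'] = plain
df['sum_filled'] = filled
print(df)
```

If a value is missing on both sides, the result is still missing even with `fill_value`.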
Use assign() to create or modify multiple columns in one call.
Example:
<code class="language-python">df = df.assign(
    J=df['A'] + df['B'],
    K=lambda x: x['J'] * 2  # The lambda receives the DataFrame with 'J' already added
)
print(df)
# Output:
#    A  B  C     D   E   F     G   H   I  J   K
# 0  1  4  5   Odd  25  17   Low   4  25  5  10
# 1  2  5  7  Even  49  29   Low  10  49  7  14
# 2  3  6  9   Odd  81  45  High  18  81  9  18</code>
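Unlike `df['column'] = expression`, `assign()` does not modify the DataFrame in place; it returns a new one. That is why the example reassigns the result to `df`. A short sketch on a fresh DataFrame, with the result stored under the illustrative name `result`:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

result = df.assign(
    J=df['A'] + df['B'],
    K=lambda x: x['J'] * 2,  # can refer to 'J' created earlier in the same call
)

print(list(df.columns))      # the original DataFrame is unchanged
print(list(result.columns))  # the new columns exist only on the returned copy
```

This makes `assign()` convenient inside method chains, where each step returns a new DataFrame.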
Dynamically create column names based on external input.
Example:
<code class="language-python">columns_to_add = ['L', 'M']
for col in columns_to_add:
    df[col] = df['A'] + df['B']
print(df)</code>
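When both the column names and the values come from external input, one possible approach is to build a dictionary and unpack it into `assign()`. In this sketch the `factors` dictionary and the column names `A_x2` and `A_x3` are hypothetical stand-ins for that input.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Hypothetical external input: new column name -> multiplier
factors = {'A_x2': 2, 'A_x3': 3}

# Build every new column first, then add them in a single call
new_columns = {name: df['A'] * factor for name, factor in factors.items()}
df = df.assign(**new_columns)
print(df)
```

Keyword unpacking requires the names to be strings; for other label types, the loop shown above is the simpler choice.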
Assign values to columns based on an external DataFrame or dictionary.
Example:
<code class="language-python">mapping = {1: 'Low', 2: 'Medium', 3: 'High'}
df['N'] = df['A'].map(mapping)  # Look up each value of 'A' in the dictionary
print(df)
# Output:
#    A  B  C     D   E   F     G   H   I  J   K  L  M       N
# 0  1  4  5   Odd  25  17   Low   4  25  5  10  5  5     Low
# 1  2  5  7  Even  49  29   Low  10  49  7  14  7  7  Medium
# 2  3  6  9   Odd  81  45  High  18  81  9  18  9  9    High</code>
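One thing to watch with `map()` and a dictionary: values that have no key in the dictionary become missing (NaN) rather than raising an error. The sketch below uses a made-up value, 4, that is absent from the mapping, and an illustrative 'Unknown' label as the fallback.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 4]})
mapping = {1: 'Low', 2: 'Medium', 3: 'High'}

# 4 has no entry in the mapping, so that row becomes missing
df['N'] = df['A'].map(mapping)

# Replace missing results with an explicit fallback label
df['N_filled'] = df['N'].fillna('Unknown')
print(df)
```

Checking `df['N'].isna().sum()` after a mapping step is a quick way to spot keys the dictionary does not cover.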
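Using built-in Pandas operations (vectorized operations in particular, and apply where no vectorized form exists) generally performs better than explicit Python loops.
The df['column'] = expression syntax is a core feature of Pandas and has a wide range of uses. It allows you to create new columns, replace existing ones, and derive values from calculations, conditions, custom functions, and mappings.
This makes Pandas a powerful data manipulation and analysis library.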