How to Apply a Function to Multiple Columns in a Pandas DataFrame?

Susan Sarandon
2024-12-08 15:16:12

Applying Functions to Multiple Columns of a Pandas DataFrame

Suppose we have a dataset in a Pandas dataframe with multiple columns, and we want to apply a custom function to two of those columns, row by row. This is a common task in data manipulation and analysis. Here's a step-by-step guide:

1. Define the Function:

Define a custom function that takes two inputs, representing the values from the two columns. This function should perform the desired operation on these inputs.
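
As a minimal sketch of this step, the function below is plain Python and can be checked on ordinary values before any dataframe is involved. The name combine_values and the arithmetic inside it are hypothetical, chosen only for illustration:

```python
def combine_values(a, b):
    # Hypothetical operation: any logic that uses both inputs would do here.
    return a + 2 * b

# Check the function on plain scalars first.
result = combine_values(1, 3)
```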

2. Apply the Function Using Lambda:

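Pandas provides the DataFrame.apply() method, which calls a function once per row when axis=1 is passed. Lambda itself is a Python feature, not a Pandas one: wrapping our custom function in a lambda lets us pick the two columns out of each row and pass them as separate arguments.

The general pattern for applying a function to two columns with apply() and a lambda is: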

df['new_column_name'] = df.apply(lambda x: your_function(x.column_1, x.column_2), axis=1)

Where:

  • new_column_name is the name of the new column that will store the results of the function.
  • your_function is the user-defined function that takes two inputs and returns the desired output.
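  • x represents each row of the dataframe as a Series, and x.column_1 and x.column_2 access the values from the specified columns. Attribute access only works when the column name is a valid Python identifier; otherwise use x['column name'].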
  • axis=1 indicates that the function is applied to each row, not each column.
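
To make the pattern concrete, here is a small sketch. The dataframe, the column names price and quantity, and the function total_cost are all assumptions made up for this illustration:

```python
import pandas as pd

# Hypothetical data, used only to illustrate the pattern.
df = pd.DataFrame({'price': [10, 20, 30], 'quantity': [1, 2, 3]})

def total_cost(price, quantity):
    # Receives one value from each column for a single row.
    return price * quantity

# axis=1 passes each row to the lambda as a Series.
df['total'] = df.apply(lambda x: total_cost(x.price, x.quantity), axis=1)

# Bracket access is equivalent and also works for column names
# that are not valid Python identifiers.
df['total_alt'] = df.apply(lambda x: total_cost(x['price'], x['quantity']), axis=1)
```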

3. Example:

Consider the following example dataframe:

import pandas as pd

df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})

Suppose we want to create a new column called 'col_3' that contains a sublist of a list mylist, running from the index in col_1 to the index in col_2, inclusive. We can define mylist and a function get_sublist as follows:

mylist = ['a', 'b', 'c', 'd', 'e', 'f']

def get_sublist(sta, end):
    # end + 1 makes the slice include the element at index end
    return mylist[sta:end + 1]

Now, we can apply this function using lambda as:

df['col_3'] = df.apply(lambda x: get_sublist(x.col_1, x.col_2), axis=1)

This creates a new column 'col_3' in the dataframe with the desired sublists.
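
Putting the pieces of this example together, a self-contained version might look like the sketch below. It assumes pandas is installed and imported as pd, and it stores the source list in a variable named mylist. The values in the final comment follow from the inclusive slice sta:end+1:

```python
import pandas as pd

mylist = ['a', 'b', 'c', 'd', 'e', 'f']

df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})

def get_sublist(sta, end):
    # Slice from index sta up to and including index end.
    return mylist[sta:end + 1]

# Each row is passed to the lambda as a Series; the result is a Series of lists.
df['col_3'] = df.apply(lambda x: get_sublist(x.col_1, x.col_2), axis=1)

print(df['col_3'].tolist())
# [['a', 'b'], ['c', 'd', 'e'], ['d', 'e', 'f']]
```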

4. Alternatives:

Using a lambda is a concise way to apply a function to multiple dataframe columns. If you prefer something more explicit, you can pass a named function directly to apply(). With axis=1, that function receives each row as a Series, reads the columns it needs, and returns the result for that row.
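
One way this alternative could look is sketched below, reusing the example data from step 3. The wrapper name sublist_for_row is hypothetical; it simply unpacks the two columns from the row and delegates to get_sublist:

```python
import pandas as pd

mylist = ['a', 'b', 'c', 'd', 'e', 'f']

df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})

def get_sublist(sta, end):
    # Slice from index sta up to and including index end.
    return mylist[sta:end + 1]

def sublist_for_row(row):
    # row is a Series holding one row of df; index it by column name.
    return get_sublist(row['col_1'], row['col_2'])

# No lambda needed: apply() calls sublist_for_row once per row.
df['col_3'] = df.apply(sublist_for_row, axis=1)
```

The result is the same as with the lambda version; the named function is mainly easier to document and to test on its own.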
