How to Split a Pandas DataFrame Based on a Column Value Threshold?

By DDD (original), 2024-10-19 22:30:29, 766 views

Splitting a Pandas DataFrame by a Column Value

Suppose you have a DataFrame with a 'Sales' column and want to split it into two DataFrames around a threshold: one holding the rows where 'Sales' is greater than or equal to the threshold, and the other holding the rows where 'Sales' is below it.

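You can do this with boolean indexing in Pandas: comparing a column to a value yields a Series of True/False values, and indexing the DataFrame with that Series keeps only the rows marked True. Here's an example: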

<code class="python">import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({'Sales': [10, 20, 30, 40, 50], 'A': [3, 4, 7, 6, 1]})
print(df)

# Set the threshold (s)
s = 30

# Rows where 'Sales' is greater than or equal to the threshold
df1 = df[df['Sales'] >= s]
print(df1)

# Rows where 'Sales' is below the threshold
df2 = df[df['Sales'] < s]
print(df2)</code>

Output (the full DataFrame, then df1, then df2; columns appear in the order they were defined):

   Sales  A
0     10  3
1     20  4
2     30  7
3     40  6
4     50  1

   Sales  A
2     30  7
3     40  6
4     50  1

   Sales  A
0     10  3
1     20  4
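
The filtering above relies on the comparison itself returning a boolean Series. As a small sketch using the same sample data (the name `is_high` is only illustrative and does not come from the original example), printing that Series shows which rows the condition keeps:

```python
import pandas as pd

df = pd.DataFrame({'Sales': [10, 20, 30, 40, 50], 'A': [3, 4, 7, 6, 1]})
s = 30

# The comparison yields one True/False value per row
is_high = df['Sales'] >= s
print(is_high.tolist())  # [False, False, True, True, True]

# Indexing with the boolean Series keeps only the rows marked True
df1 = df[is_high]
print(df1.index.tolist())  # [2, 3, 4]
```

Note that the filtered DataFrame keeps the original index labels (2, 3, 4) rather than renumbering from zero.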

Alternatively, you can store the condition in a mask and use the ~ operator, which inverts a boolean mask, to select the remaining rows:

<code class="python">mask = df['Sales'] >= s
df1 = df[mask]
df2 = df[~mask]
print(df1)
print(df2)</code>

For this data, this produces the same df1 and df2 as the previous example, and the comparison only has to be written and evaluated once.
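
If you need this split in several places, the mask approach can be wrapped in a small function. The sketch below is one possible way to do that; `split_by_threshold` is a hypothetical helper name, not a Pandas function. It also illustrates one case where the two approaches differ: a missing value (NaN) fails every comparison, so with `~mask` a NaN row ends up in the "below" half, while an explicit `<` comparison leaves it out of both halves.

```python
import numpy as np
import pandas as pd


def split_by_threshold(df, column, threshold):
    """Hypothetical helper: return (at_or_above, rest) for the given column."""
    mask = df[column] >= threshold
    return df[mask], df[~mask]


df = pd.DataFrame({'Sales': [10, 20, 30, 40, 50], 'A': [3, 4, 7, 6, 1]})
high, low = split_by_threshold(df, 'Sales', 30)
print(high['Sales'].tolist())  # [30, 40, 50]
print(low['Sales'].tolist())   # [10, 20]

# With a missing value: NaN >= 30 is False, so ~mask selects the NaN row
df_nan = pd.DataFrame({'Sales': [10, np.nan, 50]})
high_nan, low_nan = split_by_threshold(df_nan, 'Sales', 30)
print(len(high_nan), len(low_nan))  # 1 2

# An explicit < comparison is also False for NaN, so that row is dropped
below_only = df_nan[df_nan['Sales'] < 30]
print(len(below_only))  # 1
```

Which behavior you want depends on your data; if 'Sales' can contain missing values, it may be worth deciding up front whether those rows belong in either half.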
