How to Select Specific Rows in a Pandas DataFrame Based on Column Values?

Patricia Arquette
2024-12-30 15:30:13

Selecting Rows Based on Column Values in Pandas

In Pandas, you select rows by column value with Boolean indexing: a comparison on a column produces a Boolean Series, and passing that Series to df.loc keeps only the rows where it is True.

Comparing Column Values

To select rows where a column value matches a specific scalar value, use the == operator:

df.loc[df['column_name'] == some_value]

To select rows where a column value is one of several values in a list or other iterable, use the isin method:

df.loc[df['column_name'].isin(some_values)]
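As a minimal sketch of both forms, the snippet below builds a small made-up DataFrame (the column names "city" and "sales" and their values are assumptions for illustration only) and applies a scalar match and a membership test.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo", "Kyiv"],
    "sales": [10, 20, 30, 40],
})

# Scalar match: the comparison yields a Boolean Series used as a row mask
oslo = df.loc[df["city"] == "Oslo"]

# Membership test: keep rows whose value appears in the given list
lima_or_kyiv = df.loc[df["city"].isin(["Lima", "Kyiv"])]

print(oslo)
print(lima_or_kyiv)
```

Both results keep the original row labels, so the index of the filtered frames is no longer a contiguous range.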

Combining Conditions

Multiple conditions can be combined using the & operator to select rows that satisfy all conditions (use | instead to select rows that satisfy at least one of them):

df.loc[(df['column_name'] >= A) & (df['column_name'] <= B)]

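Note that the parentheses are required: & binds more tightly than comparison operators such as >= and <=, so without them Python would evaluate the & before the comparisons.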
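The following sketch shows a range filter written with &, its complement written with |, and the same range expressed with Series.between. The column name "score" and the bounds low and high (standing in for A and B) are assumptions made for the example.

```python
import pandas as pd

# Hypothetical data and bounds, for illustration only
df = pd.DataFrame({"score": [3, 7, 12, 18, 25]})
low, high = 5, 20

# Both conditions must hold; each comparison is wrapped in parentheses
in_range = df.loc[(df["score"] >= low) & (df["score"] <= high)]

# At least one condition must hold
out_of_range = df.loc[(df["score"] < low) | (df["score"] > high)]

# Series.between is inclusive on both ends by default
also_in_range = df.loc[df["score"].between(low, high)]

print(in_range)
print(out_of_range)
```

For a simple closed interval, between avoids the precedence issue altogether because there is only one expression inside df.loc.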

Negating Conditions

To select rows that do not match a certain value or are not in a specific list, negate the condition using != or ~:

df.loc[df['column_name'] != some_value]
df = df.loc[~df['column_name'].isin(some_values)]  # .loc returns a new DataFrame; reassign to df to keep only these rows
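
A short sketch of negation, using a made-up DataFrame (the "status" and "id" columns are assumptions for illustration). It also shows that ~ can negate a whole compound condition when that condition is wrapped in parentheses.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "status": ["new", "done", "open", "done"],
    "id": [1, 2, 3, 4],
})

# Rows whose status is anything other than "done"
not_done = df.loc[df["status"] != "done"]

# Rows whose status is not in the list; ~ inverts the Boolean mask
not_listed = df.loc[~df["status"].isin(["new", "done"])]

# ~ applied to a compound condition: neither "new" nor id greater than 3
neither = df.loc[~((df["status"] == "new") | (df["id"] > 3))]

print(not_done)
print(not_listed)
print(neither)
```

None of these statements changes df itself; each one returns a new, filtered DataFrame.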

Index Optimization

If you filter on the same column repeatedly, it can help to make that column the index. You can then select rows by label with df.loc, which can be faster than building a Boolean mask each time, particularly when the index is sorted:

df = df.set_index(['B'])
df.loc['one']
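
A minimal sketch of label-based lookup after set_index, on a small made-up frame (the data is an assumption for illustration). Sorting the index is optional here; it is included because lookups on a sorted index are generally the faster case.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "B": ["one", "one", "two", "three"],
    "C": [0, 1, 2, 3],
})

# Move column "B" into the index and sort it
indexed = df.set_index("B").sort_index()

# A repeated label returns every matching row as a DataFrame
ones = indexed.loc["one"]

# Passing a list of labels always returns a DataFrame, even for one match
three = indexed.loc[["three"]]

# reset_index turns "B" back into an ordinary column
restored = indexed.reset_index()

print(ones)
print(three)
```

Note that after set_index, "B" is no longer a column, so expressions such as df['B'] stop working until the index is reset.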

Examples

Consider the following DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C': np.arange(8), 'D': np.arange(8) * 2})

To select rows where column 'A' equals 'foo':

print(df.loc[df['A'] == 'foo'])

To select rows where column 'B' is in ['one', 'three']:

print(df.loc[df['B'].isin(['one','three'])])

To select rows where 'B' is 'one' or 'two' after making 'B' the index:

df = df.set_index(['B'])
print(df.loc[df.index.isin(['one','two'])])
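
For comparison, the sketch below runs the same 'one' or 'two' selection two ways on the example DataFrame: once against column 'B' and once against 'B' as the index. The variable names by_column, indexed, and by_index are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                   'C': np.arange(8), 'D': np.arange(8) * 2})

# Filter on the column directly; 'B' stays an ordinary column
by_column = df.loc[df['B'].isin(['one', 'two'])]

# Filter on the index after moving 'B' into it
indexed = df.set_index('B')
by_index = indexed.loc[indexed.index.isin(['one', 'two'])]

# Both approaches select the same rows in the same order
print(by_column['C'].tolist() == by_index['C'].tolist())
```

The selected rows are the same; the difference is only whether 'B' appears as a column or as the row labels of the result.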
