
How to Retrieve Rows with Unique Values in a Pandas DataFrame?

Mary-Kate Olsen, original: 2024-11-04 04:11:30

Retrieving Rows by Distinct Column Values: A Comprehensive Guide

Data cleaning often requires keeping a single row for each distinct value in a particular column. This article shows how to do that with the widely used Pandas library in Python.

Query:

Consider a dataset with two columns, COL1 and COL2, as shown below:

COL1   COL2
a.com  22
b.com  45
c.com  34
e.com  45
f.com  56
g.com  22
h.com  45

The goal is to keep one row for each distinct value in COL2, retaining the first row in which each value appears. The expected output is:

COL1  COL2
a.com 22
b.com 45
c.com 34
f.com 56

Solution:

The drop_duplicates method in Pandas provides a straightforward way to eliminate duplicate rows based on one or more columns. Here's how to utilize it for this specific task:

<code class="python">import pandas as pd

df = pd.DataFrame({'COL1': ['a.com', 'b.com', 'c.com', 'e.com', 'f.com', 'g.com', 'h.com'],
                  'COL2': [22, 45, 34, 45, 56, 22, 45]})

# Keep only the first occurrence of each unique value in COL2
df = df.drop_duplicates('COL2')

print(df)</code>

Output:

    COL1  COL2
0  a.com    22
1  b.com    45
2  c.com    34
4  f.com    56
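
The index labels 0, 1, 2, 4 in the output above are carried over from the original DataFrame. If a consecutive index is preferred, the `ignore_index=True` argument (available in pandas 1.0 and later) renumbers the result. The following is a minimal sketch on the same data; the variable name `deduped` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'COL1': ['a.com', 'b.com', 'c.com', 'e.com', 'f.com', 'g.com', 'h.com'],
                   'COL2': [22, 45, 34, 45, 56, 22, 45]})

# subset= names the column(s) to compare; a list such as ['COL1', 'COL2'] also works.
# ignore_index=True relabels the surviving rows 0..n-1.
deduped = df.drop_duplicates(subset='COL2', ignore_index=True)

print(deduped)
```

On pandas versions older than 1.0, chaining `.reset_index(drop=True)` after `drop_duplicates` should give the same result.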

Additional Options:

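The keep parameter of drop_duplicates controls which occurrence of a repeated value survives. The default, keep='first', is what the example above uses. The other settings are:

keep='last': Retain the last occurrence of each value instead of the first.
keep=False: Drop every row whose value appears more than once, leaving only values that occur exactly once.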

Here are examples demonstrating these options:

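<code class="python"># Start again from the original, unfiltered data (df was overwritten above)
df = pd.DataFrame({'COL1': ['a.com', 'b.com', 'c.com', 'e.com', 'f.com', 'g.com', 'h.com'],
                  'COL2': [22, 45, 34, 45, 56, 22, 45]})

# Keep only the last occurrence of each unique value in COL2
last_rows = df.drop_duplicates('COL2', keep='last')

# Drop every row whose COL2 value appears more than once
unique_rows = df.drop_duplicates('COL2', keep=False)</code>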
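
To make the difference between the two settings concrete, the sketch below applies each one to the original seven-row data and prints the surviving COL1 values. It also shows, as an aside, that a boolean mask built with `DataFrame.duplicated` selects the same rows as `keep=False`. The variable names (`last_kept`, `strictly_unique`, `mask`) are chosen for illustration only:

```python
import pandas as pd

df = pd.DataFrame({'COL1': ['a.com', 'b.com', 'c.com', 'e.com', 'f.com', 'g.com', 'h.com'],
                   'COL2': [22, 45, 34, 45, 56, 22, 45]})

# keep='last': for each COL2 value, the final row carrying it survives
last_kept = df.drop_duplicates(subset='COL2', keep='last')
print(last_kept['COL1'].tolist())        # ['c.com', 'f.com', 'g.com', 'h.com']

# keep=False: only values that occur exactly once survive (34 and 56 here)
strictly_unique = df.drop_duplicates(subset='COL2', keep=False)
print(strictly_unique['COL1'].tolist())  # ['c.com', 'f.com']

# The same selection expressed as a boolean mask
mask = ~df.duplicated(subset='COL2', keep=False)
print(df[mask].equals(strictly_unique))  # True
```

The mask form can be handy when the duplicate rows themselves need inspecting, since `df[~mask]` returns exactly the rows that `keep=False` discards.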
