How to Extract Rows with Distinct Values in a Pandas DataFrame?

Barbara Streisand
2024-11-04 07:51:01

Distinct Values Row Retrieval

To keep only the rows with distinct values in a given column (COL2 in this example), call drop_duplicates on that column. The keep parameter decides which of the rows sharing a value survive:

  1. drop_duplicates with keep='first':

    df = df.drop_duplicates('COL2', keep='first')

    This retains the first occurrence of each unique value in COL2. Because 'first' is the default for keep, the argument can be omitted.

  2. drop_duplicates with keep='last':

    df = df.drop_duplicates('COL2', keep='last')

    This retains the last occurrence of each unique value in COL2.

  3. drop_duplicates with keep=False:

    df = df.drop_duplicates('COL2', keep=False)

    This drops every row whose COL2 value appears more than once, leaving only the rows whose COL2 value occurs exactly once.
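
As a rough illustration of the three options, the sketch below runs them on a tiny made-up frame. The data and the variable names first, last and none_kept are assumptions for illustration only. It also shows that drop_duplicates returns a new DataFrame, so the original is unchanged unless the result is assigned back.

```python
import pandas as pd

# Hypothetical data, used only to show the shape of the calls.
df = pd.DataFrame({"COL1": ["x", "y", "z"], "COL2": [1, 1, 2]})

# Each call returns a new DataFrame; df itself is not modified.
first = df.drop_duplicates(subset="COL2", keep="first")     # keeps x and z
last = df.drop_duplicates(subset="COL2", keep="last")       # keeps y and z
none_kept = df.drop_duplicates(subset="COL2", keep=False)   # keeps only z

print(len(df), len(first), len(last), len(none_kept))
```

Passing the column through the subset keyword is equivalent to passing it as the first positional argument, as the calls above do.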

Example:

Consider the following dataframe:

COL1 COL2
a.com 22
b.com 45
c.com 34
e.com 45
f.com 56
g.com 22
h.com 45
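
One possible way to build this dataframe, assuming COL2 holds integers, and to check which COL2 values repeat before removing anything:

```python
import pandas as pd

# The example data from the table above; integer dtype for COL2 is an assumption.
df = pd.DataFrame({
    "COL1": ["a.com", "b.com", "c.com", "e.com", "f.com", "g.com", "h.com"],
    "COL2": [22, 45, 34, 45, 56, 22, 45],
})

# Count how often each COL2 value occurs; a count above 1 marks a duplicated value.
counts = df["COL2"].value_counts()
print(counts)
```

Here 45 occurs three times and 22 twice, while 34 and 56 each occur once, which explains which rows each keep option removes.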

Using keep='first' produces:

COL1 COL2
a.com 22
b.com 45
c.com 34
f.com 56

Using keep='last' yields:

COL1 COL2
c.com 34
f.com 56
g.com 22
h.com 45

Lastly, using keep=False produces:

COL1 COL2
c.com 34
f.com 56
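
The three results above can be reproduced with a short script along these lines. This is a sketch: the variable names kept_first, kept_last and kept_none are illustrative, and COL2 is assumed to hold integers.

```python
import pandas as pd

# Rebuild the example data so the script runs on its own.
df = pd.DataFrame({
    "COL1": ["a.com", "b.com", "c.com", "e.com", "f.com", "g.com", "h.com"],
    "COL2": [22, 45, 34, 45, 56, 22, 45],
})

kept_first = df.drop_duplicates(subset="COL2", keep="first")
kept_last = df.drop_duplicates(subset="COL2", keep="last")
kept_none = df.drop_duplicates(subset="COL2", keep=False)

# Print each result without the index column to match the tables above.
for label, result in [("first", kept_first), ("last", kept_last), ("False", kept_none)]:
    print(f"keep={label}")
    print(result.to_string(index=False))
```

Each result keeps the row labels of the original dataframe; passing ignore_index=True to drop_duplicates renumbers them from 0 instead.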
