How to Remove Duplicate Columns in Pandas?

Linda Hamilton
2024-11-01 20:17:02

If you're dealing with a DataFrame that has duplicate columns, you may want to remove them for data consistency or analysis purposes. Here's a straightforward solution to achieve that:

<code class="python">df = df.loc[:,~df.columns.duplicated()].copy()</code>

Mechanism:

  • df.columns.duplicated() returns a Boolean array with one entry per column. With the default keep='first', False marks the first occurrence of each name and True marks every later repeat.
  • Applying ~ (logical negation) flips this array, selecting only the non-duplicated columns.
  • df.loc[:,...] uses Boolean indexing to select these non-duplicated columns, effectively removing the duplicates.
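  • The copy() call makes the result an independent DataFrame, which can avoid a SettingWithCopyWarning on later assignments in pandas versions that emit it. Because the result is assigned back to df, the name df now refers to the de-duplicated frame; any other reference to the original object is left unchanged.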

Note: This method checks for duplicates based on column names, not column values.
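
As a minimal sketch of the mechanism above, the following example builds a small DataFrame and applies the same expression. The sample data and the variable names mask and deduped are made up for illustration. Column "c" repeats the values of "b" but has its own name, so it is kept.

```python
import pandas as pd

# Hypothetical sample data: the name "a" appears twice,
# and "c" holds the same values as "b" under a different name.
df = pd.DataFrame(
    [[1, 2, 10, 2], [3, 4, 30, 4]],
    columns=["a", "b", "a", "c"],
)

# True marks the second and later occurrences of a column name.
mask = df.columns.duplicated()
print(mask.tolist())

# Keep only the columns whose name has not appeared before.
deduped = df.loc[:, ~mask].copy()
print(deduped.columns.tolist())
```

Here the second "a" column (values 10 and 30) is dropped, while "c" survives because only names are compared.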

Alternative Approaches:

Removing Duplicate Indexes:

<code class="python">df = df.loc[~df.index.duplicated(),:].copy()</code>

This uses the same mechanism on the row axis: it drops rows whose index label has already appeared, keeping the first row for each label. It compares index labels only, not row values.
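
The index version can be sketched the same way. The sample data and the names deduped and last are assumptions for illustration; the keep="last" variant relies on the keep parameter of Index.duplicated.

```python
import pandas as pd

# Hypothetical sample data: the index label "x" appears twice.
df = pd.DataFrame({"value": [10, 20, 30]}, index=["x", "y", "x"])

# Default keep="first": the first row for each label is retained.
deduped = df.loc[~df.index.duplicated(), :].copy()
print(deduped)

# keep="last" retains the final row for each label instead.
last = df.loc[~df.index.duplicated(keep="last"), :].copy()
print(last)
```

Which occurrence to keep depends on the data; for example, keep="last" may suit an append-only log where later rows supersede earlier ones.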

Removing Duplicates by Values (Cautionary):

<code class="python">df = df.loc[:,~df.apply(lambda x: x.duplicated(),axis=1).all()].copy()</code>

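With axis=1, the lambda runs on each row, so x.duplicated() flags a value that already appeared in an earlier column of the same row. Calling .all() then marks a column for removal only if it was flagged in every row. Use this with caution: the matching earlier column can differ from row to row, so a column may be dropped even though it is not an exact copy of any single column.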
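
The sketch below illustrates why caution is needed, using made-up data. Column "d" is an exact copy of "a", while "c" merely borrows one value from "a" and one from "b". For comparison, it also shows a different technique that is not part of the snippet above: transposing, calling drop_duplicates(), and transposing back, which compares whole columns. Note that transposing a frame with mixed dtypes can upcast its values, so treat it as an illustration rather than a general recommendation.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "a": [1, 2],
    "b": [5, 6],
    "c": [1, 6],   # row 0 matches "a", row 1 matches "b"
    "d": [1, 2],   # exact copy of "a"
})

# Row-wise check from the article: flags "c" as well as "d".
flagged = df.apply(lambda x: x.duplicated(), axis=1).all()
by_row = df.loc[:, ~flagged].copy()
print(by_row.columns.tolist())

# Whole-column comparison via transpose (a swapped-in alternative).
by_transpose = df.T.drop_duplicates().T
print(by_transpose.columns.tolist())
```

On this data the row-wise check drops both "c" and "d", whereas the transpose-based comparison drops only "d".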
