How to Efficiently Drop Consecutive Duplicates in Pandas?

Mary-Kate Olsen
2024-11-13 17:29:02

Efficient Dropping of Consecutive Duplicates in Pandas

When working with pandas, it's often necessary to remove duplicate values. The built-in drop_duplicates() method, however, removes every repeat of a value wherever it occurs (keeping the first occurrence by default), not only repeats that sit next to each other. For cases where only consecutive duplicates need to be dropped, a different approach is needed, and there are efficient ways to do it.
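To make the difference concrete, here is a minimal sketch. The sample values and index are assumptions chosen for illustration; the value 2 appears in a run and then again later on.

```python
import pandas as pd

# Hypothetical sample data: 2 is repeated consecutively and also reappears later.
a = pd.Series([1, 2, 2, 3, 2], index=[1, 2, 3, 4, 5])

# drop_duplicates() keeps the first occurrence of each value and drops
# every later repeat, whether or not it is adjacent to the first one.
deduped = a.drop_duplicates()
print(deduped.tolist())        # [1, 2, 3]
print(deduped.index.tolist())  # [1, 2, 4]
```

The trailing 2 at index 5 is dropped even though it does not directly follow another 2, which is not what is wanted when only consecutive repeats should go.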

One approach uses the shift() function. Comparing a Series a against a shifted copy of itself (here a.shift(-1), which lines each value up with the one that follows it) produces a boolean mask that is False wherever a value is repeated in the next row. The mask can then be used to select one value from each run of duplicates:

a.loc[a.shift(-1) != a]
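The expression above can be traced on a small Series. This is a sketch with assumed sample values; the variable names mask_next and kept_last are illustrative only.

```python
import pandas as pd

# Hypothetical sample data with one run of consecutive duplicates (index 2 and 3).
a = pd.Series([1, 2, 2, 3, 2], index=[1, 2, 3, 4, 5])

# shift(-1) moves every value up one row, so each row is compared with
# the row that follows it; the last row is compared with NaN.
mask_next = a.shift(-1) != a
print(mask_next.tolist())        # [True, False, True, True, True]

kept_last = a.loc[mask_next]
print(kept_last.tolist())        # [1, 2, 3, 2]
print(kept_last.index.tolist())  # [1, 3, 4, 5]
```

In this sample the surviving 2 carries index 3, not index 2: with a period of -1, the last element of each run is the one that is kept.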

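Another method uses the diff() function, which computes the difference between each row and the previous one; a difference of 0 marks a consecutive duplicate. It requires values that can be subtracted (numeric data, for example), and according to the original answer it is slower than the shift() method on large datasets.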

Using:

a.loc[a.diff() != 0]
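A short sketch of the diff() variant on numeric data follows. The sample values are assumed, and step and kept_first are illustrative names.

```python
import pandas as pd

# Hypothetical numeric sample data.
a = pd.Series([1, 2, 2, 3, 2], index=[1, 2, 3, 4, 5])

# diff() subtracts the previous row from each row. The first row has no
# predecessor, so its difference is NaN, and NaN != 0 evaluates to True.
step = a.diff()
kept_first = a.loc[step != 0]
print(kept_first.tolist())        # [1, 2, 3, 2]
print(kept_first.index.tolist())  # [1, 2, 4, 5]
```

Because the comparison is with the previous row, this form keeps the first element of each run.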

The original answer suggested shift() with a period of -1, but that compares each value with the next one and therefore keeps the last value of each run. To keep the first value instead, compare against the previous row with shift(1), or simply shift(), since the default period is 1:

a.loc[a.shift(1) != a]
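As a minimal sketch, the same idea applied to a Series and then to a DataFrame filtered on one column. The sample data and the column names key and val are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data.
a = pd.Series([1, 2, 2, 3, 2], index=[1, 2, 3, 4, 5])

# shift() uses periods=1 by default: each row is compared with the row
# before it, so the first element of every run survives.
kept_first = a.loc[a.shift() != a]
print(kept_first.index.tolist())  # [1, 2, 4, 5]

# For a DataFrame, build the mask from a single column (here the
# hypothetical column "key") and use it to select whole rows.
df = pd.DataFrame({"key": [1, 2, 2, 3, 2], "val": list("abcde")})
df_kept = df.loc[df["key"].shift() != df["key"]]
print(df_kept["val"].tolist())    # ['a', 'b', 'd', 'e']
```

One caveat worth keeping in mind: NaN never compares equal to NaN, so a run of consecutive missing values is not collapsed by this comparison.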

Both shift() and diff() offer efficient ways to drop consecutive duplicates in pandas; which one to use depends on the data type and on performance requirements.
