Home >Backend Development >Python Tutorial >How to Efficiently Select DataFrame Rows Within a Specific Date Range in Python?

How to Efficiently Select DataFrame Rows Within a Specific Date Range in Python?

Patricia Arquette
Patricia ArquetteOriginal
2024-12-13 13:11:09299browse

How to Efficiently Select DataFrame Rows Within a Specific Date Range in Python?

Selecting DataFrame Rows Within a Date Range

In Python, DataFrames are a powerful tool for storing and manipulating tabular data. One commonly encountered task is filtering rows based on a specified date range. To accomplish this, we have the following options:

Method 1: Using a Boolean Mask

If your DataFrame contains a date column with datetime64[ns] data type, you can create a boolean mask to select rows that fall within the desired range:

# Ensure date column is a datetime64 series
df['date'] = pd.to_datetime(df['date']) 

# Create a boolean mask
mask = (df['date'] > start_date) & (df['date'] <= end_date)

# Select the sub-DataFrame
sub_df = df.loc[mask]

Method 2: Setting a DatetimeIndex

Another efficient approach is to set the date column as the index of the DataFrame, creating a DatetimeIndex:

df = df.set_index(['date'])

# Select rows using index slicing
sub_df = df.loc[start_date:end_date]

This method is particularly useful for frequent date-based selections, as it provides faster performance compared to using a boolean mask.

Example:

Consider the following DataFrame:

>>> df
            value  date
0          0.2   2021-06-01
1          0.3   2021-06-05
2          0.4   2021-06-10
3          0.5   2021-06-15

To select rows for the month of June 2021, we can use either method:

Method 1: Boolean Mask

mask = (df['date'] > '2021-06-01') & (df['date'] <= '2021-06-30')
sub_df = df.loc[mask]

Method 2: DatetimeIndex

df = df.set_index(['date'])
sub_df = df.loc['2021-06-01':'2021-06-30']

Both methods will return the following sub-DataFrame:

>>> sub_df
            value  date
0          0.2   2021-06-01
1          0.3   2021-06-05
2          0.4   2021-06-10
3          0.5   2021-06-15

The above is the detailed content of How to Efficiently Select DataFrame Rows Within a Specific Date Range in Python?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn