
How to group data by time interval in Python Pandas?

Data analysis is becoming an important part of every industry. Many organizations rely heavily on data to make strategic decisions, predict trends, and understand consumer behavior. In such an environment, Python's Pandas library stands out as a powerful tool, providing a wide range of functionality for manipulating, analyzing, and visualizing data. One of these features is grouping data by time intervals.

This article focuses on how to use Pandas to group data by time intervals. We'll cover the syntax, a step-by-step algorithm, two different approaches, and a fully executable code example for each approach.

Syntax

The method we will focus on is Pandas's groupby() function used together with pd.Grouper, which bins rows by time in much the same way as resampling. The syntax is as follows:

df.groupby(pd.Grouper(key='date', freq='min')).sum()

In this syntax:

  • df − Your DataFrame.

  • groupby(pd.Grouper()) − Function for grouping data.

  • key − The column you want to group by. Here, it's the 'date' column.

  • freq − Frequency of the interval ('min' stands for minutes, 'h' for hours, 'D' for days, and so on). The older aliases 'T' and 'H' are deprecated as of pandas 2.2.

  • sum() − Aggregation function applied to each group.
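To make the roles of freq and the aggregation function concrete, here is a minimal sketch that swaps both: hourly bins instead of minutes, and mean() instead of sum(). The sample readings are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data: one reading every 30 minutes for 4 hours
df = pd.DataFrame({
    'date': pd.date_range(start='2022-01-01', periods=8, freq='30min'),
    'value': [10, 20, 30, 40, 50, 60, 70, 80],
})

# Same pattern as the syntax above, but with hourly bins ('h')
# and mean() as the aggregation function
hourly_mean = df.groupby(pd.Grouper(key='date', freq='h')).mean()

print(hourly_mean)
```

Each hourly bin holds two readings, so every row of the result is the average of one pair.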

Algorithm

Here is a step-by-step algorithm for grouping data by time intervals:

  • Import the necessary libraries, namely Pandas.

  • Load or create your DataFrame.

  • Convert the date column to a datetime object, if it is not already converted.

  • Call groupby() with pd.Grouper on the date column, using the desired frequency.

  • Apply an aggregate function such as sum() or mean().

  • Print or store the results.
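The steps above can be sketched as follows. This is only an illustration: the rows are made up, and the timestamps are deliberately stored as strings so that the conversion step has something to do.

```python
# Step 1: import the library
import pandas as pd

# Step 2: create a DataFrame (hypothetical data, timestamps stored as strings)
df = pd.DataFrame({
    'date': ['2022-01-01 09:00', '2022-01-01 17:30',
             '2022-01-02 08:15', '2022-01-02 12:00'],
    'value': [5, 7, 11, 13],
})

# Step 3: convert the date column to datetime objects
df['date'] = pd.to_datetime(df['date'])

# Steps 4 and 5: group by day with pd.Grouper, then aggregate with sum()
daily_totals = df.groupby(pd.Grouper(key='date', freq='D'))['value'].sum()

# Step 6: print (or store) the result
print(daily_totals)
```

Selecting the 'value' column before aggregating returns a Series rather than a DataFrame, which can be convenient when only one column is of interest.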

Approaches

We will consider two different approaches −

Method 1: Group by daily frequency

In this example, we create a DataFrame containing a series of dates and values. We then group the data by daily frequency and sum the values for each day.

Example

# Import pandas
import pandas as pd

# Create a dataframe
df = pd.DataFrame({
   'date': pd.date_range(start='1/1/2022', periods=100, freq='h'),
   'value': range(100)
})

# Convert 'date' to datetime object, if not already
df['date'] = pd.to_datetime(df['date'])

# Group by daily frequency
daily_df = df.groupby(pd.Grouper(key='date', freq='D')).sum()

print(daily_df)

Output

            value
date             
2022-01-01    276
2022-01-02    852
2022-01-03   1428
2022-01-04   2004
2022-01-05    390

Explanation

Importing the Pandas library is the first requirement for any data manipulation work, so that is where the code starts. Next, we build a DataFrame with pd.DataFrame(). It has two columns, 'date' and 'value'. The pd.date_range() function fills the 'date' column with 100 hourly timestamps starting on January 1, 2022, while the 'value' column contains the integers 0 through 99.

Although the 'date' column already holds datetime objects, we still pass it through pd.to_datetime() to make sure of it. This step matters because time-based grouping only works when the column has a datetime data type.
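To see why the conversion matters, the sketch below groups a column of date strings before and after calling pd.to_datetime(). The data is hypothetical, and the exact error message may differ between pandas versions; the assumption here is that a time-based Grouper on a non-datetime column raises a TypeError.

```python
import pandas as pd

# Hypothetical data where the dates are plain strings, not datetime objects
df = pd.DataFrame({
    'date': ['2022-01-01', '2022-01-01', '2022-01-02'],
    'value': [1, 2, 3],
})

# Check the column type before conversion
is_datetime_before = pd.api.types.is_datetime64_any_dtype(df['date'])

# Grouping by a time frequency on a string column is expected to fail
try:
    df.groupby(pd.Grouper(key='date', freq='D')).sum()
except TypeError as exc:
    print('Grouping failed:', exc)

# After conversion, the same call works
df['date'] = pd.to_datetime(df['date'])
is_datetime_after = pd.api.types.is_datetime64_any_dtype(df['date'])
result = df.groupby(pd.Grouper(key='date', freq='D')).sum()

print(result)
```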

After this, to group the data by daily ('D') frequency, we use the groupby() function combined with the pd.Grouper() function. After grouping, we use the sum() function to combine all 'value' elements belonging to the same day into a single total.

Finally, the grouped DataFrame is printed, showing the total of each day's values. The last day's total is smaller because only four hourly rows (96 through 99) fall on January 5.
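For comparison, the same daily totals can also be obtained with resample(), a different Pandas technique from the groupby() and pd.Grouper combination used in this article. The sketch below rebuilds the DataFrame from Method 1 and runs both approaches side by side; it is an alternative shown for illustration, not a replacement for the method above.

```python
import pandas as pd

# Same data as in Method 1: 100 hourly timestamps with values 0..99
df = pd.DataFrame({
    'date': pd.date_range(start='2022-01-01', periods=100, freq='h'),
    'value': range(100),
})

# Approach from this article: groupby() with pd.Grouper
grouped = df.groupby(pd.Grouper(key='date', freq='D')).sum()

# Alternative: move 'date' into the index and resample by day
resampled = df.set_index('date').resample('D').sum()

# Compare the daily totals produced by the two approaches
print(grouped['value'].tolist() == resampled['value'].tolist())
```

pd.Grouper has the advantage that the date column can stay a regular column, and it can be combined with other grouping keys in the same groupby() call.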

Method 2: Group by custom frequency, such as 15 minute intervals

Example

# Import pandas
import pandas as pd

# Create a dataframe
df = pd.DataFrame({
   'date': pd.date_range(start='1/1/2022', periods=100, freq='min'),
   'value': range(100)
})

# Convert 'date' to datetime object, if not already
df['date'] = pd.to_datetime(df['date'])

# Group by 15-minute frequency
custom_df = df.groupby(pd.Grouper(key='date', freq='15min')).sum()

print(custom_df)

Output

                     value
date                      
2022-01-01 00:00:00    105
2022-01-01 00:15:00    330
2022-01-01 00:30:00    555
2022-01-01 00:45:00    780
2022-01-01 01:00:00   1005
2022-01-01 01:15:00   1230
2022-01-01 01:30:00    945

Explanation

The second approach starts, like the first, by importing the Pandas library and creating a DataFrame. The DataFrame is the same as in the previous example; the only difference is that the 'date' column now contains one timestamp per minute.

The 'date' column must hold datetime objects for the grouping operation to work properly, and the pd.to_datetime() function ensures that it does.

Here we use pd.Grouper() inside the groupby() method to group with a custom frequency of 15 minutes (written '15min' in current pandas, '15T' in older versions). To aggregate the 'value' entries in each 15-minute interval, we use sum(), just as in the first approach.

The code finishes by printing the new grouped DataFrame, which shows the sum of the 'value' column for each 15-minute interval. The last interval contains only 10 rows (90 through 99), which is why its total is lower than the one before it.
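The algorithm section mentions that other aggregate functions can be applied as well. As a small sketch, the same 15-minute grouping can report several statistics at once through agg(); the choice of 'sum', 'mean' and 'count' here is only an example.

```python
import pandas as pd

# Same data as in Method 2: 100 one-minute timestamps with values 0..99
df = pd.DataFrame({
    'date': pd.date_range(start='2022-01-01', periods=100, freq='min'),
    'value': range(100),
})

# Group into 15-minute bins and compute several aggregates per bin
summary = (
    df.groupby(pd.Grouper(key='date', freq='15min'))['value']
      .agg(['sum', 'mean', 'count'])
)

print(summary)
```

The count column makes the partial last interval visible: it holds 10 rows, while every other interval holds 15.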

Conclusion

Pandas offers a wide variety of data operations, one of which is grouping data by time intervals. By using the groupby() function together with pd.Grouper, we can segment data by daily or custom frequencies, enabling efficient and flexible data analysis.

The ability to group data by time intervals lets analysts and businesses extract meaningful insights from their data. Whether it's calculating total sales per day, getting the average temperature per hour, or counting website hits every 15 minutes, grouping data by time intervals helps us better understand trends, patterns, and outliers in the data over time.

Remember, Python’s Pandas library is a powerful data analysis tool. Learning how to use its features, such as the groupby method, can help you become a more efficient and proficient data analyst or data scientist.

Statement
This article is reproduced from tutorialspoint. If there is any infringement, please contact admin@php.cn to have it removed.