Extracting records at a specified position after a groupby in Python
This article describes how to extract the record at a specified position within each group after a pandas groupby.
When analyzing or modeling data, the first step is to process the data and extract the information we need. The groupby usage introduced here makes that processing more convenient.
When we use groupby to extract information, we usually compute statistics (max, min, var, etc.) over each group. But what if we want the second record, or the second-to-last record, of each group? The first and last methods return the first and last record of each group, but they cover only the two ends. For a record at an arbitrary position, one straightforward approach is to write a small function ourselves and pass it to apply (pandas also provides GroupBy.nth, which this article does not use). The sections below show how to do this.
1) Data introduction
The action table has three columns: userid, actionType and actionTime, which hold the user id, the type of user action, and the time at which the action occurred.
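The table itself does not appear in the text, so the following is only a minimal sketch of what such a DataFrame could look like. The three column names come from the article; every row value, and the choice of a datetime type for actionTime, is an assumption made for illustration.

```python
import pandas as pd

# Hypothetical rows, invented for illustration; only the three column names
# (userid, actionType, actionTime) come from the article.
action = pd.DataFrame({
    "userid": [2, 1, 1, 3, 1, 2],
    "actionType": [1, 1, 2, 2, 2, 2],
    "actionTime": pd.to_datetime([
        "2018-01-01 11:00", "2018-01-01 08:00", "2018-01-02 10:15",
        "2018-01-02 07:20", "2018-01-01 09:30", "2018-01-03 12:45",
    ]),
})

# Rows keep their original order inside each group, so sorting by user and
# time first makes "second" and "second-to-last" mean what we expect.
action = action.sort_values(["userid", "actionTime"]).reset_index(drop=True)
print(action)
```

Sorting before grouping is a design choice of this sketch: positional extraction depends on row order, so an unsorted table would give positions that do not correspond to chronological order.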
2) Grouping operation
a = action.groupby('userid')
b = action.groupby('userid')['actionTime']
type(a)
type(b)
After grouping, we can see that the types of a and b are DataFrameGroupBy and SeriesGroupBy, respectively.
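As a quick check of the two types, and of the built-in first and last methods mentioned earlier, here is a small sketch. The sample rows are hypothetical and actionTime is simplified to integers; only the column names come from the article.

```python
import pandas as pd

# Hypothetical sample data; actionTime is simplified to integers here.
action = pd.DataFrame({
    "userid": [1, 1, 1, 2, 2, 3],
    "actionType": [1, 2, 2, 1, 2, 2],
    "actionTime": [10, 20, 30, 15, 25, 40],
})

a = action.groupby("userid")                 # groups whole rows
b = action.groupby("userid")["actionTime"]   # groups a single column

print(type(a).__name__)  # DataFrameGroupBy
print(type(b).__name__)  # SeriesGroupBy

# The built-in methods cover only the two ends of each group.
first_time = b.first()
last_time = b.last()
print(first_time)
print(last_time)
```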
3) Extracting records
① The second and second-to-last action time of each user
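import numpy as np
# second action time per user (NaN if the user has fewer than two records)
action.groupby('userid')['actionTime'].apply(lambda i: i.iloc[1] if len(i) > 1 else np.nan)
# second-to-last action time per user
action.groupby('userid')['actionTime'].apply(lambda i: i.iloc[-2] if len(i) > 1 else np.nan)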
② The second and second-to-last action time of a given action type for each user
action[action['actionType'] == 2].groupby('userid')['actionTime'].apply(lambda i: i.iloc[1] if len(i) > 1 else np.nan)
action[action['actionType'] == 2].groupby('userid')['actionTime'].apply(lambda i: i.iloc[-2] if len(i) > 1 else np.nan)
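One detail of filtering before grouping is worth illustrating: a user with no rows of the chosen action type disappears from the result altogether, rather than showing up as NaN. The sketch below uses hypothetical data (integer times, invented rows). The final reindex step is an optional addition of this sketch, not part of the original method.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: user 3 never performs action type 2.
action = pd.DataFrame({
    "userid": [1, 1, 1, 2, 2, 3],
    "actionType": [1, 2, 2, 1, 2, 1],
    "actionTime": [10, 20, 30, 15, 25, 40],
})

# Keep only action type 2, then group the remaining rows by user.
type2 = action[action["actionType"] == 2].groupby("userid")["actionTime"]

second = type2.apply(lambda s: s.iloc[1] if len(s) > 1 else np.nan)
second_last = type2.apply(lambda s: s.iloc[-2] if len(s) > 1 else np.nan)

print(second)  # user 3 is missing from the index

# Optional: restore every user, filling the missing ones with NaN.
second_all_users = second.reindex(action["userid"].unique())
print(second_all_users)
```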
PS: Some users may have only one record, in which case indexing directly with iloc[1] or iloc[-2] raises an error. That is why the length of each group is checked with if before the value is taken.
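The effect of that length check can be seen in a small sketch. The data is hypothetical: user 2 has a single record, so the unguarded version fails on that group while the guarded version returns NaN for it.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: user 2 has only one record.
action = pd.DataFrame({
    "userid": [1, 1, 1, 2],
    "actionType": [1, 2, 2, 1],
    "actionTime": [10, 20, 30, 15],
})
grouped = action.groupby("userid")["actionTime"]

# Without the guard, position 1 does not exist in user 2's group.
try:
    grouped.apply(lambda s: s.iloc[1])
    failed = False
except IndexError as exc:
    failed = True
    print("unguarded version failed:", exc)

# With the guard, short groups yield NaN instead of raising.
guarded = grouped.apply(lambda s: s.iloc[1] if len(s) > 1 else np.nan)
print(guarded)
```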
In this way we can extract the record at any position within each group, as long as the length check is adjusted to the position requested.
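The check len(i) > 1 is only sufficient for positions 1 and -2. One way to generalize it is a small helper that validates any integer position against the group length. The name take_position and the sample data are hypothetical, introduced here for illustration only.

```python
import numpy as np
import pandas as pd


def take_position(series, pos):
    """Return the value at integer position pos, or NaN if the group is too short.

    Hypothetical helper, not part of pandas.
    """
    n = len(series)
    # A group of length n supports positions -n through n - 1.
    if -n <= pos < n:
        return series.iloc[pos]
    return np.nan


# Hypothetical sample data; actionTime is simplified to integers.
action = pd.DataFrame({
    "userid": [1, 1, 1, 2, 2, 3],
    "actionType": [1, 2, 2, 1, 2, 2],
    "actionTime": [10, 20, 30, 15, 25, 40],
})
grouped = action.groupby("userid")["actionTime"]

second = grouped.apply(lambda s: take_position(s, 1))
third_from_last = grouped.apply(lambda s: take_position(s, -3))

print(second)
print(third_from_last)
```

Putting the bounds check inside the helper means the condition is always derived from the requested position, rather than being hard-coded per call.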