Iterating Over Rows in a Pandas DataFrame
Iterating over rows in a Pandas DataFrame allows you to access individual rows and their elements. To achieve this, Pandas provides two commonly used methods: DataFrame.iterrows() and DataFrame.T.items() (spelled iteritems() before pandas 2.0, where that name was removed).
DataFrame.iterrows:
DataFrame.iterrows is a generator that yields both the row's index and the row itself represented as a Pandas Series. The following code snippet demonstrates its usage:
import pandas as pd

df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})

for index, row in df.iterrows():
    print(row['c1'], row['c2'])
This will output:
10 100
11 110
12 120
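To make the (index, row) pairs more concrete, the short sketch below unpacks the first pair yielded by iterrows() and inspects it. It reuses the df from the example above; the variable names first_index and first_row are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})

# iterrows() is a generator, so next() fetches only the first (index, row) pair
first_index, first_row = next(df.iterrows())

print(first_index)               # index label of the first row
print(type(first_row).__name__)  # each row arrives as a Series
print(first_row.to_dict())       # the Series is keyed by column name
```

Because each row is a Series keyed by column name, row['c1'] in the loop above is an ordinary label lookup on that Series.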
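DataFrame.T.items() (formerly DataFrame.T.iteritems()):
DataFrame.items() iterates over the columns of a DataFrame, yielding each column's label and its contents as a Series. Because .T transposes the DataFrame, the columns of df.T are the rows of df, so df.T.items() yields each row's index label together with the row as a Series. The older spelling iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0. Note that this approach is generally less efficient than iterrows, since it first builds a transposed DataFrame:
for row_index, row in df.T.items():
    print(row_index, row['c1'], row['c2'])
This will output:
0 10 100
1 11 110
2 12 120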
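As a quick check on how the transposed approach relates to iterrows, the sketch below collects the pairs from both and compares them. It uses items(), since iteritems() is no longer available in pandas 2.0 and later; the names via_iterrows and via_transpose are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})

# (row label, row contents) pairs from iterrows()
via_iterrows = [(label, row.to_dict()) for label, row in df.iterrows()]

# The same pairs obtained by iterating over the columns of the transpose
via_transpose = [(label, row.to_dict()) for label, row in df.T.items()]

print(via_iterrows == via_transpose)
```

For this small all-integer DataFrame, both approaches visit the same rows in the same order; the first element of each pair is the row's index label, not a column name.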
Performance Considerations:
Iterating over pandas objects row by row is generally slower than vectorized operations or function application using the apply() method. If performance is crucial and the logic cannot be vectorized, consider using Cython or Numba to speed up the iterative code.
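The sketch below computes the same per-row sum three ways (an iterrows loop, apply() with axis=1, and a vectorized column expression) to show what each alternative looks like. It makes no timing claims; the variable names are illustrative, and the sample DataFrame is the one used earlier.

```python
import pandas as pd

df = pd.DataFrame({'c1': [10, 11, 12], 'c2': [100, 110, 120]})

# Explicit Python loop over rows
loop_total = [row['c1'] + row['c2'] for _, row in df.iterrows()]

# Function application: the lambda is called once per row
apply_total = df.apply(lambda row: row['c1'] + row['c2'], axis=1)

# Vectorized: whole columns are added in a single operation
vector_total = df['c1'] + df['c2']

print(loop_total)
print(apply_total.tolist())
print(vector_total.tolist())
```

All three produce the same values; the vectorized form is usually the one to reach for first, because it avoids creating a Series object for every row.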