How to Efficiently Create a Pandas DataFrame with Sequential Rows?
Creating a Pandas DataFrame with Sequential Rows
In data analysis, it is often necessary to create a Pandas DataFrame and add rows to it one at a time. Several methods are available, each with its own trade-offs.
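One approach is to create an empty DataFrame by passing the desired column names to the columns parameter of the pd.DataFrame() constructor, then add rows by setting individual field values one at a time with the _set_value() method. Note that _set_value() is a private pandas method (the public set_value() was removed in pandas 1.0); current code normally uses the df.at[row, column] accessor for the same purpose. Either way, this approach is inefficient when each row has several fields, because every field needs its own call.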
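As a rough illustration of the field-by-field approach described above, the sketch below uses the public df.at accessor instead of the private _set_value() method. The sample values in rows are made up for the example, and the column names are borrowed from the snippet later in this article.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
rows = [("name0", 3, 3), ("name1", 2, 4), ("name2", 2, 8)]

# Start from an empty DataFrame that only defines the column names.
df = pd.DataFrame(columns=["lib", "qty1", "qty2"])

for i, (lib, qty1, qty2) in enumerate(rows):
    # One assignment per field: the first creates row i,
    # the next two fill in its remaining cells.
    df.at[i, "lib"] = lib
    df.at[i, "qty1"] = qty1
    df.at[i, "qty2"] = qty2

print(df)
```

Three assignments are needed for every row here, which is the overhead this approach suffers from.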
A more efficient solution is to use the df.loc[i] syntax, where i is the row index. Assigning a list of values to df.loc[i] populates the entire row at index i in one step, creating the row if it does not yet exist. This is faster than setting each field separately, since it avoids a separate call for every field. Each assignment to a new index still enlarges the DataFrame, though, so the cost of growing it row by row remains for large datasets.
To demonstrate this method, consider the following code snippet:
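import numpy as np
import pandas as pd

# Empty DataFrame that only defines the column names
df = pd.DataFrame(columns=['lib', 'qty1', 'qty2'])

for i in range(5):
    # Fill row i in one step: a name plus two random integers from 0 to 9
    df.loc[i] = ['name' + str(i)] + list(np.random.randint(10, size=2))

print(df)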
This code creates an empty DataFrame with three columns: 'lib', 'qty1', and 'qty2'. It then adds five rows, each with 'name' followed by the row index in the 'lib' column and a random integer from 0 to 9 in each of the other two columns. Because the values are random, the output differs between runs; one possible result is:
     lib qty1 qty2
0  name0    3    3
1  name1    2    4
2  name2    2    8
3  name3    2    1
4  name4    9    6
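When all rows can be collected before the DataFrame is needed, a commonly used alternative to the row-by-row df.loc[i] technique is to gather the rows in a plain Python list and build the DataFrame once at the end. The sketch below is one way that might look; the seeded generator and the rows variable are assumptions made for the example, not part of the snippet above.

```python
import numpy as np
import pandas as pd

# Seeded generator: an assumption, used only to make the sketch repeatable.
rng = np.random.default_rng(0)

# Collect each row as a dict in an ordinary list instead of growing a DataFrame.
rows = []
for i in range(5):
    qty1, qty2 = rng.integers(10, size=2)
    rows.append({"lib": f"name{i}", "qty1": int(qty1), "qty2": int(qty2)})

# Build the DataFrame in a single step once all rows are known.
df = pd.DataFrame(rows, columns=["lib", "qty1", "qty2"])
print(df)
```

Appending to a list is cheap, and the DataFrame is allocated only once, so this variant tends to scale better than enlarging the DataFrame inside the loop. It also leaves the qty1 and qty2 columns with an integer dtype.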