How Can I Efficiently Combine Multiple CSV Files into a Single Pandas DataFrame?

Barbara Streisand
2024-12-18 08:45:10

Reading and Combining Multiple CSV Files into a Single DataFrame

Problem Scenario

The task is to read multiple CSV files from a directory into pandas and combine them into one DataFrame.

pandas Methodologies

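pandas provides a single function for stacking multiple DataFrames:

  • pd.concat(dfs, ignore_index=True): Concatenates the DataFrames in the list dfs row-wise (vertically). With ignore_index=True, the original row labels are discarded and the result is renumbered from 0 to n-1.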
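To make the effect of ignore_index concrete, here is a minimal sketch. The two small frames and their column names (id, value) are made up for illustration and stand in for data loaded from separate CSV files.

```python
import pandas as pd

# Two small frames standing in for data loaded from separate CSV files.
first = pd.DataFrame({"id": [1, 2], "value": [10.0, 20.0]})
second = pd.DataFrame({"id": [3, 4], "value": [30.0, 40.0]})

# Default behaviour: each frame keeps its own row labels, so labels repeat.
kept = pd.concat([first, second])

# ignore_index=True: the old labels are dropped and rows are renumbered.
renumbered = pd.concat([first, second], ignore_index=True)

print(kept.index.tolist())        # [0, 1, 0, 1]
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Repeated labels are usually unwanted when the per-file row numbers carry no meaning, which is why ignore_index=True is a common choice when combining files.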

Implementation

Each CSV file is first read into its own DataFrame with pd.read_csv, and the results are collected in a list. pd.concat then combines the list into a single DataFrame in one call, which avoids the repeated copying that comes from appending to a growing DataFrame inside the loop.

Code Snippet:

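import glob
import os

import pandas as pd

# Directory that holds the CSV files (the raw string keeps backslashes literal).
path = r'C:\DRO\DCL_rawdata_files'
# sorted() fixes the file order, since glob returns matches in arbitrary order.
filenames = sorted(glob.glob(os.path.join(path, "*.csv")))

# Read each file into its own DataFrame; the first row is used as the header.
dfs = [pd.read_csv(filename, header=0) for filename in filenames]

# Stack the frames row-wise and renumber the index from 0.
# Note: pd.concat raises ValueError if no files matched.
big_frame = pd.concat(dfs, ignore_index=True)

print(big_frame)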

Additional Notes

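  • The CSV files do not need identical columns: by default pd.concat takes the union of all columns and fills the missing values with NaN. If the files are meant to share one schema, check the columns before combining so that a misnamed header does not silently produce extra, mostly empty columns.
  • Consider adding a column that records the source file of each row, for traceability.
  • pathlib can replace glob and string concatenation for building and matching file paths.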
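The notes on source tracking and pathlib can be sketched together as follows. This is one possible approach, not the only one: the function name combine_csvs, the column name source_file, and the two throwaway demo files are all assumptions introduced for this example.

```python
import tempfile
from pathlib import Path

import pandas as pd


def combine_csvs(directory):
    """Read every *.csv in `directory` and stack them, tagging rows by file."""
    frames = []
    # sorted() makes the file order, and so the row order, deterministic.
    for csv_path in sorted(Path(directory).glob("*.csv")):
        frame = pd.read_csv(csv_path)
        # "source_file" is a hypothetical column name chosen for this sketch.
        frame["source_file"] = csv_path.name
        frames.append(frame)
    if not frames:
        # pd.concat raises ValueError on an empty list; fail more clearly.
        raise FileNotFoundError(f"no CSV files found in {directory}")
    return pd.concat(frames, ignore_index=True)


# Demo on two small temporary files with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("id,value\n1,10\n2,20\n")
    Path(tmp, "b.csv").write_text("id,value\n3,30\n")
    combined = combine_csvs(tmp)

print(combined)
```

Storing only csv_path.name keeps the column short. If files in different folders can share a name, storing str(csv_path) instead would keep the rows distinguishable.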
