Python: read a CSV file, remove a column, and write a new file
This article shares an example of reading a CSV file in Python, removing a column, and writing the result to a new file. Two methods are used to solve the problem, both of which are existing solutions found on the Internet.
Scenario description:
There is a data file saved as text with three columns: user_id, plan_id, and mobile_id. The goal is to produce a new file that contains only mobile_id and plan_id.
Solution
Option 1: Use Python's built-in file reading and writing. Read the data line by line, process each line in a for loop, and write the result to a new file.
The code is as follows:
    def readwrite1(input_file, output_file):
        # Assumes the column order user_id, plan_id, mobile_id from the scenario above.
        with open(input_file, 'r') as f, open(output_file, 'w') as out:
            for line in f:
                a = line.rstrip("\n").split(",")
                # Keep mobile_id (index 2) and plan_id (index 1).
                out.write(a[2] + "," + a[1] + "\n")
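Splitting on commas by hand breaks if a field contains a quoted comma. As a hedged alternative that is not part of the original article, the sketch below uses the standard-library csv module and selects columns by name. It assumes the file has a header row with the names from the scenario; the function name `readwrite_csv` and the `keep` parameter are hypothetical.

```python
import csv

def readwrite_csv(input_file, output_file, keep=("mobile_id", "plan_id")):
    """Copy only the columns named in `keep` from input_file to output_file."""
    with open(input_file, newline="") as src, open(output_file, "w", newline="") as dst:
        # DictReader uses the first line of the file as the column names.
        reader = csv.DictReader(src)
        # extrasaction="ignore" drops every column that is not listed in `keep`.
        writer = csv.DictWriter(dst, fieldnames=list(keep), extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
```

Because the columns are selected by name, this version does not depend on the order of the columns in the input file.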
Option 2: Read the data with pandas into a DataFrame, select the required columns, and write them to the new file with the DataFrame's own write function.
The code is as follows:
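    import pandas as pd

    def readwrite2(input_file, output_file):
        # header=0 treats the first line of the file as the column names.
        data = pd.read_csv(input_file, header=0, sep=',')
        # Column names follow the scenario description above.
        data[['mobile_id', 'plan_id']].to_csv(output_file, sep=',', header=True, index=False)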
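When only a few columns are needed, pandas can also skip the unwanted ones while parsing. The sketch below is a possible variation, not the article's method: it passes the `usecols` parameter of `pd.read_csv` so that user_id is never loaded. The function name `readwrite_usecols` and the `keep` parameter are hypothetical, and the column names are assumed to match the scenario.

```python
import pandas as pd

def readwrite_usecols(input_file, output_file, keep=("mobile_id", "plan_id")):
    # usecols tells the parser to load only the listed columns.
    data = pd.read_csv(input_file, usecols=list(keep))
    # Index with the same list so the output follows the order in `keep`,
    # not the order of the columns in the input file.
    data[list(keep)].to_csv(output_file, index=False)
```

Whether this is measurably faster than reading every column depends on the file, so it is worth timing on real data before relying on it.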
Judging from the code, the pandas logic is clearer.
Let’s take a look at the execution efficiency!
    import time

    def getRunTimes(fun, input_file, output_file):
        # Wall-clock time before and after the call, in milliseconds.
        begin_time = int(round(time.time() * 1000))
        fun(input_file, output_file)
        end_time = int(round(time.time() * 1000))
        print("Read and write running time:", (end_time - begin_time), "ms")

    getRunTimes(readwrite1, input_file, output_file)   # plain line-by-line processing
    getRunTimes(readwrite2, input_file, output_file1)  # read and write via DataFrame
Read and write running time: 976 ms
Read and write running time: 777 ms
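A single run can be skewed by disk caching or other processes, so one measurement per function is only a rough indication. As a minimal sketch under that assumption, the hypothetical helper `best_time_ms` below repeats the call several times with `time.perf_counter` and reports the fastest run; it is not part of the original article.

```python
import time

def best_time_ms(fun, input_file, output_file, repeats=3):
    """Run fun(input_file, output_file) `repeats` times; return the fastest run in ms."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        fun(input_file, output_file)
        timings.append((time.perf_counter() - start) * 1000)
    return min(timings)
```

It could be called as `best_time_ms(readwrite1, input_file, output_file)` in place of `getRunTimes`, with the result printed by the caller.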
input_file contains about 270,000 records. At this size the DataFrame approach is faster than the for loop. Does the gap become more pronounced as the data grows?
Next, try increasing the number of records in input_file. The results are as follows:
input_file records | readwrite1 (ms) | readwrite2 (ms) |
270,000 | 976 | 777 |
(not given) | 1989 | 1509 |
(not given) | 4312 | 3158 |
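To repeat the comparison at other file sizes, a test file with a chosen number of records is needed. The sketch below shows one way to generate such a file; the helper name `make_test_csv` and the value ranges are made up for illustration and do not reflect the data used for the timings above.

```python
import csv
import random

def make_test_csv(path, n_rows, seed=0):
    """Write a header plus n_rows of synthetic user_id, plan_id, mobile_id records."""
    rng = random.Random(seed)  # fixed seed so repeated runs produce the same file
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["user_id", "plan_id", "mobile_id"])
        for i in range(n_rows):
            # Value ranges are arbitrary placeholders, not real data.
            writer.writerow([i, rng.randint(1, 50), rng.randint(10**10, 10**11 - 1)])
```

For example, `make_test_csv("test_270k.csv", 270_000)` would produce a file of roughly the size mentioned in the article.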