This article shows how to use pandas to read only selected columns of a CSV file.
Following a tutorial, I learned how to read only the first few rows of a CSV file, and I immediately wondered whether the same could be done for the first few columns. After several attempts, I found a way.
I want only the first few columns because one of my CSV files has trailing columns that never contain data but are always present. The raw data looks like this:
GreydeMac-mini:chapter06 greyzhang$ cat data.csv
1,name_01,coment_01,,,,
2,name_02,coment_02,,,,
3,name_03,coment_03,,,,
4,name_04,coment_04,,,,
5,name_05,coment_05,,,,
6,name_06,coment_06,,,,
7,name_07,coment_07,,,,
8,name_08,coment_08,,,,
9,name_09,coment_09,,,,
10,name_10,coment_10,,,,
11,name_11,coment_11,,,,
12,name_12,coment_12,,,,
13,name_13,coment_13,,,,
14,name_14,coment_14,,,,
15,name_15,coment_15,,,,
16,name_16,coment_16,,,,
17,name_17,coment_17,,,,
18,name_18,coment_18,,,,
19,name_19,coment_19,,,,
20,name_20,coment_20,,,,
21,name_21,coment_21,,,,
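Reading the whole file with pandas and printing the result gives the output below. Because no header option is passed, pandas uses the first row as the column names and labels the empty trailing columns Unnamed: 3 through Unnamed: 6: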
In [41]: data = pd.read_csv('data.csv')
In [42]: data
Out[42]:
     1  name_01  coment_01  Unnamed: 3  Unnamed: 4  Unnamed: 5  Unnamed: 6
0    2  name_02  coment_02         NaN         NaN         NaN         NaN
1    3  name_03  coment_03         NaN         NaN         NaN         NaN
2    4  name_04  coment_04         NaN         NaN         NaN         NaN
3    5  name_05  coment_05         NaN         NaN         NaN         NaN
4    6  name_06  coment_06         NaN         NaN         NaN         NaN
5    7  name_07  coment_07         NaN         NaN         NaN         NaN
6    8  name_08  coment_08         NaN         NaN         NaN         NaN
7    9  name_09  coment_09         NaN         NaN         NaN         NaN
8   10  name_10  coment_10         NaN         NaN         NaN         NaN
9   11  name_11  coment_11         NaN         NaN         NaN         NaN
10  12  name_12  coment_12         NaN         NaN         NaN         NaN
11  13  name_13  coment_13         NaN         NaN         NaN         NaN
12  14  name_14  coment_14         NaN         NaN         NaN         NaN
13  15  name_15  coment_15         NaN         NaN         NaN         NaN
14  16  name_16  coment_16         NaN         NaN         NaN         NaN
15  17  name_17  coment_17         NaN         NaN         NaN         NaN
16  18  name_18  coment_18         NaN         NaN         NaN         NaN
17  19  name_19  coment_19         NaN         NaN         NaN         NaN
18  20  name_20  coment_20         NaN         NaN         NaN         NaN
19  21  name_21  coment_21         NaN         NaN         NaN         NaN
This does not really get in the way of learning, but after a long time in a command-line terminal I prefer cleaner output. The usecols parameter of read_csv helps here, because it restricts which columns are read.
In [45]: data = pd.read_csv('data.csv',usecols=[0,1,2,3])
In [46]: data
Out[46]:
     1  name_01  coment_01  Unnamed: 3
0    2  name_02  coment_02         NaN
1    3  name_03  coment_03         NaN
2    4  name_04  coment_04         NaN
3    5  name_05  coment_05         NaN
4    6  name_06  coment_06         NaN
5    7  name_07  coment_07         NaN
6    8  name_08  coment_08         NaN
7    9  name_09  coment_09         NaN
8   10  name_10  coment_10         NaN
9   11  name_11  coment_11         NaN
10  12  name_12  coment_12         NaN
11  13  name_13  coment_13         NaN
12  14  name_14  coment_14         NaN
13  15  name_15  coment_15         NaN
14  16  name_16  coment_16         NaN
15  17  name_17  coment_17         NaN
16  18  name_18  coment_18         NaN
17  19  name_19  coment_19         NaN
18  20  name_20  coment_20         NaN
19  21  name_21  coment_21         NaN
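The positions passed to usecols are zero-based, which is easy to check on a small input. The following is only a sketch: it replaces data.csv with a three-row in-memory sample (an assumption, so the snippet runs on its own) and passes header=None, which the session in this article does not do, so that the first row stays in the data instead of becoming the column names.

```python
import io

import pandas as pd

# A three-row stand-in for data.csv: 3 real fields followed by 4 empty ones.
SAMPLE = (
    "1,name_01,coment_01,,,,\n"
    "2,name_02,coment_02,,,,\n"
    "3,name_03,coment_03,,,,\n"
)

# header=None keeps the first row as data instead of using it as column names.
full = pd.read_csv(io.StringIO(SAMPLE), header=None)

# usecols takes zero-based positions, so [0, 1, 2] keeps the first three columns.
trimmed = pd.read_csv(io.StringIO(SAMPLE), header=None, usecols=[0, 1, 2])

print(full.shape)     # (3, 7)
print(trimmed.shape)  # (3, 3)
```

With header=None the columns are labelled with their integer positions, so the labels match the numbers given to usecols.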
To make the "boundary" of the data visible, the In [45] call also reads column 3, the first empty column. In normal use we would not want that last column, and dropping it only requires removing its index from usecols:
In [47]: data = pd.read_csv('data.csv',usecols=[0,1,2])
In [48]: data
Out[48]:
     1  name_01  coment_01
0    2  name_02  coment_02
1    3  name_03  coment_03
2    4  name_04  coment_04
3    5  name_05  coment_05
4    6  name_06  coment_06
5    7  name_07  coment_07
6    8  name_08  coment_08
7    9  name_09  coment_09
8   10  name_10  coment_10
9   11  name_11  coment_11
10  12  name_12  coment_12
11  13  name_13  coment_13
12  14  name_14  coment_14
13  15  name_15  coment_15
14  16  name_16  coment_16
15  17  name_17  coment_17
16  18  name_18  coment_18
17  19  name_19  coment_19
18  20  name_20  coment_20
19  21  name_21  coment_21
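When the number of empty trailing columns is not known in advance, one possible alternative (not the method used in this article) is to read everything and then drop the columns that contain no values at all. The sketch below assumes a small in-memory sample in place of data.csv, and read_without_empty_columns is a hypothetical helper name introduced only for illustration.

```python
import io

import pandas as pd

# A three-row stand-in for data.csv: 3 real fields followed by 4 empty ones.
SAMPLE = (
    "1,name_01,coment_01,,,,\n"
    "2,name_02,coment_02,,,,\n"
    "3,name_03,coment_03,,,,\n"
)


def read_without_empty_columns(source):
    """Hypothetical helper: read a headerless CSV and drop all-empty columns."""
    frame = pd.read_csv(source, header=None)
    # axis=1 targets columns; how="all" drops a column only if every value is NaN.
    return frame.dropna(axis=1, how="all")


cleaned = read_without_empty_columns(io.StringIO(SAMPLE))
print(cleaned.shape)  # (3, 3)
```

The trade-off is that this parses every column before discarding the empty ones, and it would also drop a meaningful column that happens to be entirely empty, whereas usecols states exactly which columns to keep.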