Home  >  Article  >  Backend Development  >  Reading csv files using pandas

Reading csv files using pandas

不言
不言Original
2018-04-21 14:46:346697browse

The following is an article about how to use pandas to read csv files by specifying columns. It has a good reference value and I hope it will be helpful to everyone. Let’s take a look together

According to the tutorial, I realized reading the first few rows of data in the csv file. I immediately thought about whether it was possible to realize the data in the first few columns. After many attempts, I finally found a method.

The reason why I want to read the first few columns is because a csv file I have happens to have no available data in the next few columns, but it always exists. The original data is as follows:

GreydeMac-mini:chapter06 greyzhang$ cat data.csv

1,name_01,coment_01,,,,
2,name_02,coment_02,,,,
3,name_03,coment_03,,,,
4,name_04,coment_04,,,,
5,name_05,coment_05,,,,
6,name_06,coment_06,,,,
7,name_07,coment_07,,,,
8,name_08,coment_08,,,,
9,name_09,coment_09,,,,
10,name_10,coment_10,,,,
11,name_11,coment_11,,,,
12,name_12,coment_12,,,,
13,name_13,coment_13,,,,
14,name_14,coment_14,,,,
15,name_15,coment_15,,,,
16,name_16,coment_16,,,,
17,name_17,coment_17,,,,
18,name_18,coment_18,,,,
19,name_19,coment_19,,,,
20,name_20,coment_20,,,,
21,name_21,coment_21,,,,

If you use pandas to read all the data, the following results will appear when printing:

In [41]: data = pd.read_csv('data.csv')

In [42]: data
Out[42]: 
  1 name_01 coment_01 Unnamed: 3 Unnamed: 4 Unnamed: 5 Unnamed: 6
0 2 name_02 coment_02   NaN   NaN   NaN   NaN
1 3 name_03 coment_03   NaN   NaN   NaN   NaN
2 4 name_04 coment_04   NaN   NaN   NaN   NaN
3 5 name_05 coment_05   NaN   NaN   NaN   NaN
4 6 name_06 coment_06   NaN   NaN   NaN   NaN
5 7 name_07 coment_07   NaN   NaN   NaN   NaN
6 8 name_08 coment_08   NaN   NaN   NaN   NaN
7 9 name_09 coment_09   NaN   NaN   NaN   NaN
8 10 name_10 coment_10   NaN   NaN   NaN   NaN
9 11 name_11 coment_11   NaN   NaN   NaN   NaN
10 12 name_12 coment_12   NaN   NaN   NaN   NaN
11 13 name_13 coment_13   NaN   NaN   NaN   NaN
12 14 name_14 coment_14   NaN   NaN   NaN   NaN
13 15 name_15 coment_15   NaN   NaN   NaN   NaN
14 16 name_16 coment_16   NaN   NaN   NaN   NaN
15 17 name_17 coment_17   NaN   NaN   NaN   NaN
16 18 name_18 coment_18   NaN   NaN   NaN   NaN
17 19 name_19 coment_19   NaN   NaN   NaN   NaN
18 20 name_20 coment_20   NaN   NaN   NaN   NaN
19 21 name_21 coment_21   NaN   NaN   NaN   NaN

said that this will not bring any obstacles to me during the learning process, but staying in the command line terminal interface After a long time, I always like a slightly refreshing style. Using the read_csv parameter usecols can reduce this confusion to a certain extent.

In [45]: data = pd.read_csv('data.csv',usecols=[0,1,2,3])

In [46]: data
Out[46]: 
  1 name_01 coment_01 Unnamed: 3
0 2 name_02 coment_02   NaN
1 3 name_03 coment_03   NaN
2 4 name_04 coment_04   NaN
3 5 name_05 coment_05   NaN
4 6 name_06 coment_06   NaN
5 7 name_07 coment_07   NaN
6 8 name_08 coment_08   NaN
7 9 name_09 coment_09   NaN
8 10 name_10 coment_10   NaN
9 11 name_11 coment_11   NaN
10 12 name_12 coment_12   NaN
11 13 name_13 coment_13   NaN
12 14 name_14 coment_14   NaN
13 15 name_15 coment_15   NaN
14 16 name_16 coment_16   NaN
15 17 name_17 coment_17   NaN
16 18 name_18 coment_18   NaN
17 19 name_19 coment_19   NaN
18 20 name_20 coment_20   NaN
19 21 name_21 coment_21   NaN

In order to be able to see the "boundary" of the data, the first column of invalid data is displayed when reading. In normal use, maybe we want to remove the information of the last column in the above result. Then we only need to remove the column number of the last column in the parameters.

In [47]: data = pd.read_csv('data.csv',usecols=[0,1,2])

In [48]: data
Out[48]: 
  1 name_01 coment_01
0 2 name_02 coment_02
1 3 name_03 coment_03
2 4 name_04 coment_04
3 5 name_05 coment_05
4 6 name_06 coment_06
5 7 name_07 coment_07
6 8 name_08 coment_08
7 9 name_09 coment_09
8 10 name_10 coment_10
9 11 name_11 coment_11
10 12 name_12 coment_12
11 13 name_13 coment_13
12 14 name_14 coment_14
13 15 name_15 coment_15
14 16 name_16 coment_16
15 17 name_17 coment_17
16 18 name_18 coment_18
17 19 name_19 coment_19
18 20 name_20 coment_20
19 21 name_21 coment_21

Related recommendations:

Use pandas to read the first few lines specified in the csv file

The above is the detailed content of Reading csv files using pandas. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn