
Let's talk about python file data analysis, management and extraction

WBOY | Forwarded: 2022-08-25 11:46:46 | 1878 views

This article introduces file data analysis, management, and extraction in Python by walking through a small example in detail. It may be a useful reference for readers who need to do something similar.


Prerequisite summary

Python 2 could not read files under a path containing Chinese characters directly, so a separate helper function had to be written. When I tried in 2018, Python 3 could not read such paths directly either.

Using it now, I find that Python 3 can read Chinese paths directly.

You need to bring or create several txt files. It is best to put some data in them (name, mobile phone number, address).
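If you have no test files at hand, the preparation step could be sketched as follows. The file names and records are made-up placeholders, and a temporary directory is used only so the sketch cleans up after itself; the directory name 课堂实训 matches the one used in the code later in this article.

```python
import tempfile
from pathlib import Path

# Hypothetical sample records: name, mobile phone number, address.
SAMPLE_RECORDS = {
    "class_a.txt": [
        "Zhang San 13800000001 Beijing",
        "Li Si 13900000002 Shanghai",
    ],
    "class_b.txt": [
        "Wang Wu 13700000003 Guangzhou",
        "Li Si 13900000002 Shanghai",  # deliberate duplicate for the dedup step
    ],
}


def make_sample_files(base_dir):
    """Create a directory with a Chinese name and write the sample files into it."""
    target = Path(base_dir) / "课堂实训"
    target.mkdir(parents=True, exist_ok=True)
    for filename, lines in SAMPLE_RECORDS.items():
        (target / filename).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


with tempfile.TemporaryDirectory() as tmp:
    sample_dir = make_sample_files(tmp)
    # Python 3 opens the non-ASCII path without any extra helper function.
    first_line = (sample_dir / "class_a.txt").read_text(encoding="utf-8").splitlines()[0]
    print(first_line)
```

To keep the files for the rest of the walkthrough, call make_sample_files(".") instead of using the temporary directory.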

Requirements

Before writing code, it is best to set a few requirements for yourself and clarify the goals:

Read all matching files under the given directory path
Read the records of each txt file line by line
Use a regular expression to extract the mobile phone number from each line
Save the mobile phone numbers to Excel

Ideas

1) Read the file
2) Read the data
3) Data sorting
4) Regular expression matching
5) Data deduplication
6) Data export and save

Code

import glob
import re
import xlwt

data = []
phone = []

# Collect the paths of all txt files in the target directory
filelocation = glob.glob(r'课堂实训/*.txt')
print(filelocation)
for i in range(len(filelocation)):
    # "with" closes each file after reading; pass encoding=... if the
    # files are not in the platform's default encoding
    with open(filelocation[i]) as file:
        file_data = file.readlines()
    data.append(file_data)
print(data)
# Merge the per-file lists of lines into one flat list
combine_data = sum(data, [])

print(combine_data)
for a in combine_data:
    data1 = re.search(r'[0-9]{11}', a)
    if data1:  # skip lines that contain no 11-digit number
        phone.append(data1[0])
phone = list(set(phone))  # deduplicate; order is not preserved
print(phone)
print(len(phone))

# Save to Excel
f = xlwt.Workbook(encoding='utf-8')
sheet1 = f.add_sheet('sheet1', cell_overwrite_ok=True)
for i in range(len(phone)):
    sheet1.write(i, 0, phone[i])
f.save('phonenumber.xls')

Running result

Running the script generates an Excel file, phonenumber.xls.

Analysis

import glob
import re
import xlwt

glob is used to locate the files, re provides regular expressions, and xlwt is used to write the Excel file.

1) Read files

filelocation = glob.glob(r'课堂实训/*.txt')

This collects the paths of all txt files in the specified directory.
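As a small illustration of what glob.glob returns, the sketch below builds a throwaway directory with placeholder files and matches only the txt ones. Sorting the result is an addition not present in the original code; glob does not promise any particular order, so sorting makes runs reproducible.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Placeholder files: two txt files and one file that should not match.
    for name in ("b.txt", "a.txt", "notes.csv"):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as fh:
            fh.write("placeholder\n")

    pattern = os.path.join(tmp, "*.txt")
    matches = sorted(glob.glob(pattern))  # full paths, sorted for a stable order
    names = [os.path.basename(p) for p in matches]
    print(names)
```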

2) Read data

for i in range(len(filelocation)):
    # "with" closes each file once its lines have been read
    with open(filelocation[i]) as file:
        file_data = file.readlines()
    data.append(file_data)
print(data)

Loop over the matched txt file paths, reading the files one after another
Open the file for the current iteration
Read that txt file's data line by line
Use the append() method to add each file's list of lines to data
Printing data shows that the contents of all the txt files are held in one list, with one inner list of strings per file
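The nested structure described above can be seen in a minimal sketch like the one below. The helper name read_all_lines, the sample file contents, and the explicit utf-8 encoding are assumptions made for illustration; the original code relies on the platform's default encoding.

```python
import os
import tempfile


def read_all_lines(paths, encoding="utf-8"):
    """Return one list of lines per file, in the same order as paths."""
    data = []
    for path in paths:
        # The with-statement closes the file even if reading fails.
        with open(path, encoding=encoding) as fh:
            data.append(fh.readlines())
    return data


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in (("a.txt", "line 1\nline 2\n"), ("b.txt", "line 3\n")):
        path = os.path.join(tmp, name)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(text)
        paths.append(path)

    nested = read_all_lines(paths)
    print(nested)
```

Each inner list still carries the trailing newline of every line, which is harmless for the regular expression step that follows.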

3) Data sorting

combine_data=sum(data,[])

sum(data, []) concatenates the per-file lists into a single flat list of lines.
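To make the merge step concrete, here is a short sketch on made-up data. It also shows itertools.chain.from_iterable, an alternative that is not used in the original code but gives the same result and avoids building an intermediate list for every addition when there are many files.

```python
from itertools import chain

# Made-up nested data: one inner list of lines per file.
data = [["a\n", "b\n"], ["c\n"]]

# The approach used in the article: start from [] and add each inner list.
merged_with_sum = sum(data, [])

# Alternative: chain the inner lists together lazily, then build one list.
merged_with_chain = list(chain.from_iterable(data))

print(merged_with_sum)
print(merged_with_chain)
```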

4) and 5) Regular expression matching and data deduplication

print(combine_data)
for a in combine_data:
    data1 = re.search(r'[0-9]{11}', a)
    if data1:  # re.search returns None when a line has no 11-digit number
        phone.append(data1[0])
phone = list(set(phone))
print(phone)
print(len(phone))

set() function: removes duplicates by building an unordered collection of unique elements, so the original order of the phone numbers is not preserved.
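If the order of first appearance matters, one possible variation is sketched below. It is not the article's method: it swaps set() for dict.fromkeys(), which keeps insertion order, skips lines where re.search finds nothing, and uses lookarounds so that longer digit runs are not mistaken for phone numbers. The function name and sample lines are hypothetical.

```python
import re

# Exactly 11 ASCII digits, not embedded in a longer run of digits.
PHONE_PATTERN = re.compile(r"(?<![0-9])[0-9]{11}(?![0-9])")


def extract_phones(lines):
    """Return the first 11-digit number of each line, deduplicated in order."""
    found = []
    for line in lines:
        match = PHONE_PATTERN.search(line)
        if match:  # lines without a match are skipped
            found.append(match.group(0))
    # dict keys are unique and keep insertion order
    return list(dict.fromkeys(found))


sample_lines = [
    "Zhang San 13800000001 Beijing\n",
    "no number here\n",
    "Li Si 13900000002 Shanghai\n",
    "Zhang San 13800000001 Beijing\n",
    "order id 123456789012345\n",
]
phones = extract_phones(sample_lines)
print(phones)
```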

6) Data export and save

# Save to Excel
f = xlwt.Workbook(encoding='utf-8')
sheet1 = f.add_sheet('sheet1', cell_overwrite_ok=True)
for i in range(len(phone)):
    sheet1.write(i, 0, phone[i])
f.save('phonenumber.xls')

Workbook(encoding='utf-8'): Creates the workbook and sets its encoding
add_sheet('sheet1', cell_overwrite_ok=True): Creates the worksheet and allows cells to be overwritten
write(x, y, z): The parameters are the row index, the column index, and the value to write
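xlwt is a third-party package and only writes the older .xls format. Where it is not installed, a rough alternative is to export with the standard-library csv module, which spreadsheet programs can also open. This is a different technique from the article's, shown only as a sketch; the function name, header text, and output file name are assumptions.

```python
import csv
import os
import tempfile


def save_phones_csv(phones, path):
    """Write one phone number per row under a single header row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["phone"])
        for number in phones:
            writer.writerow([number])


with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "phonenumber.csv")
    save_phones_csv(["13800000001", "13900000002"], out_path)

    # Read the file back to check what was written.
    with open(out_path, newline="", encoding="utf-8") as fh:
        rows = list(csv.reader(fh))
    print(rows)
```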


Statement: This article is reproduced from jb51.net. If there is any infringement, please contact admin@php.cn to have it removed.
