Let’s talk about Python file data analysis, management and extraction
Python 2 cannot read files from a path containing Chinese characters directly; you have to write a separate helper function. When I tried in 2018, Python 3 could not read such paths directly either.
Using it now, I find that Python 3 can read Chinese paths directly.
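As a quick check of the claim above, the following sketch writes a file inside a folder with a Chinese name and reads it back. It assumes Python 3 on a filesystem that accepts Unicode names; the helper `roundtrip_chinese_path` and the file name `联系人.txt` are hypothetical and only serve the illustration.

```python
import tempfile
from pathlib import Path


def roundtrip_chinese_path(text: str) -> str:
    """Write text to a file under a Chinese-named folder, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        # "课堂实训" is the folder name used later in this article.
        folder = Path(tmp) / "课堂实训"
        folder.mkdir()
        target = folder / "联系人.txt"  # hypothetical file name
        # An explicit encoding keeps the file content independent of the OS default.
        target.write_text(text, encoding="utf-8")
        return target.read_text(encoding="utf-8")


print(roundtrip_chinese_path("13800138000"))
```

If this runs without an error, the path itself needs no special handling on your system.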
Prepare or create several txt files, ideally with some data in them (name, mobile phone number, address).
Before writing any code, it is best to set yourself a few requirements and be clear about the goals. The complete script is as follows:
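import glob
import re
import xlwt

data = []
phone = []

# Collect the paths of all txt files in the folder
filelocation = glob.glob(r'课堂实训/*.txt')
print(filelocation)

# Read each file into its own list of lines
for i in range(len(filelocation)):
    with open(filelocation[i]) as file:  # closes the file after reading
        file_data = file.readlines()
    data.append(file_data)
print(data)

# Merge the per-file lists into one flat list
combine_data = sum(data, [])
print(combine_data)

# Take the first 11-digit number found in each line
for a in combine_data:
    data1 = re.search(r'[0-9]{11}', a)
    if data1:  # re.search returns None for lines without a number
        phone.append(data1[0])
phone = list(set(phone))
print(phone)
print(len(phone))

# Save to Excel
f = xlwt.Workbook(encoding='utf-8')
sheet1 = f.add_sheet('sheet1', cell_overwrite_ok=True)
for i in range(len(phone)):
    sheet1.write(i, 0, phone[i])
f.save('phonenumber.xls')

Running the script generates an Excel file, phonenumber.xls.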
import glob
import re
import xlwt

glob locates files by pattern, re provides regular expressions, and xlwt writes Excel (.xls) files.
filelocation=glob.glob(r'课堂实训/*.txt')
This returns a list with the paths of all txt files in the specified directory.
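To see what the glob pattern matches, here is a small self-contained sketch. It builds a temporary folder instead of relying on 课堂实训 existing; the helper `list_txt_files` and the sample file names are assumptions made for the example.

```python
import glob
import os
import tempfile
from typing import List


def list_txt_files(folder: str) -> List[str]:
    """Return the sorted paths of all .txt files directly inside folder."""
    pattern = os.path.join(folder, "*.txt")
    # glob.glob makes no promise about ordering, so sort for repeatable results.
    return sorted(glob.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt", "notes.csv"):  # hypothetical sample files
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(p) for p in list_txt_files(tmp)]
    print(found)
```

Only the two txt files are matched; `notes.csv` is skipped because it does not fit the `*.txt` pattern.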
for i in range(len(filelocation)):
    with open(filelocation[i]) as file:  # closes the file after reading
        file_data = file.readlines()
    data.append(file_data)
print(data)

The loop walks through the txt file paths one by one, by index.
Each iteration opens the file at the current index.
readlines() reads that file line by line and returns a list of strings.
append() adds that list to data, so data ends up holding one list per file.
Printing data shows the contents of all the txt files inside a single list, each file as its own list of strings.
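The reading step can also be wrapped in a function, which makes the nested result easier to inspect. This is a minimal sketch, assuming the files are UTF-8 encoded; `read_all_lines` and the sample contents are hypothetical.

```python
import os
import tempfile
from typing import List


def read_all_lines(paths: List[str], encoding: str = "utf-8") -> List[List[str]]:
    """Return one list of lines per file, in the order the paths are given."""
    data = []
    for path in paths:
        # The with-block closes each file even if reading fails.
        with open(path, encoding=encoding) as handle:
            data.append(handle.readlines())
    return data


with tempfile.TemporaryDirectory() as tmp:
    samples = ["Li 13800138000\n", "Wang 13900139000\nZhao 13700137000\n"]
    paths = []
    for i, text in enumerate(samples):  # hypothetical sample data
        path = os.path.join(tmp, f"file{i}.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        paths.append(path)
    nested = read_all_lines(paths)
    print(nested)
```

The result has two inner lists, one per file, which is the shape that the next step flattens.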
combine_data=sum(data,[])
sum(data, []) starts from an empty list and adds each inner list to it in turn, so the per-file lists are merged into one flat list.
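The flattening trick can be compared with `itertools.chain.from_iterable`, a standard-library alternative that avoids building a new intermediate list at every step. The sample data below is made up for the illustration.

```python
from itertools import chain

# Hypothetical nested data: one inner list of lines per file.
nested = [["a\n", "b\n"], [], ["c\n"]]

# The article's approach: repeated list concatenation starting from [].
flat_sum = sum(nested, [])

# Alternative: chain the inner lists lazily, then materialize once.
flat_chain = list(chain.from_iterable(nested))

print(flat_chain)
```

Both give the same flat list; for a handful of small files the difference is negligible, while `chain` scales better when there are many inner lists.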
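print(combine_data)
for a in combine_data:
    data1 = re.search(r'[0-9]{11}', a)
    if data1:  # re.search returns None for lines without a number
        phone.append(data1[0])
phone = list(set(phone))
print(phone)
print(len(phone))

re.search() returns a match for the first run of 11 digits in a line, or None when the line contains none.
set() function: removes duplicates by building an unordered collection of unique elements, so the order of phone may change after list(set(phone)).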
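If the original order of the numbers matters, deduplication can keep the first occurrence of each number instead of going through `set()` alone. The sketch below reuses the article's pattern `[0-9]{11}`; the function name `extract_unique_phones` and the sample lines are assumptions.

```python
import re
from typing import Iterable, List

# Same pattern as in the article: a run of 11 ASCII digits.
PHONE_PATTERN = re.compile(r"[0-9]{11}")


def extract_unique_phones(lines: Iterable[str]) -> List[str]:
    """Return the first 11-digit number of each line, deduplicated in order."""
    seen = set()
    phones = []
    for line in lines:
        match = PHONE_PATTERN.search(line)
        if match is None:
            continue  # no phone number on this line
        number = match.group(0)
        if number not in seen:
            seen.add(number)
            phones.append(number)
    return phones


sample = [
    "Li 13800138000 Beijing\n",
    "no number here\n",
    "Wang 13900139000 Shanghai\n",
    "Li 13800138000 Beijing\n",
]
print(extract_unique_phones(sample))
```

Note that the pattern also matches the first 11 digits of any longer digit string, so lines with ID numbers or similar data may need a stricter pattern.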
# Save to Excel
f = xlwt.Workbook(encoding='utf-8')
sheet1 = f.add_sheet('sheet1', cell_overwrite_ok=True)
for i in range(len(phone)):
    sheet1.write(i, 0, phone[i])
f.save('phonenumber.xls')
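When xlwt is not installed, a CSV file written with the standard-library `csv` module is one possible substitute, since Excel can open it. This is a swapped-in technique and not the article's method; `save_phones_csv` and the output name `phonenumber.csv` are hypothetical.

```python
import csv
import os
import tempfile
from typing import List


def save_phones_csv(phones: List[str], path: str) -> None:
    """Write one phone number per row to a CSV file."""
    # newline="" lets the csv module control line endings;
    # utf-8-sig adds a BOM, which helps Excel detect UTF-8.
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.writer(handle)
        for number in phones:
            writer.writerow([number])


with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "phonenumber.csv")
    save_phones_csv(["13800138000", "13900139000"], out_path)
    # Read the file back to confirm what was written.
    with open(out_path, newline="", encoding="utf-8-sig") as handle:
        rows = list(csv.reader(handle))
    print(rows)
```

Each number lands in the first column of its own row, mirroring the `sheet1.write(i, 0, phone[i])` layout above.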