The Python crawler needs to crawl 65 pages of data, and the number of columns on each page is uncertain. I can already capture the data in each column, but because the column count varies, the target file name cannot be fixed in advance. The problem is how to write the x-th column's data into the x-th file, that is, how to choose the file name passed to file= dynamically. The code is as follows:
import random

import requests
from bs4 import BeautifulSoup

# head (request headers) and proxy (proxy list) are defined elsewhere in the script

# pre-opened files: this only works when the column count is known in advance
f_1 = open('fitment/1.txt', 'a')
f_2 = open('fitment/2.txt', 'a')
f_3 = open('fitment/3.txt', 'a')

for i in range(66):
    pr = random.choice(proxy)
    url = 'https://*****' + str(i) + '****'
    page_url = requests.get(url, headers=head, proxies=pr)
    page_get = page_url.text
    page_text = BeautifulSoup(page_get, 'lxml')
    fitment_1 = page_text.find_all('tr', {'class': 'fitment listRowEven'})
    for each_tag_1 in fitment_1:
        td_text_1 = each_tag_1.find_all('td')
        for x in range(len(td_text_1)):  # note: range(len(...) + 1) would overrun the list
            print(td_text_1[x].string, file=)  # <-- how to pick the x-th file here?
The structure of the web page is as follows: each tr tag is one row, and the specific data to be captured sits in the td tags, one td per column:
<tr>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
<tr>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
ringa_lee 2017-06-12 09:25:40
Don't pre-open the file objects. Instead, open the matching file inside the loop, building the file name from the column index:

with open('fitment/{}.txt'.format(x + 1), 'a') as f:
    f.write('content')
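Applied to the inner loop of the question, a minimal sketch (assuming td_text_1 and x come from the code above, and that the fitment/ directory already exists) could look like:

for x in range(len(td_text_1)):
    # build the file name from the column index: fitment/1.txt, fitment/2.txt, ...
    with open('fitment/{}.txt'.format(x + 1), 'a') as f:
        print(td_text_1[x].string, file=f)

The with statement closes each file right after writing, so no handles leak even though a file may be reopened many times. If reopening per cell feels wasteful, one possible alternative is to cache open handles in a dict keyed by the column index and close them once the crawl finishes:

files = {}
for x in range(len(td_text_1)):
    if x not in files:
        files[x] = open('fitment/{}.txt'.format(x + 1), 'a')
    print(td_text_1[x].string, file=files[x])

# after the whole crawl is done
for f in files.values():
    f.close()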