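My Python scraper needs to crawl 65 pages of data, and the number of columns on each page is not known in advance. I can already extract the data for every column, but because the column count varies, I cannot fix the output file names ahead of time. The question is how to write the data from column x into file x, that is, how to choose the file passed to file= dynamically. The code is as follows: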
import random
import requests
from bs4 import BeautifulSoup

# head (request headers) and proxy (a sequence of proxy settings) are defined elsewhere
f_1 = open('fitment/1.txt', 'a')
f_2 = open('fitment/2.txt', 'a')
f_3 = open('fitment/3.txt', 'a')
for i in range(66):
    pr = random.choice(proxy)
    url = 'https://*****' + str(i) + '****'
    page_url = requests.get(url, headers=head, proxies=pr)
    page_get = page_url.text
    page_text = BeautifulSoup(page_get, 'lxml')
    fitment_1 = page_text.find_all('tr', {'class': 'fitment listRowEven'})
    for each_tag_1 in fitment_1:
        td_text_1 = each_tag_1.find_all('td')
        # range(len(td_text_1)), not len(...) + 1, which would raise IndexError
        for x in range(len(td_text_1)):
            print(td_text_1[x].string, file=)  # <-- which file should go here?
The page structure looks like this. Each tr tag is one row, and the data to scrape sits inside each td tag:
<tr>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
<tr>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
ringa_lee 2017-06-12 09:25:40
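Don't create the file objects with open() up front. Instead, open the file that corresponds to each column number at the point where you write to it: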
with open('<column number>.txt', 'a') as f:  # placeholder, e.g. '1.txt' for column 1
    f.write('<content>')                     # placeholder for the cell text
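
To make the suggestion above concrete, here is a minimal sketch of one way it could look. It is not taken from the original code: the helper name write_columns, its out_dir parameter, and the use of contextlib.ExitStack to keep each file open for the whole run are all assumptions for illustration. It takes rows that have already been extracted as lists of cell text, so the network and parsing steps are left out.

```python
import os
from contextlib import ExitStack


def write_columns(rows, out_dir="fitment"):
    """Append the x-th cell of every row to '<out_dir>/<x>.txt' (x starts at 1).

    `rows` is an iterable of lists of cell text. This helper is hypothetical;
    it only illustrates picking the output file from the column number.
    """
    os.makedirs(out_dir, exist_ok=True)
    with ExitStack() as stack:
        files = {}  # column number -> open file object
        for cells in rows:
            for x, text in enumerate(cells, start=1):
                if x not in files:
                    # Open the file for this column the first time it is seen;
                    # ExitStack closes every opened file when the block ends.
                    path = os.path.join(out_dir, f"{x}.txt")
                    files[x] = stack.enter_context(
                        open(path, "a", encoding="utf-8")
                    )
                print(text, file=files[x])
    # Column numbers that were written to, in ascending order.
    return sorted(files)


# With BeautifulSoup, the rows for one page could be built like this
# (fitment_1 being the list of tr tags from the question):
#   rows = [[td.string for td in tr.find_all('td')] for tr in fitment_1]
#   write_columns(rows)
```

Because the file is looked up by column number when each cell is written, rows with different numbers of cells are handled without knowing the column count beforehand. Simply reopening the file in append mode for every cell, as in the two-line snippet above, would also work; caching the open files just avoids repeated open and close calls.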