Scraping proxy IPs with a Python crawler

Question

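The proxy IP site I want to scrape is http://ip84.com. Above is the HTML of the page; I need to extract the IP address, port number, location, whether the proxy is high-anonymity, and the two times. Below is the Python I wrote, but it only does part of the job. Could anyone give me some pointers? Thanks. {code...} The result looks something like the following, not ...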

天蓬老师 · Answer

Hi! I suggest using requests and BeautifulSoup to parse the page. Below is my code (Python 3) and the result:

from bs4 import BeautifulSoup
import requests

# Download the page and parse it with the built-in HTML parser.
r = requests.get("http://ip84.com")
content = r.text
soup = BeautifulSoup(content, "html.parser")

# The proxy list lives in <table class="list"> elements.
ListTable = soup.find_all("table", class_="list")
for table in ListTable:
    ListTr = table.find_all("tr")
    for tr in ListTr:
        try:
            # Each data row has seven <td> cells; rows without them raise IndexError.
            ListTd = tr.find_all("td")
            ipaddr = str(ListTd[0].get_text()).strip()
            port = str(ListTd[1].get_text()).strip()
            # The location cell contains embedded line breaks, so remove them.
            zone = str(ListTd[2].get_text()).strip().replace("\n", "")
            nmd = str(ListTd[3].get_text()).strip()
            xy = str(ListTd[4].get_text()).strip()
            speed = str(ListTd[5].get_text()).strip()
            time = str(ListTd[6].get_text()).strip()
            print(ipaddr + " " + port + " " + zone + " " + nmd + " " + xy + " " + speed + " " + time)
        except IndexError:
            # Rows with fewer than seven <td> cells (e.g. a header row) end up here.
            print("---------------------------------------------")

Output:

Good Luck ! ^_<
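
P.S. The loop above prints a separator line for every row that does not have seven cells. Here is a minimal sketch of one alternative, assuming the same seven-column layout as in my code. The names `Proxy` and `row_to_proxy` are made up for illustration, and the field meanings are only guessed from the variable names above. It turns the cell texts of one row into a record and returns None for rows that do not hold a valid IP address and port:

```python
import ipaddress
from typing import NamedTuple, Optional, Sequence


class Proxy(NamedTuple):
    # Field names mirror the variables in the loop above; what each
    # column really means on the site is an assumption.
    ipaddr: str
    port: int
    zone: str
    nmd: str
    xy: str
    speed: str
    time: str


def row_to_proxy(cells: Sequence[str]) -> Optional[Proxy]:
    """Return a Proxy for a seven-cell row, or None for header/invalid rows."""
    if len(cells) < 7:
        return None
    # Same cleanup as in the loop: trim whitespace, drop embedded newlines.
    cleaned = [c.strip().replace("\n", "") for c in cells[:7]]
    try:
        ipaddress.ip_address(cleaned[0])  # raises ValueError if not an IP
        port = int(cleaned[1])            # raises ValueError if not a number
    except ValueError:
        return None
    if not 0 < port < 65536:
        return None
    return Proxy(cleaned[0], port, *cleaned[2:])
```

Inside the `for tr in ListTr:` loop this could be called as `row_to_proxy([td.get_text() for td in tr.find_all("td")])`, keeping only the results that are not None. Collecting records instead of printing strings should make it easier to filter or save the proxies later.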

黄舟 · Answer

You may also want to take a look at this post: https://segmentfault.com/n/1330000005070016
