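Getting the HTTP status code (200, 404, etc.) of a request in Python

How can I get the HTTP status code (200, 404, etc.) of a request in Python without downloading the whole page source? Fetching the full page just to read the status wastes resources.

Input: segmentfault.com Output: 200
Input: segmentfault.com/nonexistant Output: 404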
欧阳克 · 2735 days ago · 1126 views

All replies (2)

  • ringa_lee, 2017-06-28 09:27:31

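    Reference article: A List of Practical Python Scripts

    HTTP has more than the GET method, which requests the headers plus the body. There is also the HEAD method, which requests only the headers.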

    import http.client  # Python 3 name of the Python 2 httplib module
    
    def get_status_code(host, path="/"):
        """ This function retrieves the status code of a website by requesting
            HEAD data from the host. This means that it only requests the headers.
            If the host cannot be reached or something else goes wrong, it returns
            None instead.
        """
        try:
            conn = http.client.HTTPConnection(host)
            conn.request("HEAD", path)
            return conn.getresponse().status
        except (http.client.HTTPException, OSError):
            # Python 3 has no StandardError; these cover protocol and network errors
            return None
            
    print(get_status_code("segmentfault.com"))  # printed 200 in the original answer
    print(get_status_code("segmentfault.com", "/nonexistant"))  # printed 404 in the original answer
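
The function above only speaks plain HTTP and takes the host and path separately. As a rough sketch for Python 3 (where `httplib` is named `http.client`), the same HEAD idea can accept a full URL and choose HTTP or HTTPS from its scheme. The name `head_status` and the `timeout` default are assumptions made for illustration, not part of the original answer.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=5.0):
    """Return the status code of a HEAD request to `url`, or None on failure.

    `head_status` and its `timeout` default are hypothetical names/values
    chosen for this sketch.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn_class = http.client.HTTPSConnection
    elif parts.scheme == "http":
        conn_class = http.client.HTTPConnection
    else:
        return None  # missing or unsupported scheme

    # An empty path means the site root; keep any query string.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    conn = None
    try:
        conn = conn_class(parts.netloc, timeout=timeout)
        conn.request("HEAD", path)  # headers only, no body is transferred
        return conn.getresponse().status
    except (http.client.HTTPException, OSError):
        return None
    finally:
        if conn is not None:
            conn.close()
```

Calling `head_status("https://segmentfault.com/")` issues a single HEAD request. Like the function above, it does not follow redirects, so a 301 or 302 is reported as-is rather than the status of the redirect target.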

  • 怪我咯, 2017-06-28 09:27:31

    A GET request fetches both the headers and the body. Try the HEAD method instead, which fetches only the headers.

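    import requests
    resp = requests.head('http://segmentfault.com')    # request only the resource headers with HEAD
    print(resp.status_code)  # status code
    
    # requests needs a full URL with scheme and host, not a bare path
    resp = requests.head('http://segmentfault.com/nonexistant')
    print(resp.status_code)   # status code
    
    # Output reported in the original answer:
    # 200
    # 404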
    
