Python gets the status code of the HTTP request (200, 404, etc.)

How can Python get the status code of an HTTP request (200, 404, etc.) without downloading the entire page source, which would be a waste of resources? For example:

Input: segmentfault.com Output: 200
Input: segmentfault.com/nonexistant Output: 404
欧阳克 · 2754 days ago · 1137

All replies (2)

  • ringa_lee

    ringa_lee · 2017-06-28 09:27:31

    Reference article: List of practical Python scripts

    HTTP has not only the GET method (which requests the headers and the body) but also the HEAD method, which requests only the headers.

    import http.client  # Python 3 name of the Python 2 httplib module

    def get_status_code(host, path="/"):
        """ This function retrieves the status code of a website by requesting
            HEAD data from the host. This means that it only requests the headers.
            If the host cannot be reached or something else goes wrong, it returns
            None instead.
        """
        try:
            conn = http.client.HTTPConnection(host)
            conn.request("HEAD", path)
            return conn.getresponse().status
        except Exception:  # StandardError no longer exists in Python 3
            return None

    print(get_status_code("segmentfault.com"))  # prints 200
    print(get_status_code("segmentfault.com", "/nonexistant"))  # prints 404
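
The function above takes a bare host name and always speaks plain HTTP. As a rough sketch only, the same HEAD idea can be extended to accept a full URL, pick HTTP or HTTPS from the scheme, and apply a timeout. The name `head_status` and the 5-second default are assumptions made for illustration, not part of the original answer. Redirects are not followed, so a 301 or 302 is returned as-is.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=5.0):
    """Return the status code of a HEAD request to url, or None on failure.

    Hypothetical helper: the name and the timeout default are illustrative.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn_class = http.client.HTTPSConnection
    elif parts.scheme == "http":
        conn_class = http.client.HTTPConnection
    else:
        raise ValueError("URL must start with http:// or https://")

    # Rebuild the request target; an empty path means the site root.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    conn = conn_class(parts.hostname, port=parts.port, timeout=timeout)
    try:
        conn.request("HEAD", target)  # headers only, no response body
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Covers DNS failures, refused connections, timeouts, bad responses.
        return None
    finally:
        conn.close()
```

Calling `head_status("https://segmentfault.com/")` returns the integer status of the first response, or `None` if the host cannot be reached. Some servers answer HEAD differently from GET, so the result is best treated as a hint rather than a guarantee.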

  • 怪我咯

    怪我咯 · 2017-06-28 09:27:31

    A GET request fetches the entire response, headers plus body. Try the HEAD method instead, which fetches only the headers.

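    import requests
    response = requests.head('http://segmentfault.com')    # use the HEAD method to request only the resource headers
    print(response.status_code)  # status code

    # requests needs a full URL with scheme and host, not just a path
    response = requests.head('http://segmentfault.com/nonexistant')   # HEAD request for a page that does not exist
    print(response.status_code)   # status code

    # Output:
    # 200
    # 404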
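
One detail worth knowing when using `requests.head`: unlike `requests.get`, it does not follow redirects unless `allow_redirects=True` is passed, so a site that redirects may report 301 or 302 rather than the final page's status. A minimal sketch that makes this choice explicit and also guards against network errors might look like the following. The function name `status_via_requests` and its parameters are hypothetical, chosen only for this example.

```python
import requests


def status_via_requests(url, timeout=5.0, follow_redirects=False):
    """Return the status code of a HEAD request, or None if the request fails.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    try:
        response = requests.head(
            url,
            timeout=timeout,                   # avoid hanging on a dead host
            allow_redirects=follow_redirects,  # HEAD does not follow redirects by default
        )
    except requests.RequestException:
        # Base class for requests errors: connection problems, timeouts,
        # and malformed URLs such as one missing its scheme.
        return None
    return response.status_code
```

With `follow_redirects=False` the function reports the status of the first response. With `follow_redirects=True` it reports the status of the page at the end of the redirect chain.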
    
