
java - How can a retry-on-failure mechanism be implemented for requests made with Python's httplib library?

高洛峰 2765 days ago 413

All replies (2)

  • 迷茫

    迷茫 2017-04-18 10:00:30

    Thankfully, I just found a solution.

    It comes from the article "A summary of some techniques for crawling websites with Python crawlers" (Python - Bole Online). The original text is at http://python.jobbole.com/81997/

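    import httplib   # Python 2 module; on Python 3 it is http.client
    import urlparse  # Python 2 module; on Python 3 it is urllib.parse

    def request(url, cookie='xxx', retries=5):
        ret = urlparse.urlparse(url)  # Parse input URL
        if ret.scheme == 'http':
            conn = httplib.HTTPConnection(ret.netloc)
        elif ret.scheme == 'https':
            conn = httplib.HTTPSConnection(ret.netloc)
        else:
            raise ValueError('Unsupported URL scheme: %r' % ret.scheme)
        # Build the request target in a separate variable so that the
        # full URL is still available for the recursive retry below.
        path = ret.path
        if ret.query: path += '?' + ret.query
        if ret.fragment: path += '#' + ret.fragment
        if not path: path = '/'
        try:
            conn.request(method='GET', url=path, headers={'Cookie': cookie})
            res = conn.getresponse()
        except Exception as e:
            print(e)
            if retries > 0:
                # Retry with the original URL and cookie, one retry fewer
                return request(url=url, cookie=cookie, retries=retries - 1)
            else:
                print('GET Failed')
                return ''
        if res.status != 200:
            return None
        return res.read()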

    The idea is to keep the number of remaining retries in a retries variable. Each time an exception is caught, the function calls itself recursively with retries decremented by 1. Once no retries remain, it prints a failure message and returns immediately.
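
    The same counting logic can also be written as a loop instead of recursion. The following is only a minimal sketch, not part of the linked article: call_with_retries and flaky are hypothetical names, and flaky stands in for the real network call. It also makes the semantics explicit: retries=5 allows up to 6 attempts in total, the first one plus 5 retries.

```python
def call_with_retries(func, retries=5):
    """Call func() until it succeeds or the retries are used up.

    call_with_retries is a hypothetical helper for this sketch.
    Returns (result, attempts); result is None if every attempt raised.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return func(), attempts
        except Exception as e:
            print('attempt %d failed: %s' % (attempts, e))
            if retries <= 0:
                print('GET Failed')
                return None, attempts
            retries -= 1  # one retry consumed


# Stand-in for the network call: fails twice, then succeeds.
calls = {'n': 0}

def flaky():
    calls['n'] += 1
    if calls['n'] < 3:
        raise IOError('connection refused')
    return 'body'

result, attempts = call_with_retries(flaky, retries=5)
```

    In this demo the third attempt succeeds, so only two of the five retries are consumed.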

  • 大家讲道理

    大家讲道理 2017-04-18 10:00:30

    Having the function call itself recursively, with a retry count as the limit, is the most direct method.
    But there is a problem:
    If the remote service is only temporarily unavailable, for example while it is restarting, an immediate retry will fail as well. Five back-to-back retries take very little time, so by the time the service is ready again the request has already been abandoned because all 5 retries were used up.

    The mechanism I use is to retry five times, waiting 30 seconds, 1 minute, 10 minutes, 30 minutes, and 1 hour before the successive retries. If the request still fails after that, it is treated as failed.
    Of course, this approach depends on the specific business logic. Different businesses place different requirements on their requests.
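
    A delay schedule like this could be sketched roughly as follows. This is an illustration under assumptions, not the code actually used: call_with_backoff, RETRY_DELAYS, and flaky are hypothetical names, each wait is assumed to happen before the corresponding retry, and the sleep function is passed in so the demo can run without really waiting.

```python
import time

# Delays from the answer above, in seconds: 30 s, 1 min, 10 min, 30 min, 1 h.
RETRY_DELAYS = [30, 60, 600, 1800, 3600]

def call_with_backoff(func, delays=RETRY_DELAYS, sleep=time.sleep):
    """Try func() once, then once more after each delay in `delays`.

    Returns func()'s result, or re-raises the last exception if the
    attempt after the final delay fails too.
    """
    try:
        return func()
    except Exception as e:
        last_error = e
    for delay in delays:
        sleep(delay)  # wait before the next retry
        try:
            return func()
        except Exception as e:
            last_error = e
    raise last_error


# Demo: record the waits instead of sleeping; the stand-in call
# fails twice (as if the service were restarting), then succeeds.
waited = []
state = {'calls': 0}

def flaky():
    state['calls'] += 1
    if state['calls'] < 3:
        raise IOError('service restarting')
    return 'ok'

result = call_with_backoff(flaky, sleep=waited.append)
```

    With the real time.sleep, a request that keeps failing is given up only after roughly 1 hour 41 minutes of waiting in total, so this pattern suits background jobs better than interactive requests.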
