
Web crawler - How to handle "failed to retrieve ALPN result" in Python crawler development

While developing a Python crawler I ran into the error "failed to retrieve ALPN result". How should I handle it?
Has anyone run into a similar problem?

The code is copied verbatim from: https://binux.blog/2015/01/py...

Development environment: Windows 7 64-bit, Python 3.5
Tool: pyspider

Code:

#!/usr/bin/env python
# -*- encoding: utf-8 -*-
# Created on 2015-01-04 10:42:01
# Project: tutorial_douban_movie

import re
from pyspider.libs.base_handler import *


class Handler(BaseHandler):
    """
    This is a sample script for: pyspider crawler tutorial (1): HTML and CSS selectors
    http://blog.binux.me/2015/01/pyspider-tutorial-level-1-html-and-css-selector/
    """

    @every(minutes=24 * 60)
    def on_start(self):
        self.crawl('http://movie.douban.com/tag/', callback=self.index_page)

    @config(age=24 * 60 * 60)
    def index_page(self, response):
        for each in response.doc('a[href^="http"]').items():
            # Raw string so that \w reaches the regex engine unchanged
            if re.match(r"http://movie.douban.com/tag/\w+", each.attr.href, re.U):
                self.crawl(each.attr.href, callback=self.list_page)
                
    @config(age=10*24*60*60, priority=2)
    def list_page(self, response):
        for each in response.doc('HTML>BODY>DIV#wrapper>DIV#content>DIV.grid-16-8.clearfix>DIV.article>DIV>TABLE TR.item>TD>DIV.pl2>A').items():
            self.crawl(each.attr.href, priority=9, callback=self.detail_page)
        # Pagination: follow the links to the other list pages
        for each in response.doc('HTML>BODY>DIV#wrapper>DIV#content>DIV.grid-16-8.clearfix>DIV.article>DIV.paginator>A').items():
            self.crawl(each.attr.href, callback=self.list_page)
    
    @config(priority=3)
    def detail_page(self, response):
        return {
            "url": response.url,
            "title": response.doc('HTML>BODY>DIV#wrapper>DIV#content>H1>SPAN').text(),
            "rating": response.doc('HTML>BODY>DIV#wrapper>DIV#content>DIV.grid-16-8.clearfix>DIV.article>DIV.indent.clearfix>DIV.subjectwrap.clearfix>DIV#interest_sectl>DIV.rating_wrap.clearbox>P.rating_self.clearfix>STRONG.ll.rating_num').text(),
            "导演": [x.text() for x in response.doc('a[rel="v:directedBy"]').items()],
        }

Error message:

[E 170113 16:47:13 base_handler:195] HTTP 599: schannel: failed to retrieve ALPN result
    Traceback (most recent call last):
      File "d:\anaconda3\lib\site-packages\pyspider\libs\base_handler.py", line 188, in run_task
        result = self._run_task(task, response)
      File "d:\anaconda3\lib\site-packages\pyspider\libs\base_handler.py", line 167, in _run_task
        response.raise_for_status()
      File "d:\anaconda3\lib\site-packages\pyspider\libs\response.py", line 190, in raise_for_status
        raise http_error
    requests.exceptions.HTTPError: HTTP 599: schannel: failed to retrieve ALPN result
Asked by ringa_lee, 2803 days ago, 964 views

All replies (1)

  • PHP中文网, 2017-04-18 10:17:06

    I have never used pyspider, but this question has already been asked on GitHub:
    HTTP 599: schannel: failed to retrieve ALPN result
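
    The "schannel:" prefix in the error suggests that the message comes from libcurl's Windows SChannel TLS backend, which pyspider presumably reaches through pycurl. Before trying a fix, it may help to confirm which TLS backend the local pycurl build reports. Below is a minimal sketch, not a fix: tls_backend and uses_schannel are hypothetical helper names, the token list is an assumption and is not exhaustive, and the version strings in the comments are illustrative samples rather than output from the asker's machine.

```python
# Tokens that libcurl-style version strings use to name a TLS backend.
# Assumption: this list is illustrative, not exhaustive. "WinSSL" is the
# name older libcurl builds reported for the Windows SChannel backend.
_TLS_TOKENS = ("OpenSSL", "LibreSSL", "WinSSL", "Schannel", "GnuTLS", "NSS")


def tls_backend(version_string):
    """Return the first TLS backend token found in the string, or None."""
    for part in version_string.split():
        # Components look like "name/version" or just "name".
        name = part.split("/", 1)[0]
        for token in _TLS_TOKENS:
            if name.lower() == token.lower():
                return token
    return None


def uses_schannel(version_string):
    """True if the string names the Windows SChannel backend."""
    return tls_backend(version_string) in ("WinSSL", "Schannel")


if __name__ == "__main__":
    try:
        import pycurl
    except ImportError:
        print("pycurl is not installed")
    else:
        # pycurl.version is a string such as
        # "PycURL/7.43.0 libcurl/7.49.1 WinSSL zlib/1.2.8" (sample only).
        print(pycurl.version)
        print("SChannel backend:", uses_schannel(pycurl.version))
```

    If the check reports SChannel, the GitHub issue with the same title as the error is the place to look for the workaround that applies to that build.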
