
python - Error when using Chinese characters in an XPath expression in Scrapy

Problem description

links = sel.xpath('//i[contains(@title,"置顶")]/following-sibling::a/@href').extract()

Error: ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters

大家讲道理  2638 days ago  1334

All replies (2)

  • 学习ing 2017-06-30 09:57:44

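    See the article "Fixing the error when an XPath expression in Scrapy contains Chinese characters" (解决Scrapy中xpath用到中文报错问题)

    Solutions

    Method 1: make the whole XPath expression a Unicode string by prefixing the literal with u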

    links = sel.xpath(u'//i[contains(@title,"置顶")]/following-sibling::a/@href').extract()

    Method 2: build the XPath expression from a title variable that is already a Unicode string

    title = u"置顶"
    links = sel.xpath('//i[contains(@title,"%s")]/following-sibling::a/@href' %(title)).extract()

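    Method 3: use XPath variable syntax directly ($ followed by the variable name, here $title) and pass title as a keyword argument. On Python 2 the value is passed as a Unicode string too, for the same reason.

    links = sel.xpath('//i[contains(@title,$title)]/following-sibling::a/@href', title=u"置顶").extract()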
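Methods 1 and 2 both come down to making sure the expression is a Unicode string before it reaches xpath(). A minimal sketch of that idea follows, assuming the title may arrive as UTF-8 bytes (for example, read from a file). The names to_unicode, build_links_xpath and XPATH_TEMPLATE are hypothetical helpers for illustration, not part of Scrapy.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers (not part of Scrapy): normalise text to Unicode
# before it is interpolated into an XPath expression.

# The expression from the question, kept as a Unicode template.
XPATH_TEMPLATE = u'//i[contains(@title,"%s")]/following-sibling::a/@href'


def to_unicode(value, encoding="utf-8"):
    """Return value as a Unicode string, decoding byte strings if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def build_links_xpath(title):
    """Build the XPath from the question with the title as Unicode."""
    return XPATH_TEMPLATE % to_unicode(title)


# A Unicode title and the same title as UTF-8 bytes give the same expression.
from_text = build_links_xpath(u"置顶")
from_bytes = build_links_xpath(u"置顶".encode("utf-8"))
```

The resulting string can be passed to sel.xpath(...).extract() as in Method 2. Note that string interpolation breaks if the title itself contains a double quote, which is one reason to prefer the variable syntax of Method 3.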

  • ringa_lee 2017-06-30 09:57:44

    Try adding a u prefix in front of the whole string.
