python - Scrapy raises an error when an XPath expression contains Chinese characters

Problem Description

links = sel.xpath('//i[contains(@title,"置顶")]/following-sibling::a/@href').extract()

Error: ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters

大家讲道理 · 2711 days ago · 1435

All replies (2)

  • 学习ing

    学习ing · 2017-06-30 09:57:44

    See the article: Solving the error raised when an XPath expression contains Chinese characters in Scrapy

    Solution

    Method 1: Make the entire XPath expression a Unicode string by adding the u prefix

    links = sel.xpath(u'//i[contains(@title,"置顶")]/following-sibling::a/@href').extract()

    Method 2: Store the Chinese text in a Unicode variable, title, and format it into the XPath expression

    title = u"置顶"
    links = sel.xpath('//i[contains(@title,"%s")]/following-sibling::a/@href' %(title)).extract()
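When the title does not come from a literal (for example, it is read from a file as UTF-8 bytes), Method 2 only works if the value is decoded to text first. A minimal sketch under that assumption; `to_text` and `title_xpath` are hypothetical helper names for illustration, not part of Scrapy:

```python
# -*- coding: utf-8 -*-

def to_text(value, encoding="utf-8"):
    """Return value as a text (Unicode) string, decoding bytes if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def title_xpath(title):
    """Build the XPath expression from Method 2 with a text-only title."""
    # The template is itself a Unicode string, so the result is Unicode too.
    template = u'//i[contains(@title,"%s")]/following-sibling::a/@href'
    return template % to_text(title)


# Both a UTF-8 byte string and a Unicode string give the same expression.
query_from_bytes = title_xpath(u"置顶".encode("utf-8"))
query_from_text = title_xpath(u"置顶")
```

The resulting string can then be passed to `sel.xpath(...)` as in Method 2. String formatting breaks if the title itself contains a double quote, which is one reason to prefer XPath variables.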

    Method 3: Use XPath variable syntax ($ followed by the variable name, here $title) and pass title as a keyword argument to xpath()

    links = sel.xpath('//i[contains(@title,$title)]/following-sibling::a/@href', title="置顶").extract()

  • ringa_lee

    ringa_lee · 2017-06-30 09:57:44

    Try adding a u prefix to the whole string so that it becomes a Unicode literal.
