
beautifulsoup - [Q&A] How do I convert unicode-encoded data to utf-8 in Python?

As the title says: I have some data of type <class 'bs4.element.NavigableString'>.
When printed, it looks like this:
[u'3788.00', u'4788.00', u'6388.00', u'2398.00', u'5687.00', u'4088.00', u'4187.00', u'4087.00', u'2587.00', u'5188.00', u'4887.00', u'4287.00', u'4887.00', u'5787.00', u'4887.00', u'4888.00', u'\u8d27\u5230\u4ed8\u6b3e', u'6388.00', u'4987.00', u'5588.00', u'5588.00', u'5588.00', u'3288.00', u'3888.00', u'4788.00', u'4788.00', u'4788.00', u'4788.00', u'5588.00', u'4088.00', u'4788.00', u'4788.00', u'5588.00', u'5588.00', u'6388.00', u'6388.00', u'4788.00', u'5588.00', u'4988.00', u'4788.00', u'6388.00', u'6388.00', u'6388.00', u'5588.00', u'5588.00', u'5588.00', u'6388.00', u'5588.00', u'5588.00', u'4788.00', u'6388.00', u'6388.00', u'6388.00', u'5588.00', u'5588.00', u'6588.00', u'6588.00', u'5588.00', u'5588.00', u'5788.00']

When I try to convert it with int(), I get:
ValueError: invalid literal for int() with base 10: '3788.00'

I then saw someone online say that round(float(Price)) works (Price is the data of type <class 'bs4.element.NavigableString'>).

But that gives:
UnicodeEncodeError: 'decimal' codec can't encode characters in position 0-3: invalid decimal Unicode string

How can I solve this? BTW, the error appeared when I tried to add the list above to another list with list.append (even though it clearly still ran last night T-T).

PHP中文网, 2809 days ago

All replies (3)

  • 高洛峰, 2017-04-17 17:31:30

    Converting to a floating-point number with float() does work. The only problem is the single element u'\u8d27\u5230\u4ed8\u6b3e' (货到付款, "cash on delivery"), which cannot be converted. Just delete this element or skip it when processing.
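The suggestion above can be sketched as follows. This is a minimal illustration, not code from the original reply: the helper name `to_floats` and the shortened sample list are assumptions, and plain strings stand in for the NavigableString values (which are string subclasses).

```python
def to_floats(values):
    """Convert price strings to floats, skipping entries that are not numbers."""
    result = []
    for value in values:
        try:
            result.append(float(value))
        except ValueError:
            # Non-numeric text such as the "cash on delivery" entry lands here.
            # UnicodeEncodeError (the Python 2 error in the question) is a
            # subclass of ValueError, so it is caught as well.
            continue
    return result


# Shortened, hypothetical sample taken from the list in the question.
prices = [u'3788.00', u'4788.00', u'\u8d27\u5230\u4ed8\u6b3e', u'6388.00']

floats = to_floats(prices)
print(floats)  # [3788.0, 4788.0, 6388.0]

# If whole numbers are wanted, round after converting to float.
whole = [int(round(p)) for p in floats]
print(whole)  # [3788, 4788, 6388]
```

Skipping silently is only one option; collecting the rejected entries in a second list would make it easier to notice when the page starts returning unexpected text.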

  • 巴扎黑, 2017-04-17 17:31:30

    Append .encode('utf-8') to the data you want to output.
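As a small illustration of what that call does, assuming Python 3 and using the non-numeric element from the question as sample data: encoding turns the text into UTF-8 bytes for output. It only changes how the text is stored, so it does not make a non-numeric string convertible by float().

```python
text = u'\u8d27\u5230\u4ed8\u6b3e'   # the non-numeric element from the question

encoded = text.encode('utf-8')       # text -> UTF-8 bytes
print(len(text), len(encoded))       # 4 characters become 12 bytes

decoded = encoded.decode('utf-8')    # bytes -> text again
print(decoded == text)               # True: the round trip loses nothing
```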

  • 大家讲道理, 2017-04-17 17:31:30

    First of all, the data you are dealing with is of type <class 'bs4.element.NavigableString'>.
    That is the string type BeautifulSoup returns for text it reads from HTML.

    When reading a page with BS4, decode the HTML as utf-8 before passing it to the parser.

    Example:

    soup = BeautifulSoup(html.read().decode("utf-8"), "html.parser")
    

    The NavigableString data shown above as u'\u...' escapes will then be displayed normally.
