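While writing a crawler, the Chinese text I scrape shows up garbled in the console. My editor is Notepad++
and I run the Python program in PowerShell.
I searched online but could not find a suitable fix.
Does anyone know how to solve this? Thanks in advance!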
巴扎黑2017-04-17 16:06:36
First of all, have you added the UTF-8 coding declaration?
Add the following line at the top of the file
# -*- coding: utf-8 -*-
Secondly, not every web page is encoded in UTF-8, so the Chinese text in such pages must be decoded before printing. Many pages are encoded in GBK, for example. The following line converts such a page to Unicode
unicodePage = myPage.decode("gbk")
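When the page encoding is not known in advance, the decoding step can be sketched as a small helper that tries a few candidates in order. This is only an illustration: `decode_page` and its candidate list are hypothetical names, not something from the answer above. UTF-8 is tried first because GBK will happily decode many UTF-8 byte sequences into the wrong characters.

```python
def decode_page(raw, encodings=("utf-8", "gbk")):
    """Return Unicode text for the raw bytes of a fetched page.

    Hypothetical helper: tries each candidate encoding in order and
    returns the first result that decodes without error.
    """
    for name in encodings:
        try:
            return raw.decode(name)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, replacing undecodable bytes
    return raw.decode(encodings[0], "replace")


gbk_bytes = u"\u4e2d\u6587".encode("gbk")     # "中文" as a GBK page would send it
utf8_bytes = u"\u4e2d\u6587".encode("utf-8")  # the same text as UTF-8
page_from_gbk = decode_page(gbk_bytes)
page_from_utf8 = decode_page(utf8_bytes)
```

Both calls should end up with the same Unicode string, which can then be printed or re-encoded as needed.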
黄舟2017-04-17 16:06:36
The web page's encoding differs from the encoding of your local environment. Set the local encoding to match the page's encoding.
伊谢尔伦2017-04-17 16:06:36
Web pages are generally encoded in UTF-8, while Windows uses GBK. Just do the appropriate transcoding and you'll be fine
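A minimal sketch of that transcoding, assuming the page encoding and the target encoding are both known. The helper name `transcode` is made up for illustration. It goes through Unicode as the intermediate form and replaces characters the target encoding cannot represent instead of raising.

```python
def transcode(raw, source_encoding, target_encoding):
    """Re-encode bytes from one encoding to another via Unicode.

    Hypothetical helper: characters that the target encoding cannot
    represent are replaced with '?' rather than raising an error.
    """
    text = raw.decode(source_encoding)            # bytes -> Unicode
    return text.encode(target_encoding, "replace")  # Unicode -> bytes


gbk_page = b"\xd6\xd0\xce\xc4"                 # "中文" encoded as GBK
as_utf8 = transcode(gbk_page, "gbk", "utf-8")  # for a UTF-8 terminal or file
as_ascii = transcode(gbk_page, "gbk", "ascii") # lossy: no Chinese in ASCII
```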
阿神2017-04-17 16:06:36
Try adding these lines
import sys
reload(sys)  # Python 2 only: restores setdefaultencoding, which is removed at startup
sys.setdefaultencoding('utf-8')
怪我咯2017-04-17 16:06:36
Some tricks:
1. #coding=utf-8
2. from __future__ import unicode_literals
3. Use unicode as the intermediate bridge (I have to say, using Python on Windows is still rather unpleasant)
PHP中文网2017-04-17 16:06:36
The question is too vague to be a good question!
Recommendations for Python 2.x programs on Windows involving Chinese characters:
Save Python source files as UTF-8 without a BOM
Add # -*- coding:utf8 -*- to the first or second line of the Python source file
Use Unicode objects wherever Chinese strings appear in the code, written as u'' literals
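Put together, these recommendations might look like the sketch below. The variable and function names are illustrative only. It shows the practical difference between a Unicode object, which counts characters, and its encoded byte string, which counts bytes.

```python
# -*- coding: utf-8 -*-
# Assumes this file itself is saved as UTF-8 without a BOM.

title = u"中文爬虫"           # Unicode object: one item per character
raw = title.encode("utf-8")   # byte string: three bytes per character here


def describe(text):
    """Format a Unicode string together with its character count."""
    return u"%s (%d chars)" % (text, len(text))


summary = describe(title)
```

Keeping text as Unicode inside the program and encoding only at the point of output avoids mixing byte strings in different encodings.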