Difference between Notepad2 and PyCharm - Stack Overflow

The same code fails in the Notepad environment (tested with Notepad2) but runs in PyCharm (Python 3.5).
Code:

import urllib
import urllib.request
url = "http://www.baidu.com"
data = urllib.request.urlopen(url).read()
data = data.decode('UTF-8')

These statements run without error in both environments.

data.decode('gbk', 'ignore').encode('UTF-8')
print(data)

PyCharm displays the crawled web page, but the cmd window displays:

UnicodeEncodeError: 'gbk' codec can't encode character '\xbb' in position 26830:
illegal multibyte sequence

The invalid characters have to be removed:

import urllib
import urllib.request
url = "http://www.baidu.com"
data = urllib.request.urlopen(url).read()
data.decode('gbk', 'ignore').encode('UTF-8')
print(data)

Only this version works. Please explain why.
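One plausible reading of why the last version runs, offered as a sketch rather than a confirmed diagnosis: the result of `decode(...).encode(...)` is never assigned, so `data` is still a `bytes` object. `print` shows a `bytes` object as its ASCII-only `repr`, which a GBK console can always encode. The sample bytes below are made up for illustration and stand in for the downloaded page.

```python
# Hypothetical stand-in for the downloaded page: UTF-8 bytes containing U+00BB,
# the character named in the error message.
raw = "\u00bb more".encode("utf-8")

# The return value is discarded, so raw itself is unchanged and still bytes.
raw.decode("gbk", "ignore").encode("utf-8")

# print(raw) writes the repr of the bytes object, which is pure ASCII.
shown = repr(raw)
shown.encode("gbk")  # succeeds: nothing here is outside GBK

# Decoding to str first is what exposes the problem on a GBK console.
text = raw.decode("utf-8")
try:
    text.encode("gbk")
    failed = False
except UnicodeEncodeError:
    failed = True

print(shown, failed)
```

If this reading is right, the second version does not remove any characters; it simply never prints decoded text.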

phpcn_u1582, 2745 days ago

    淡淡烟草味, 2017-05-18 10:52:11

    You may be running into the same Python encoding problem I did, or the terminal you are using may not support the encoding. Take a look at the question below.

    [Python encoding problem?] Shared from @SegmentFault, link: /q/10...
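To illustrate the terminal-encoding point, here is a minimal sketch assuming the cmd window uses GBK, as the error message suggests. `print` encodes a `str` with `sys.stdout.encoding`, so one workaround is to replace the characters that encoding cannot represent before printing. The helper name `console_safe` and the sample string are hypothetical and not part of the original question.

```python
import sys


def console_safe(text, encoding=None):
    """Return text with characters the target encoding cannot represent
    replaced by backslash escapes, so printing it does not raise."""
    enc = encoding or sys.stdout.encoding or "utf-8"
    # Encode with a lenient error handler, then decode back to str.
    return text.encode(enc, errors="backslashreplace").decode(enc)


# Hypothetical page text containing U+00BB, which the gbk codec rejects.
page = "title \u00bb body"
print(console_safe(page, "gbk"))
```

Passing `"gbk"` explicitly mimics the cmd window; omitting it uses whatever encoding the current console reports. Setting the `PYTHONIOENCODING` environment variable before starting Python is another commonly used route, though whether it fits depends on the console in use.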
