
Detecting a web page's encoding with Python: implementation code

高洛峰 (original)
2017-03-13 09:41:07

This article shows how to detect the character encoding of a web page in Python, with working code. Readers who need this can use it as a reference.



In Python development, automatically detecting a web page's encoding relies on the chardet library, a character-set detector. This package is not included with Python 2.7 and has to be downloaded from its official site.
Here I downloaded the chardet-2.3.0.tar.gz archive. After extracting it, copy the chardet folder into
python27/lib/site-packages/ under the Python installation directory, and that is all.

Then import chardet.

The function below fetches the page at a given URL and returns the detected encoding of that page.


import chardet  # character-set detection (third-party package)
import urllib   # Python 2 standard library; provides urlopen


def automatic_detect(url):
    # Download the raw bytes of the page
    content = urllib.urlopen(url).read()
    # detect() returns a dict that includes an 'encoding' key
    result = chardet.detect(content)
    encoding = result['encoding']
    return encoding


urls = ['http://www.baidu.com', 'http://www.163.com', 'http://dangdang.com']
for url in urls:
    print url, automatic_detect(url)
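
The listing in this article targets Python 2.7. For readers on Python 3, the same idea might be sketched as follows. This is only an illustration under stated assumptions: `fetch_bytes`, the `fetch` and `detect` parameters, and the 10-second timeout are hypothetical names and values introduced here, not part of the original code. The `detect` parameter exists so the function can be tried without network access or without chardet installed.

```python
from urllib.request import urlopen


def fetch_bytes(url, timeout=10):
    """Download the raw bytes of a page (hypothetical helper; timeout is an assumed value)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def automatic_detect(url, fetch=fetch_bytes, detect=None):
    """Return the encoding name detected for the page at `url`.

    `fetch` and `detect` are injectable for illustration and testing;
    by default the page is downloaded and chardet.detect is used.
    """
    if detect is None:
        import chardet  # third-party package, imported only when needed
        detect = chardet.detect
    content = fetch(url)
    result = detect(content)  # a dict that includes an 'encoding' key
    return result['encoding']


if __name__ == '__main__':
    # Same sites as in the article; requires network access and chardet.
    for page in ['http://www.baidu.com', 'http://www.163.com', 'http://dangdang.com']:
        print(page, automatic_detect(page))
```

The main differences from the Python 2 version are that `urlopen` lives in `urllib.request` and that `print` is a function.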

The code above calls the detect function of the chardet module, which returns a dictionary; the detected encoding is then read from that dictionary's 'encoding' key.
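
Besides 'encoding', the dictionary returned by chardet.detect also carries a 'confidence' value, and 'encoding' can be None when nothing could be detected. A minimal sketch of handling that result is shown below. The helper names `pick_encoding` and `decode_page`, the 'utf-8' fallback, and the 0.5 confidence threshold are assumptions chosen for illustration, not something the original code does.

```python
def pick_encoding(result, default='utf-8', min_confidence=0.5):
    """Choose an encoding name from a chardet.detect() style result dict.

    Falls back to `default` when no encoding was detected or when the
    reported confidence is below `min_confidence` (both assumed values).
    """
    encoding = result.get('encoding')
    confidence = result.get('confidence') or 0.0
    if not encoding or confidence < min_confidence:
        return default
    return encoding


def decode_page(content, result):
    """Decode raw page bytes using the chosen encoding, replacing undecodable bytes."""
    return content.decode(pick_encoding(result), errors='replace')


# Example with a hand-written result dict in the shape chardet.detect returns.
sample = {'encoding': 'GB2312', 'confidence': 0.99}
print(pick_encoding(sample))
```

In a real script the dictionary would come from chardet.detect(content), for example `decode_page(content, chardet.detect(content))`.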

Thanks for reading. I hope this helps, and thank you all for supporting this site!

