Home  >  Article  >  Backend Development  >  Chinese matching example in python regular expression

Chinese matching example in python regular expression

巴扎黑
巴扎黑Original
2016-12-07 11:05:451441browse

#coding=utf-8 
import re 
from urllib2 import urlopen 
webpage = urlopen('http://www.baidu.com')       #获取百度页面的信息
text = webpage.read()                           #读取为文本
tmp = text.decode('utf8')                       #对原文本进行utf8转码, 此处要跟代码的编码格式一致
pat = &#39;<title>(.*)?([\u4e00-\u9fa5]*)?</title>&#39; #对中文进行匹配
re.escape(pat)                                  #对匹配模式中需要转义的符号进行转义
pat = re.compile(pat)                           #compile一下
m = re.search(pat,tmp) 
title = m.group(1) 
print title 
webpage.close()

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn