Python Efficient Programming Techniques in Practice (5)

# -*- coding:utf-8 -*-

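"""
Practical cases:
1. Strip extra leading and trailing whitespace from user input:
'    nick2008@gmail.com'

2. Remove the '\r' from text edited on Windows:
'hello world\r\n'

3. Remove Unicode combining marks (tone marks) from text:
u'...'

Solutions:
Method 1: Use the string methods strip(), lstrip(), and rstrip() to remove characters from the ends of a string.
Method 2: To delete a single character at a fixed position, use slicing plus concatenation.
Method 3: Use the string replace() method or the regular expression function re.sub() to delete characters at any position.
Method 4: Use the string translate() method, which can delete several different characters at once.
"""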
import re

# -----------1-------------
s = '   abc   123   '
print s.strip()
# 'abc   123'

print s.lstrip()
# 'abc   123   '

print s.rstrip()
# '   abc   123'

s = '----abc++++'
print s.strip('+-')     # strip leading and trailing '-' and '+' characters

# -----------2-------------
test = 'abc:123'
te = test[:3] + test[4:]
print te

# -----------3-------------
s = '\tabc\t123\txyz'
s = s.replace('\t', '')   # replace() returns a new string, so rebind s
print s

s1 = '\tabc\t123\txyz\ropq\r'
s1 = re.sub('[\t\r]', '', s1)
print s1
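
The first two cases listed in the docstring at the top are not exercised by the code above. A minimal sketch of both follows, written for Python 3 (an assumption, since the rest of this script targets Python 2). The variable names are illustrative, and the trailing spaces in the sample input were added for demonstration.

```python
# Case 1: strip surplus leading/trailing whitespace from user input.
raw_email = '    nick2008@gmail.com  '   # trailing spaces added for illustration
clean_email = raw_email.strip()
print(repr(clean_email))

# Case 2: drop the '\r' that Windows editors leave before each '\n'.
windows_line = 'hello world\r\n'
unix_line = windows_line.replace('\r', '')
print(repr(unix_line))
```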

# -----------4-------------
import string
s = 'acb12345xzy'
print s
# string.maketrans() builds the mapping table: a->x, b->y, c->z, x->a, y->b, z->c
print s.translate(string.maketrans('abcxyz', 'xyzabc'))

s = 'abc\refg\n2342\t'
print s.translate(None, '\t\r\n')     # remove '\t', '\r', and '\n'; for deletion only, pass None as the table
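
The two-argument form translate(None, chars) exists only on Python 2 byte strings. As a hedged sketch of the same deletion under Python 3 (an assumption, since this script itself is Python 2), str.maketrans() accepts a third argument listing the characters to delete. The names delete_table, text, and cleaned are illustrative.

```python
# Python 3: the third argument of str.maketrans lists characters to delete.
delete_table = str.maketrans('', '', '\t\r\n')

text = 'abc\refg\n2342\t'
cleaned = text.translate(delete_table)
print(repr(cleaned))  # 'abcefg2342'
```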

u = u'nǐ shì hǎo de'
print u.translate({0x0301: None})    # delete one tone mark (U+0301, combining acute accent)
print u.translate(dict.fromkeys([0x0301, 0x030c, 0x0304, 0x0300]))
# remove all four tone marks (acute, caron, macron, grave)
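
Deleting combining code points with translate() only has an effect when the tone marks are stored as separate combining characters. If the text may contain precomposed letters such as U+01D0, one option is to decompose it first. The following is a minimal Python 3 sketch, assuming NFD normalization is acceptable for the input. strip_combining_marks is a hypothetical helper name, not part of the original script.

```python
import unicodedata


def strip_combining_marks(text):
    """Return text with all Unicode combining marks removed."""
    # NFD splits precomposed letters into base letter + combining mark(s).
    decomposed = unicodedata.normalize('NFD', text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_combining_marks('n\u01d0 sh\u00ec h\u01ceo de'))  # ni shi hao de
```

Unlike a fixed table of four code points, this removes every combining mark, which may be more than wanted for scripts where such marks carry meaning.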
