
Memory leaks and garbled Chinese characters in Python's mysql-python (MySQLdb) module explained

高洛峰
2016-10-18 13:44:51

When connecting with mysql-python, most people write the following by default:

con=MySQLdb.connect(user='xxx',passwd='xxx',host='xxx',port=6600,charset='gbk')

Once charset='gbk' is specified, mysql-python sets use_unicode=True by default. As a result, mysql-python uses Python's own codec module to decode strings. In practice, however, MySQL's gbk character set turns out to be larger than the set covered by Python's gbk codec, so some characters that can be stored in MySQL raise errors when Python's codec decodes them. A more serious problem is that in mysql-python versions before 1.2.3, use_unicode=True triggers a memory leak bug in the decoding step. In addition, every string that comes back from the database through mysql-python is a decoded unicode object, so it has to be encoded again before it can be written to a file.
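The decoding failure described above can be reproduced without a database. What follows is only a minimal sketch, not mysql-python's actual code path: decode_gbk is a hypothetical helper, and the trailing byte 0xFF is a deliberately invalid stand-in, because the exact characters that MySQL's gbk accepts and Python's codec rejects are not listed here.

```python
def decode_gbk(raw, errors="strict"):
    """Decode a GBK byte string with Python's codec (hypothetical helper)."""
    return raw.decode("gbk", errors)


# The two characters u"\u4e2d\u6587" encoded as GBK, as a server might return them.
valid = b"\xd6\xd0\xce\xc4"
# Assumption: 0xFF stands in for a byte sequence that Python's gbk codec rejects.
unsupported = valid + b"\xff"

text = decode_gbk(valid)

try:
    decode_gbk(unsupported)
    strict_failed = False
except UnicodeDecodeError:
    # This is the kind of error that surfaces when use_unicode=True.
    strict_failed = True

# errors="replace" avoids the exception but silently loses the character.
lossy = decode_gbk(unsupported, errors="replace")
```

Falling back to errors="replace" hides the exception at the cost of data loss, which is one reason skipping the decode step altogether can be the more attractive option.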


The solution is to force use_unicode=False. That is:

con=MySQLdb.connect(user='xxx',passwd='xxx',host='xxx',port=6600,charset='gbk',use_unicode=False)

This way there is no memory leak and no need to encode when writing to a file. It also avoids the cases where Python's codec cannot decode strings stored in MySQL's gbk. Finally, for MySQL 4, the charset parameter can be omitted:

con=MySQLdb.connect(user='xxx',passwd='xxx',host='xxx',port=6600,use_unicode=False)

This solves the problem completely.
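To illustrate the "no need to encode" point, here is a small sketch of writing query results to a file when use_unicode=False. It assumes that every selected column is a string column and arrives as a raw byte string in the connection's encoding. The function dump_rows and the sample rows are hypothetical and stand in for the result of a real cursor.fetchall().

```python
import os
import tempfile


def dump_rows(rows, path):
    """Write byte-string rows to path, tab-separated, one row per line."""
    # Binary mode: the bytes from the driver are written untouched,
    # so there is no decode/encode round trip through Python's codec.
    with open(path, "wb") as out:
        for row in rows:
            out.write(b"\t".join(row) + b"\n")


# Stand-in for cursor.fetchall() with use_unicode=False (GBK-encoded bytes).
sample_rows = [(b"1", b"\xd6\xd0\xce\xc4"), (b"2", b"abc")]

fd, tmp_path = tempfile.mkstemp()
os.close(fd)
dump_rows(sample_rows, tmp_path)

# Read the file back to confirm the bytes were stored unchanged.
with open(tmp_path, "rb") as check:
    written = check.read()
os.remove(tmp_path)
```

The resulting file holds the bytes exactly as the server sent them, so with charset='gbk' it is a GBK-encoded file.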

