Recently, my company organized a training session on secure PHP programming, part of which covered MySQL's "SET NAMES" and mysql_set_charset (mysqli_set_charset):
It recommended using mysqli_set_charset (mysqli::set_charset) instead of "SET NAMES" wherever possible. The PHP manual gives the same advice, but neither explains why.
Since then, several friends have asked me why.
Enough people asked that I decided to write a blog post devoted to this topic.
First of all, many people don’t know what “SET NAMES” does.
In an earlier article that looked in depth at MySQL character set settings, I introduced the three MySQL "environment variables" character_set_client, character_set_connection, and character_set_results. Here is a brief recap.
These three variables tell the MySQL server, respectively, the character set the client sends data in, the character set that data is converted to once it reaches the MySQL server, and the character set in which the client expects results to be returned.
For example, by issuing "SET NAMES utf8", you tell the server: I am sending UTF-8, and I want you to return query results to me in UTF-8 as well.
Generally, "SET NAMES" is enough to get correct results. So why does the manual recommend mysqli_set_charset (PHP >= 5.0.5)?
First, let's take a look at what mysqli_set_charset does (note the comment marked with asterisks; mysql_set_charset is similar):
//php-5.2.11-SRC/ext/mysqli/mysqli_nonapi.c line 342
PHP_FUNCTION(mysqli_set_charset)
{
    MY_MYSQL *mysql;
    zval *mysql_link;
    char *cs_name = NULL;
    unsigned int len;

    if (zend_parse_method_parameters(ZEND_NUM_ARGS() TSRMLS_CC, getThis(),
            "Os", &mysql_link, mysqli_link_class_entry, &cs_name, &len) == FAILURE) {
        return;
    }
    MYSQLI_FETCH_RESOURCE(mysql, MY_MYSQL *, &mysql_link, "mysqli_link",
            MYSQLI_STATUS_VALID);
    //** Call the corresponding function of libmysql
    if (mysql_set_character_set(mysql->mysql, cs_name)) {
        RETURN_FALSE;
    }
    RETURN_TRUE;
}
What does mysql_set_character_set do?
//mysql-5.1.30-SRC/libmysql/client.c, line 3166:
int STDCALL mysql_set_character_set(MYSQL *mysql, const char *cs_name)
{
    struct charset_info_st *cs;
    const char *save_csdir = charsets_dir;

    if (mysql->options.charset_dir)
        charsets_dir = mysql->options.charset_dir;

    if (strlen(cs_name) < MY_CS_NAME_SIZE &&
        (cs = get_charset_by_csname(cs_name, MY_CS_PRIMARY, MYF(0))))
    {
        char buff[MY_CS_NAME_SIZE + 10];
        charsets_dir = save_csdir;
        /* Skip execution of "SET NAMES" for pre-4.1 servers */
        if (mysql_get_server_version(mysql) < 40100)
            return 0;
        sprintf(buff, "SET NAMES %s", cs_name);
        if (!mysql_real_query(mysql, buff, strlen(buff)))
        {
            mysql->charset = cs;
        }
    }
    // The following is omitted
As you can see, besides sending "SET NAMES", mysqli_set_charset does one more thing: when the query succeeds, it stores the character set in mysql->charset:
sprintf(buff, "SET NAMES %s", cs_name);
if (!mysql_real_query(mysql, buff, strlen(buff)))
{
    mysql->charset = cs;
}
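To make that extra step concrete, here is a toy model in C. It is only a sketch under stated assumptions: struct toy_conn and the toy_* functions are hypothetical names invented for illustration, not part of libmysql, and the two string fields merely stand in for the server-side session variables and for mysql->charset. The model also assumes every query succeeds.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified model of a connection (not a libmysql type). */
struct toy_conn {
    char server_charset[32]; /* what the server believes after "SET NAMES" */
    char client_charset[32]; /* stands in for mysql->charset */
};

/* Both sides start out with the default character set. */
void toy_connect(struct toy_conn *c)
{
    snprintf(c->server_charset, sizeof c->server_charset, "%s", "latin1");
    snprintf(c->client_charset, sizeof c->client_charset, "%s", "latin1");
}

/* Models sending "SET NAMES x" as an ordinary query: the client library
 * forwards the SQL text without inspecting it, so only the server changes. */
void toy_query_set_names(struct toy_conn *c, const char *name)
{
    snprintf(c->server_charset, sizeof c->server_charset, "%s", name);
}

/* Models the set-charset call: same query, plus the client-side update. */
void toy_set_character_set(struct toy_conn *c, const char *name)
{
    toy_query_set_names(c, name);
    snprintf(c->client_charset, sizeof c->client_charset, "%s", name);
}
```

In this model, calling toy_query_set_names(&c, "gbk") after toy_connect leaves client_charset at "latin1", while toy_set_character_set(&c, "gbk") sets both fields to "gbk".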
So what is charset, a member of MySQL's core structure, used for? This is where mysql_real_escape_string() comes in. It differs from mysql_escape_string in that it takes the "current" character set into account. So where does this current character set come from?
Yes, you guessed it: mysql->charset.
When mysql_real_escape_string decides whether bytes belong to a multi-byte character, it uses a different strategy depending on this member variable. For example, for utf-8 the code in libmysql/ctype-utf8.c is used.
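The idea of charset-dependent escaping can be sketched in a few lines of C. This is a simplified illustration, not libmysql's actual implementation: enum toy_charset, toy_is_gbk_pair and toy_escape are hypothetical names, only backslashes and quotes are escaped (the real function handles more characters), and GBK is assumed to use lead bytes 0x81 to 0xFE with trail bytes 0x40 to 0xFE except 0x7F.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the client-side character set (mysql->charset). */
enum toy_charset { TOY_LATIN1, TOY_GBK };

/* Returns 1 if (lead, trail) can form a two-byte GBK character. */
int toy_is_gbk_pair(unsigned char lead, unsigned char trail)
{
    return lead >= 0x81 && lead <= 0xFE &&
           trail >= 0x40 && trail <= 0xFE && trail != 0x7F;
}

/*
 * Escapes backslashes and quotes in src[0..len) into dst and returns the
 * number of bytes written, not counting the terminating NUL.
 * dst must have room for at least 2 * len + 1 bytes.
 */
size_t toy_escape(enum toy_charset cs, char *dst, const char *src, size_t len)
{
    size_t i = 0, out = 0;

    while (i < len) {
        unsigned char c = (unsigned char)src[i];

        if (cs == TOY_GBK && i + 1 < len &&
            toy_is_gbk_pair(c, (unsigned char)src[i + 1])) {
            /* A complete two-byte character: copy both bytes untouched. */
            dst[out++] = src[i++];
            dst[out++] = src[i++];
            continue;
        }
        if (c == '\\' || c == '\'' || c == '"') {
            dst[out++] = '\\'; /* single-byte special character: escape it */
        }
        dst[out++] = src[i++];
    }
    dst[out] = '\0';
    return out;
}
```

With the two bytes 0x91 0x5c as input, the TOY_LATIN1 path writes three bytes because it escapes the trailing 0x5c, while the TOY_GBK path recognizes the pair as one character and writes two.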
Here is an example of the classic 5c problem. The default MySQL connection character set is latin-1:
<?php
$db = mysql_connect('localhost:3737', 'root', '123456');
mysql_select_db("test");
$a = "\x91\x5c"; // The GBK encoding of "慭"; the low byte is 5c, which is "\" in ASCII
var_dump(addslashes($a));
var_dump(mysql_real_escape_string($a, $db));
mysql_query("set names gbk");
var_dump(mysql_real_escape_string($a, $db));
mysql_set_charset("gbk");
var_dump(mysql_real_escape_string($a, $db));
?>
The low byte of the GBK encoding of "慭" is 5c, which is "\" in ASCII, so any escaping routine that is unaware of GBK treats that byte as a backslash and escapes it. And because only mysql(i)_set_charset updates mysql->charset, which otherwise keeps its default value, the result is:
$ php -f 5c.php
string(3) "慭\"
string(3) "慭\"
string(3) "慭\"
string(2) "慭"

Is it clear to everyone now?