search
HomeBackend DevelopmentPHP Tutorialphp 中 SORT_REGULAR 和 SORT_STRING 的区别

asort 的问题

有一次在使用 php 基本函数 asort 的时候遇到了一个问题:

<?php    $arr = [        "nonce_str" => "441469",        "timestamp" => "1464334314"    ];    asort($arr);    var_dump($arr);?>

这样排序出来的结果是:

array(2) {    ["nonce_str"]=>    string(6) "441469"    ["timestamp"]=>    string(10) "1464334314"}

WTF?

不应该呀,为什么排序出来 4 在 1 的前面呢,字符串不应该是以字符串比较的方式来排序么?

查阅 php.net 参考文档得知 sort 类函数的第二个参数为 SORT_FLAG:

SORT_REGULAR - 正常比较单元(不改变类型)

SORT_NUMERIC - 单元被作为数字来比较

SORT_STRING - 单元被作为字符串来比较

SORT_LOCALE_STRING - 根据当前的区域(locale)设置来把单元当作字符串比较,可以用 setlocale() 来改变。

SORT_NATURAL - 和 natsort() 类似对每个单元以“自然的顺序”对字符串进行排序。 PHP 5.4.0 中新增的。

SORT_FLAG_CASE - 能够与 SORT_STRING 或 SORT_NATURAL 合并(OR 位运算),不区分大小写排序字符串。

看来默认的排序方式是所谓的 SORT_REGULAR。可是,“正常比较单元”是什么意思呢?“不改变类型”又是指什么,既然不改变类型,那么我的两个字符串就应该以字符串比较 strcmp 的顺序排列吧,也就是说,字符串 "1464334314" 应该在 "441469" 前面才对!带着这些疑问,我找到了 php 的源代码。

源码分析

http://git.php.net/?p=php-src.git;a=blob_plain;f=ext/standard/array.c;hb=42be298b3020337653cfcbdd87698b90006b2197

php-src.git/ext/standard/array.c, 887:

/* {{{ proto bool asort(array &array_arg [, int sort_flags])   Sort an array and maintain index association */PHP_FUNCTION(asort){    zval *array;    zend_long sort_type = PHP_SORT_REGULAR;    compare_func_t cmp;    if (zend_parse_parameters(ZEND_NUM_ARGS(), "a/|l", &array, &sort_type) == FAILURE) {        RETURN_FALSE;    }    cmp = php_get_data_compare_func(sort_type, 0);    if (zend_hash_sort(Z_ARRVAL_P(array), cmp, 0) == FAILURE) {        RETURN_FALSE;    }    RETURN_TRUE;}/* }}} */

可以看到,进行具体比较顺序控制的函数指针是 cmp,是通过向 php_get_data_compare_func 传入 sort_type 得到的,sort_type 也就是 SORT_REGULAR / SORT_STRING 这样的标记。

php-src.git/ext/standard/array.c, 632:

static compare_func_t php_get_data_compare_func(zend_long sort_type, int reverse) /* {{{ */{    switch (sort_type & ~PHP_SORT_FLAG_CASE) {        // ...        case PHP_SORT_STRING:            if (sort_type & PHP_SORT_FLAG_CASE) {                if (reverse) {                    return php_array_reverse_data_compare_string_case;                } else {                    return php_array_data_compare_string_case;                }            } else {                if (reverse) {                    return php_array_reverse_data_compare_string;                } else {                    return php_array_data_compare_string;                }            }            break;        // ...        case PHP_SORT_REGULAR:        default:            if (reverse) {                return php_array_reverse_data_compare;            } else {                return php_array_data_compare;            }            break;    }    return NULL;}/* }}} */

在这个函数中我们可以看到,SORT_REGULAR 采用了 php_array_data_compare 进行比较,而 SORT_STRING 采用了 php_array_data_compare_string 进行比较。

php-src.git/ext/standard/array.c, 370:

/* Numbers are always smaller than strings int this function as it * anyway doesn't make much sense to compare two different data types. * This keeps it consistent and simple. * * This is not correct any more, depends on what compare_func is set to. */static int php_array_data_compare(const void *a, const void *b) /* {{{ */{    // ...    if (compare_function(&result, first, second) == FAILURE) {        return 0;    }    ZEND_ASSERT(Z_TYPE(result) == IS_LONG);    return Z_LVAL(result);}/* }}} */

php-src.git/ext/standard/array.c, 465:

static int php_array_data_compare_string(const void *a, const void *b) /* {{{ */{    // ...    return string_compare_function(first, second);}/* }}} */

现在我们得到了两条调用链:

SORT_REGULAR -> php_get_data_compare_func -> php_array_data_compare -> compare_function;

SORT_STRING -> php_get_data_compare_func -> php_array_data_compare_string -> string_compare_function;

SORT_REGULAR

先从第一条的 compare_function 开始:

http://git.php.net/?p=php-src.git;a=blob_plain;f=Zend/zend_operators.c;hb=42be298b3020337653cfcbdd87698b90006b2197

php-src.git/Zend/zend_operators.c, 1818:

ZEND_API int ZEND_FASTCALL compare_function(zval *result, zval *op1, zval *op2) /* {{{ */{    // ...    while (1) {        switch (TYPE_PAIR(Z_TYPE_P(op1), Z_TYPE_P(op2))) {            // ...            case TYPE_PAIR(IS_STRING, IS_STRING):                if (Z_STR_P(op1) == Z_STR_P(op2)) {                    ZVAL_LONG(result, 0);                    return SUCCESS;                }                ZVAL_LONG(result, zendi_smart_strcmp(Z_STR_P(op1), Z_STR_P(op2)));                return SUCCESS;            // ...        }    }}/* }}} */

SORT_REGULAR -> php_get_data_compare_func -> php_array_data_compare -> compare_function -> zendi_smart_strcmp;

php-src.git/Zend/zend_operators.c, 2681:

ZEND_API zend_long ZEND_FASTCALL zendi_smart_strcmp(zend_string *s1, zend_string *s2) /* {{{ */{    int ret1, ret2;    int oflow1, oflow2;    zend_long lval1 = 0, lval2 = 0;    double dval1 = 0.0, dval2 = 0.0;    if ((ret1 = is_numeric_string_ex(s1->val, s1->len, &lval1, &dval1, 0, &oflow1)) &&        (ret2 = is_numeric_string_ex(s2->val, s2->len, &lval2, &dval2, 0, &oflow2))) {#if ZEND_ULONG_MAX == 0xFFFFFFFF        if (oflow1 != 0 && oflow1 == oflow2 && dval1 - dval2 == 0. &&            ((oflow1 == 1 && dval1 > 9007199254740991. /*0x1FFFFFFFFFFFFF*/)            || (oflow1 == -1 && dval1 < -9007199254740991.))) {#else        if (oflow1 != 0 && oflow1 == oflow2 && dval1 - dval2 == 0.) {#endif            /* both values are integers overflown to the same side, and the             * double comparison may have resulted in crucial accuracy lost */            goto string_cmp;        }        if ((ret1 == IS_DOUBLE) || (ret2 == IS_DOUBLE)) {            if (ret1 != IS_DOUBLE) {                if (oflow2) {                    /* 2nd operand is integer > LONG_MAX (oflow2==1) or < LONG_MIN (-1) */                    return -1 * oflow2;                }                dval1 = (double) lval1;            } else if (ret2 != IS_DOUBLE) {                if (oflow1) {                    return oflow1;                }                dval2 = (double) lval2;            } else if (dval1 == dval2 && !zend_finite(dval1)) {                /* Both values overflowed and have the same sign,                 * so a numeric comparison would be inaccurate */                goto string_cmp;            }            dval1 = dval1 - dval2;            return ZEND_NORMALIZE_BOOL(dval1);        } else { /* they both have to be long's */            return lval1 > lval2 ? 1 : (lval1 < lval2 ? -1 : 0);        }    } else {        int strcmp_ret;string_cmp:        strcmp_ret = zend_binary_strcmp(s1->val, s1->len, s2->val, s2->len);        return ZEND_NORMALIZE_BOOL(strcmp_ret);    }}/* }}} */

关键来了,这里我们可以看到,zendi_smart_strcmp 先判断需要比较的两个字符串是否是以数字的形式表现的类型。如果是,则将“数字”一样的字符串作为整型或浮点型进行比较,如果不是,则将字符串用 zend_binary_strcmp 进行比较。

php-src.git/Zend/zend_operators.c, 2791:

ZEND_API zend_uchar ZEND_FASTCALL _is_numeric_string_ex(const char *str, size_t length, zend_long *lval, double *dval, int allow_errors, int *oflow_info) /* {{{ */{    const char *ptr;    int digits = 0, dp_or_e = 0;    double local_dval = 0.0;    zend_uchar type;    zend_long tmp_lval = 0;    int neg = 0;    if (!length) {        return 0;    }    if (oflow_info != NULL) {        *oflow_info = 0;    }    /* Skip any whitespace     * This is much faster than the isspace() function */    while (*str == ' ' || *str == '\t' || *str == '\n' || *str == '\r' || *str == '\v' || *str == '\f') {        str++;        length--;    }    ptr = str;    if (*ptr == '-') {        neg = 1;        ptr++;    } else if (*ptr == '+') {        ptr++;    }    if (ZEND_IS_DIGIT(*ptr)) {        /* Skip any leading 0s */        while (*ptr == '0') {            ptr++;        }        /* Count the number of digits. If a decimal point/exponent is found,         * it's a double. Otherwise, if there's a dval or no need to check for         * a full match, stop when there are too many digits for a long */        for (type = IS_LONG; !(digits >= MAX_LENGTH_OF_LONG && (dval || allow_errors == 1)); digits++, ptr++) {check_digits:            if (ZEND_IS_DIGIT(*ptr)) {                tmp_lval = tmp_lval * 10 + (*ptr) - '0';                continue;            } else if (*ptr == '.' && dp_or_e < 1) {                goto process_double;            } else if ((*ptr == 'e' || *ptr == 'E') && dp_or_e < 2) {                const char *e = ptr + 1;                if (*e == '-' || *e == '+') {                    ptr = e++;                }                if (ZEND_IS_DIGIT(*e)) {                    goto process_double;                }            }            break;        }        if (digits >= MAX_LENGTH_OF_LONG) {            if (oflow_info != NULL) {                *oflow_info = *str == '-' ? -1 : 1;            }            dp_or_e = -1;            goto process_double;        }    } else if (*ptr == '.' && ZEND_IS_DIGIT(ptr[1])) {process_double:        type = IS_DOUBLE;        /* If there's a dval, do the conversion; else continue checking         * the digits if we need to check for a full match */        if (dval) {            local_dval = zend_strtod(str, &ptr);        } else if (allow_errors != 1 && dp_or_e != -1) {            dp_or_e = (*ptr++ == '.') ? 1 : 2;            goto check_digits;        }    } else {        return 0;    }    if (ptr != str + length) {        if (!allow_errors) {            return 0;        }        if (allow_errors == -1) {            zend_error(E_NOTICE, "A non well formed numeric value encountered");        }    }    if (type == IS_LONG) {        if (digits == MAX_LENGTH_OF_LONG - 1) {            int cmp = strcmp(&ptr[-digits], long_min_digits);            if (!(cmp < 0 || (cmp == 0 && *str == '-'))) {                if (dval) {                    *dval = zend_strtod(str, NULL);                }                if (oflow_info != NULL) {                    *oflow_info = *str == '-' ? -1 : 1;                }                return IS_DOUBLE;            }        }        if (lval) {            if (neg) {                tmp_lval = -tmp_lval;            }            *lval = tmp_lval;        }        return IS_LONG;    } else {        if (dval) {            *dval = local_dval;        }        return IS_DOUBLE;    }}/* }}} */

上面的代码是判断字符串是否为“数字”形式的比较函数。

SORT_STRING

然后我们看看第二条调用链:SORT_STRING -> php_get_data_compare_func -> php_array_data_compare_string -> string_compare_function;

php-src.git/Zend/zend_operators.c, 1729:

ZEND_API int ZEND_FASTCALL string_compare_function(zval *op1, zval *op2) /* {{{ */{    if (EXPECTED(Z_TYPE_P(op1) == IS_STRING) &&        EXPECTED(Z_TYPE_P(op2) == IS_STRING)) {        if (Z_STR_P(op1) == Z_STR_P(op2)) {            return 0;        } else {            return zend_binary_strcmp(Z_STRVAL_P(op1), Z_STRLEN_P(op1), Z_STRVAL_P(op2), Z_STRLEN_P(op2));        }    } else {        zend_string *str1 = zval_get_string(op1);        zend_string *str2 = zval_get_string(op2);        int ret = zend_binary_strcmp(ZSTR_VAL(str1), ZSTR_LEN(str1), ZSTR_VAL(str2), ZSTR_LEN(str2));        zend_string_release(str1);        zend_string_release(str2);        return ret;    }}/* }}} */

可以看到,SORT_STRING 最终也使用 zend_binary_strcmp 函数进行字符串比较。下面的代码,是 zend_binary_strcmp 的实现:

php-src.git/Zend/zend_operators.c, 2539:

ZEND_API int ZEND_FASTCALL zend_binary_strcmp(const char *s1, size_t len1, const char *s2, size_t len2) /* {{{ */{    int retval;    if (s1 == s2) {        return 0;    }    retval = memcmp(s1, s2, MIN(len1, len2));    if (!retval) {        return (int)(len1 - len2);    } else {        return retval;    }}/* }}} */

总结

经过以上分析我们可以得知,SORT_STRING 排序方式的底层实现是 C 语言的 memcmp,也就是说,它对两个字符串从前往后,按照逐个字节比较,一旦字节有差异,就终止并比较出大小。

而 SORT_REGULAR 会智能判断需排序对象的类型,如果两个字符串都是“纯数字”形式的字符串,会以比较整个字符串所代表的十进制整数、浮点数大小的形式进行排序。如果两个字符串不是“纯数字“形式的,才会和 SORT_STRING 一样。

因此,如果需要以字符串 strcmp 方式逐个字节从前往后比较来进行排序,在调用 php 的 sort 类函数的时候请务必使用 SORT_STRING 这个 flag,否则如果两个字符串都是”纯数字“形式的,就会按照它们所代表的数字大小进行排序。

而且需要注意的是,如果两个值的类型不同,那么这样的比较是毫无意义的,也可能会产生意想不到的结果。

最后的测试

最后,我们在欲排序的值最后添加了一个字符 "s",使它们不再是”纯数字“形式的字符串:

<?php    $arr = [        "nonce_str" => "441469s",        "timestamp" => "1464334314s"    ];    asort($arr);    var_dump($arr);?>

最后排序的结果变成了:

array(2) {    ["timestamp"]=>    string(11) "1464334314s"    ["nonce_str"]=>    string(7) "441469s"}

这才是我们想要的结果。

Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
How does PHP type hinting work, including scalar types, return types, union types, and nullable types?How does PHP type hinting work, including scalar types, return types, union types, and nullable types?Apr 17, 2025 am 12:25 AM

PHP type prompts to improve code quality and readability. 1) Scalar type tips: Since PHP7.0, basic data types are allowed to be specified in function parameters, such as int, float, etc. 2) Return type prompt: Ensure the consistency of the function return value type. 3) Union type prompt: Since PHP8.0, multiple types are allowed to be specified in function parameters or return values. 4) Nullable type prompt: Allows to include null values ​​and handle functions that may return null values.

How does PHP handle object cloning (clone keyword) and the __clone magic method?How does PHP handle object cloning (clone keyword) and the __clone magic method?Apr 17, 2025 am 12:24 AM

In PHP, use the clone keyword to create a copy of the object and customize the cloning behavior through the \_\_clone magic method. 1. Use the clone keyword to make a shallow copy, cloning the object's properties but not the object's properties. 2. The \_\_clone method can deeply copy nested objects to avoid shallow copying problems. 3. Pay attention to avoid circular references and performance problems in cloning, and optimize cloning operations to improve efficiency.

PHP vs. Python: Use Cases and ApplicationsPHP vs. Python: Use Cases and ApplicationsApr 17, 2025 am 12:23 AM

PHP is suitable for web development and content management systems, and Python is suitable for data science, machine learning and automation scripts. 1.PHP performs well in building fast and scalable websites and applications and is commonly used in CMS such as WordPress. 2. Python has performed outstandingly in the fields of data science and machine learning, with rich libraries such as NumPy and TensorFlow.

Describe different HTTP caching headers (e.g., Cache-Control, ETag, Last-Modified).Describe different HTTP caching headers (e.g., Cache-Control, ETag, Last-Modified).Apr 17, 2025 am 12:22 AM

Key players in HTTP cache headers include Cache-Control, ETag, and Last-Modified. 1.Cache-Control is used to control caching policies. Example: Cache-Control:max-age=3600,public. 2. ETag verifies resource changes through unique identifiers, example: ETag: "686897696a7c876b7e". 3.Last-Modified indicates the resource's last modification time, example: Last-Modified:Wed,21Oct201507:28:00GMT.

Explain secure password hashing in PHP (e.g., password_hash, password_verify). Why not use MD5 or SHA1?Explain secure password hashing in PHP (e.g., password_hash, password_verify). Why not use MD5 or SHA1?Apr 17, 2025 am 12:06 AM

In PHP, password_hash and password_verify functions should be used to implement secure password hashing, and MD5 or SHA1 should not be used. 1) password_hash generates a hash containing salt values ​​to enhance security. 2) Password_verify verify password and ensure security by comparing hash values. 3) MD5 and SHA1 are vulnerable and lack salt values, and are not suitable for modern password security.

PHP: An Introduction to the Server-Side Scripting LanguagePHP: An Introduction to the Server-Side Scripting LanguageApr 16, 2025 am 12:18 AM

PHP is a server-side scripting language used for dynamic web development and server-side applications. 1.PHP is an interpreted language that does not require compilation and is suitable for rapid development. 2. PHP code is embedded in HTML, making it easy to develop web pages. 3. PHP processes server-side logic, generates HTML output, and supports user interaction and data processing. 4. PHP can interact with the database, process form submission, and execute server-side tasks.

PHP and the Web: Exploring its Long-Term ImpactPHP and the Web: Exploring its Long-Term ImpactApr 16, 2025 am 12:17 AM

PHP has shaped the network over the past few decades and will continue to play an important role in web development. 1) PHP originated in 1994 and has become the first choice for developers due to its ease of use and seamless integration with MySQL. 2) Its core functions include generating dynamic content and integrating with the database, allowing the website to be updated in real time and displayed in personalized manner. 3) The wide application and ecosystem of PHP have driven its long-term impact, but it also faces version updates and security challenges. 4) Performance improvements in recent years, such as the release of PHP7, enable it to compete with modern languages. 5) In the future, PHP needs to deal with new challenges such as containerization and microservices, but its flexibility and active community make it adaptable.

Why Use PHP? Advantages and Benefits ExplainedWhy Use PHP? Advantages and Benefits ExplainedApr 16, 2025 am 12:16 AM

The core benefits of PHP include ease of learning, strong web development support, rich libraries and frameworks, high performance and scalability, cross-platform compatibility, and cost-effectiveness. 1) Easy to learn and use, suitable for beginners; 2) Good integration with web servers and supports multiple databases; 3) Have powerful frameworks such as Laravel; 4) High performance can be achieved through optimization; 5) Support multiple operating systems; 6) Open source to reduce development costs.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
1 months agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
1 months agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
1 months agoBy尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Chat Commands and How to Use Them
1 months agoBy尊渡假赌尊渡假赌尊渡假赌

Hot Tools

SecLists

SecLists

SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

PhpStorm Mac version

PhpStorm Mac version

The latest (2018.2.1) professional PHP integrated development tool

DVWA

DVWA

Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

Dreamweaver Mac version

Dreamweaver Mac version

Visual web development tools

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools