A case study: comparing 'date' strings in PHP
A function in our project checks whether a membership has expired. While reviewing a colleague's code, I found that it was written in a rather unusual way, yet it had never caused a bug in production.
The implementation is roughly as follows:
$expireTime = "2014-05-01 00:00:00";
$currentTime = date('Y-m-d H:i:s', time());
if ($currentTime < $expireTime) {
    return false;
} else {
    return true;
}
When two times need to be compared, they are usually converted to Unix timestamps and compared as two integers. This implementation instead keeps each time as a string and applies the comparison operator to the two strings.
Style aside, I was curious how PHP performs this comparison internally.
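For contrast, the timestamp approach can be sketched in C, the language of the engine code examined in this article. This is only an illustration: `parse_datetime` and `is_expired` are hypothetical helpers, the input is assumed to be local time in the 'Y-m-d H:i:s' layout, and a parse failure is reported as `(time_t)-1`, which a real caller should check before comparing.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: parse "YYYY-MM-DD HH:MM:SS" as local time.
 * Returns (time_t)-1 when the string does not match the layout. */
static time_t parse_datetime(const char *s)
{
    struct tm tm = {0};
    int year, month, day, hour, minute, second;

    if (sscanf(s, "%d-%d-%d %d:%d:%d",
               &year, &month, &day, &hour, &minute, &second) != 6) {
        return (time_t)-1;
    }
    tm.tm_year = year - 1900;  /* struct tm counts years from 1900 */
    tm.tm_mon = month - 1;     /* and months from 0 */
    tm.tm_mday = day;
    tm.tm_hour = hour;
    tm.tm_min = minute;
    tm.tm_sec = second;
    tm.tm_isdst = -1;          /* let mktime decide whether DST applies */
    return mktime(&tm);
}

/* Returns 1 if `now` is at or past `expire`, 0 otherwise.
 * Both arguments are assumed to parse successfully. */
static int is_expired(const char *now, const char *expire)
{
    return parse_datetime(now) >= parse_datetime(expire);
}
```

Calling `is_expired("2014-04-17 00:00:00", "2014-05-01 00:00:00")` compares two integers rather than two strings, which is the conventional form of the check.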
Without further ado, let’s start tracking from the source code.
Rules similar to the following can be found in zend_language_parser.y:
expr === expr { zend_do_binary_op(ZEND_IS_IDENTICAL, &$$, &$1, &$3 TSRMLS_CC); }
expr !== expr { zend_do_binary_op(ZEND_IS_NOT_IDENTICAL, &$$, &$1, &$3 TSRMLS_CC); }
expr == expr { zend_do_binary_op(ZEND_IS_EQUAL, &$$, &$1, &$3 TSRMLS_CC); }
expr != expr { zend_do_binary_op(ZEND_IS_NOT_EQUAL, &$$, &$1, &$3 TSRMLS_CC); }
expr < expr { zend_do_binary_op(ZEND_IS_SMALLER, &$$, &$1, &$3 TSRMLS_CC); }
expr <= expr { zend_do_binary_op(ZEND_IS_SMALLER_OR_EQUAL, &$$, &$1, &$3 TSRMLS_CC); }
expr > expr { zend_do_binary_op(ZEND_IS_SMALLER, &$$, &$3, &$1 TSRMLS_CC); }
expr >= expr { zend_do_binary_op(ZEND_IS_SMALLER_OR_EQUAL, &$$, &$3, &$1 TSRMLS_CC); }
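Each rule calls zend_do_binary_op to compile the expression into an opcode. Note that > and >= have no opcodes of their own: they are compiled as ZEND_IS_SMALLER and ZEND_IS_SMALLER_OR_EQUAL with the two operands swapped.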
void zend_do_binary_op(zend_uchar op, znode *result, const znode *op1, const znode *op2 TSRMLS_DC) /* {{{ */
{
    zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

    opline->opcode = op;
    opline->result.op_type = IS_TMP_VAR;
    opline->result.u.var = get_temporary_variable(CG(active_op_array));
    opline->op1 = *op1;
    opline->op2 = *op2;
    *result = opline->result;
}
This function does no special processing; it simply records the opcode, operand 1 and operand 2.
At run time, the opcode selects the corresponding handler, here ZEND_IS_SMALLER_SPEC_CONST_CONST_HANDLER (the variant for two constant operands):
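The compile step, including the way the grammar turns `a > b` into `b < a`, can be sketched with a few lines of C. The types and function names below (`sketch_op`, `sketch_binary_op`, `compile_less`, `compile_greater`) are hypothetical and greatly simplified; operand names stand in for real znode values.

```c
/* Hypothetical, much simplified stand-ins for zend_op and its opcodes. */
enum sketch_opcode {
    SKETCH_IS_SMALLER,
    SKETCH_IS_SMALLER_OR_EQUAL
};

struct sketch_op {
    enum sketch_opcode opcode;
    const char *op1;  /* operand names stand in for znode values */
    const char *op2;
};

/* Like zend_do_binary_op in spirit: store the opcode and both operands. */
static struct sketch_op sketch_binary_op(enum sketch_opcode opcode,
                                         const char *op1, const char *op2)
{
    struct sketch_op op;

    op.opcode = opcode;
    op.op1 = op1;
    op.op2 = op2;
    return op;
}

/* "lhs < rhs" keeps the source order of the operands. */
static struct sketch_op compile_less(const char *lhs, const char *rhs)
{
    return sketch_binary_op(SKETCH_IS_SMALLER, lhs, rhs);
}

/* "lhs > rhs" is recorded as "rhs < lhs": same opcode, operands swapped. */
static struct sketch_op compile_greater(const char *lhs, const char *rhs)
{
    return sketch_binary_op(SKETCH_IS_SMALLER, rhs, lhs);
}
```

Because of the swap, the runtime only needs handlers for "smaller" and "smaller or equal"; no separate "greater" handler is required.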
static int ZEND_FASTCALL ZEND_IS_SMALLER_SPEC_CONST_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    zend_op *opline = EX(opline);
    zval *result = &EX_T(opline->result.u.var).tmp_var;

    compare_function(result, &opline->op1.u.constant, &opline->op2.u.constant TSRMLS_CC);
    ZVAL_BOOL(result, (Z_LVAL_P(result) < 0));
    ZEND_VM_NEXT_OPCODE();
}
Note that the comparison of two zvals is handled using compare_function.
ZEND_API int compare_function(zval *result, zval *op1, zval *op2 TSRMLS_DC) /* {{{ */
{
    int ret;
    int converted = 0;
    zval op1_copy, op2_copy;
    zval *op_free;

    while (1) {
        switch (TYPE_PAIR(Z_TYPE_P(op1), Z_TYPE_P(op2))) {
            case TYPE_PAIR(IS_LONG, IS_LONG):
                ...
            case TYPE_PAIR(IS_DOUBLE, IS_LONG):
                ...
            case TYPE_PAIR(IS_DOUBLE, IS_DOUBLE):
                ...
            ...
            // Compare two strings
            case TYPE_PAIR(IS_STRING, IS_STRING):
                zendi_smart_strcmp(result, op1, op2);
                return SUCCESS;
            ...
        }
    }
}
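The dispatch idea, one switch over a pair of type tags, can be illustrated with a small self-contained sketch. Everything here is hypothetical: `sketch_val` stands in for zval, `SK_TYPE_PAIR` imitates the role of TYPE_PAIR, and the string case uses plain `strcmp` only to keep the example short.

```c
#include <string.h>

/* Hypothetical type tags and value struct; the real engine uses zval. */
enum sketch_type { SK_LONG = 1, SK_DOUBLE = 2, SK_STRING = 3 };

struct sketch_val {
    enum sketch_type type;
    long lval;
    double dval;
    const char *str;
};

/* Pack two small type tags into one integer so that a single switch
 * can dispatch on both operands at once. */
#define SK_TYPE_PAIR(t1, t2) (((t1) << 4) | (t2))

/* Three-way result: -1, 0 or 1. */
#define SK_SIGN(a, b) (((a) > (b)) - ((a) < (b)))

static int sketch_compare(const struct sketch_val *a, const struct sketch_val *b)
{
    switch (SK_TYPE_PAIR(a->type, b->type)) {
    case SK_TYPE_PAIR(SK_LONG, SK_LONG):
        return SK_SIGN(a->lval, b->lval);
    case SK_TYPE_PAIR(SK_DOUBLE, SK_LONG):
        return SK_SIGN(a->dval, (double)b->lval);
    case SK_TYPE_PAIR(SK_LONG, SK_DOUBLE):
        return SK_SIGN((double)a->lval, b->dval);
    case SK_TYPE_PAIR(SK_DOUBLE, SK_DOUBLE):
        return SK_SIGN(a->dval, b->dval);
    case SK_TYPE_PAIR(SK_STRING, SK_STRING): {
        /* The real engine calls zendi_smart_strcmp here; plain strcmp
         * is an assumption made to keep the sketch short. */
        int r = strcmp(a->str, b->str);
        return SK_SIGN(r, 0);
    }
    default:
        /* Mixed pairs need type conversion first; omitted in this sketch. */
        return 0;
    }
}
```

The appeal of this layout is that every supported combination of operand types gets its own clearly labelled branch.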
compare_function handles each pair of operand types in its own case. In this article both operands are strings, so let's step into zendi_smart_strcmp for a closer look:
ZEND_API void zendi_smart_strcmp(zval *result, zval *s1, zval *s2) /* {{{ */
{
    int ret1, ret2;
    long lval1, lval2;
    double dval1, dval2;

    // Try to convert both strings to numbers
    if ((ret1=is_numeric_string(Z_STRVAL_P(s1), Z_STRLEN_P(s1), &lval1, &dval1, 0)) &&
        (ret2=is_numeric_string(Z_STRVAL_P(s2), Z_STRLEN_P(s2), &lval2, &dval2, 0))) {
        // Compare the two numbers
        ...
    } else {
        // Not both strings are numeric:
        // fall back to zend_binary_zval_strcmp,
        // which is essentially a thin wrapper around memcmp
        Z_LVAL_P(result) = zend_binary_zval_strcmp(s1, s2);
        ZVAL_LONG(result, ZEND_NORMALIZE_BOOL(Z_LVAL_P(result)));
    }
}
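The two branches can be mimicked in a few lines of standalone C. This is a rough sketch under stated assumptions, not the engine's logic: `whole_string_is_number` is a hypothetical stand-in for is_numeric_string with allow_errors set to 0, built on `strtod`, whose rules differ from PHP's (it also accepts "inf", for instance), and the string branch uses `strcmp` instead of a binary-safe memcmp wrapper.

```c
#include <stdlib.h>
#include <string.h>

/* Rough stand-in for a strict numeric check: succeed only if strtod
 * consumes the whole, non-empty string. An approximation only. */
static int whole_string_is_number(const char *s, double *out)
{
    char *end;
    double value;

    if (*s == '\0') {
        return 0;
    }
    value = strtod(s, &end);
    if (end == s || *end != '\0') {
        return 0;  /* nothing parsed, or trailing characters remain */
    }
    *out = value;
    return 1;
}

/* Returns -1, 0 or 1. */
static int sketch_smart_strcmp(const char *a, const char *b)
{
    double da, db;
    int r;

    if (whole_string_is_number(a, &da) && whole_string_is_number(b, &db)) {
        return (da > db) - (da < db);  /* numeric branch */
    }
    r = strcmp(a, b);                  /* string branch */
    return (r > 0) - (r < 0);
}
```

With this sketch, `sketch_smart_strcmp("10", "9")` takes the numeric branch and yields 1, even though '1' sorts before '9' as a byte, while two date strings fall through to the string branch.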
Can "2014-05-01 00:00:00" be converted into a number?
To answer that, we have to look at the rules implemented in is_numeric_string.
static inline zend_uchar is_numeric_string(const char *str, int length, long *lval, double *dval, int allow_errors)
{
    const char *ptr;
    int base = 10, digits = 0, dp_or_e = 0;
    double local_dval;
    zend_uchar type;

    if (!length) {
        return 0;
    }

    /* Skip leading whitespace */
    while (*str == ' ' || *str == '\t' || *str == '\n' || *str == '\r' || *str == '\v' || *str == '\f') {
        str++;
        length--;
    }
    ptr = str;

    if (*ptr == '-' || *ptr == '+') {
        ptr++;
    }

    if (ZEND_IS_DIGIT(*ptr)) {
        /* Check for a hexadecimal prefix */
        if (length > 2 && *str == '0' && (str[1] == 'x' || str[1] == 'X')) {
            base = 16;
            ptr += 2;
        }

        /* Skip leading zeros */
        while (*ptr == '0') {
            ptr++;
        }

        /* Count the digits and decide between integer and floating point */
        for (type = IS_LONG; !(digits >= MAX_LENGTH_OF_LONG && (dval || allow_errors == 1)); digits++, ptr++) {
check_digits:
            if (ZEND_IS_DIGIT(*ptr) || (base == 16 && ZEND_IS_XDIGIT(*ptr))) {
                continue;
            } else if (base == 10) {
                if (*ptr == '.' && dp_or_e < 1) {
                    goto process_double;
                } else if ((*ptr == 'e' || *ptr == 'E') && dp_or_e < 2) {
                    const char *e = ptr + 1;

                    if (*e == '-' || *e == '+') {
                        ptr = e++;
                    }
                    if (ZEND_IS_DIGIT(*e)) {
                        goto process_double;
                    }
                }
            }

            break;
        }

        if (base == 10) {
            if (digits >= MAX_LENGTH_OF_LONG) {
                dp_or_e = -1;
                goto process_double;
            }
        } else if (!(digits < SIZEOF_LONG * 2 || (digits == SIZEOF_LONG * 2 && ptr[-digits] <= '7'))) {
            if (dval) {
                local_dval = zend_hex_strtod(str, (char **)&ptr);
            }
            type = IS_DOUBLE;
        }
    } else if (*ptr == '.' && ZEND_IS_DIGIT(ptr[1])) {
        // Handle floating-point numbers
    } else {
        return 0;
    }

    // Characters remain after the number: fail unless errors are allowed
    if (ptr != str + length) {
        if (!allow_errors) {
            return 0;
        }
        if (allow_errors == -1) {
            zend_error(E_NOTICE, "A non well formed numeric value encountered");
        }
    }

    // Errors are allowed (or nothing is left over): convert str to a number
    if (type == IS_LONG) {
        if (digits == MAX_LENGTH_OF_LONG - 1) {
            int cmp = strcmp(&ptr[-digits], long_min_digits);

            if (!(cmp < 0 || (cmp == 0 && *str == '-'))) {
                if (dval) {
                    *dval = zend_strtod(str, NULL);
                }
                return IS_DOUBLE;
            }
        }

        if (lval) {
            *lval = strtol(str, NULL, base);
        }
        return IS_LONG;
    } else {
        if (dval) {
            *dval = local_dval;
        }
        return IS_DOUBLE;
    }
}
The code is relatively long, but if you read it carefully, the rules for converting str to num are still very clear.
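Pay special attention to the allow_errors parameter. zendi_smart_strcmp passes 0, so any string with characters left over after its numeric prefix is rejected. For "2014-05-01 00:00:00" the digit scan stops at the first '-', ptr != str + length holds, and the function returns 0: the string is not treated as a number.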
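The effect of the flag can be reproduced with a small integer-only approximation. `sketch_numeric` is a hypothetical helper built on `strtol`; it ignores floats, hexadecimal input and overflow, and it does not emit a notice, so it only illustrates the strict versus lenient distinction.

```c
#include <stdlib.h>

/* Hypothetical, integer-only approximation of the allow_errors check.
 * Returns 1 and stores the value if `str` is accepted, 0 otherwise.
 *   allow_errors == 0: the whole string must be consumed.
 *   allow_errors != 0: a numeric prefix is enough. */
static int sketch_numeric(const char *str, int allow_errors, long *out)
{
    char *end;
    long value = strtol(str, &end, 10);

    if (end == str) {
        return 0;  /* no digits at all */
    }
    if (*end != '\0' && !allow_errors) {
        return 0;  /* trailing characters in strict mode */
    }
    *out = value;
    return 1;
}
```

In strict mode, `sketch_numeric("2014-05-01 00:00:00", 0, &v)` returns 0, whereas with a nonzero flag the same string is accepted and `v` becomes 2014, the numeric prefix.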
So in the end, "2014-04-17 00:00:00" < "2014-05-01 00:00:00" actually runs through the memcmp branch.
Since the comparison is a memcmp, it is easy to see why the code at the beginning of the article works: in the fixed-width, zero-padded 'Y-m-d H:i:s' format the fields run from most to least significant, so byte-wise order coincides with chronological order.
When is allow_errors nonzero? A good example is zend_parse_parameters. Its implementation is not described in detail here; interested readers can study it on their own. When it calls is_numeric_string, allow_errors is set to -1.
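A short sketch makes the dependence on the string layout concrete. `byte_compare` is a hypothetical length-aware wrapper around `memcmp`, written for this example rather than copied from the engine; `sorts_before` is a convenience helper.

```c
#include <string.h>

/* Sketch of a length-aware byte comparison: memcmp over the common
 * prefix, then the shorter string sorts first. */
static int byte_compare(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    int r = memcmp(a, b, la < lb ? la : lb);

    if (r != 0) {
        return r;
    }
    return (la > lb) - (la < lb);
}

/* Returns 1 if `earlier` sorts strictly before `later` byte-wise. */
static int sorts_before(const char *earlier, const char *later)
{
    return byte_compare(earlier, later) < 0;
}
```

`sorts_before("2014-04-17 00:00:00", "2014-05-01 00:00:00")` returns 1, as expected. Without zero padding the trick breaks: `sorts_before("2014-9-01", "2014-10-01")` returns 0, because the byte '9' is greater than '1' even though September precedes October. The string comparison is only safe while both operands share the same fixed-width format.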
For example:
static void php_date(INTERNAL_FUNCTION_PARAMETERS, int localtime)
{
    char *format;
    int format_len;
    long ts;
    char *string;

    // The second parameter is expected to be a timestamp, i.e. a long.
    // If the caller mistakenly passes a string, zend_parse_parameters
    // still tries as hard as it can to parse that string as a long.
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s|l", &format, &format_len, &ts) == FAILURE) {
        RETURN_FALSE;
    }
    if (ZEND_NUM_ARGS() == 1) {
        ts = time(NULL);
    }

    string = php_format_date(format, format_len, ts, localtime TSRMLS_CC);

    RETVAL_STRING(string, 0);
}
This is the internal implementation of PHP’s date function.
When we call date with a string as the second argument, the effect is as follows:
echo date('Y-m-d', '0-1-2');
// Output:
// PHP Notice: A non well formed numeric value encountered in Command line code on line 1
// 1970-01-01
Although a notice-level error was reported, '0-1-2' was still successfully converted to 0.