Home >php教程 >php手册 >一个“日期”字符串进行比较的case

一个“日期”字符串进行比较的case

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB
WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOriginal
2016-06-13 09:37:301004browse

项目中有个功能是比较会员是否过期,review同事的代码,发现其写法比较奇葩,但线上竟也未出现bug。

实现大致如下:

$expireTime = "2014-05-01 00:00:00";
$currentTime = date('Y-m-d H:i:s', time());

if($currentTime < $expireTime) {
    return false;
} else {
    return true;
}

如果两个时间需要进行比较,通常是转换成unix时间戳,用两个int型的数字进行比较。该实现却特意将时间表示成string,然后对两个string进行比较运算。

撇开写法不谈,我很好奇的是php内部是如何进行比较的。

闲话少说,还是从源码开始跟踪。

编译期

在zend_language_parse.y中可以发现类似下述语法:

<span expr</span> === expr    { zend_do_binary_op(ZEND_IS_IDENTICAL, &$$, &$<span 1</span>, &$<span 3</span><span  TSRMLS_CC); }
</span><span expr</span> !== <span expr</span>    { zend_do_binary_op(ZEND_IS_NOT_IDENTICAL, &$$, &$<span 1</span>, &$<span 3</span><span  TSRMLS_CC); }
</span><span expr</span> ==  <span expr</span>    { zend_do_binary_op(ZEND_IS_EQUAL, &$$, &$<span 1</span>, &$<span 3</span><span  TSRMLS_CC); }
</span><span expr</span> !=  <span expr</span>    { zend_do_binary_op(ZEND_IS_NOT_EQUAL, &$$, &$<span 1</span>, &$<span 3</span><span  TSRMLS_CC); }
</span><span expr</span> <   <span expr</span>    { zend_do_binary_op(ZEND_IS_SMALLER, &$$, &$<span 1</span>, &$<span 3</span><span  TSRMLS_CC); }
</span><span expr</span> <=  <span expr</span>    { zend_do_binary_op(ZEND_IS_SMALLER_OR_EQUAL, &$$, &$<span 1</span>, &$<span 3</span><span  TSRMLS_CC); }
</span><span expr</span> >   <span expr</span>    { zend_do_binary_op(ZEND_IS_SMALLER, &$$, &$<span 3</span>, &$<span 1</span><span  TSRMLS_CC); }
</span><span expr</span> >=  <span expr</span>    { zend_do_binary_op(ZEND_IS_SMALLER_OR_EQUAL, &$$, &$<span 3</span>, &$<span 1</span> TSRMLS_CC); }

很明显,此处编译成opcode用的便是zend_do_binary_op。

void zend_do_binary_op(zend_uchar op, znode *result, const znode *op1, const znode *op2 TSRMLS_DC) /* {{{ */
{
	zend_op *opline = get_next_op(CG(active_op_array) TSRMLS_CC);

	opline->opcode = op;
	opline->result.op_type = IS_TMP_VAR;
	opline->result.u.var = get_temporary_variable(CG(active_op_array));
	opline->op1 = *op1;
	opline->op2 = *op2;
	*result = opline->result;
}

该函数并没有做什么特别的处理,仅仅是简单保存了opcode、操作数1和操作数2。

执行期

根据opcode,跳转到相应的处理函数:ZEND_IS_SMALLER_SPEC_CONST_CONST_HANDLER。

static int ZEND_FASTCALL  ZEND_IS_SMALLER_SPEC_CONST_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
	zend_op *opline = EX(opline);

	zval *result = &EX_T(opline->result.u.var).tmp_var;

	compare_function(result,
		&opline->op1.u.constant,
		&opline->op2.u.constant TSRMLS_CC);
	ZVAL_BOOL(result, (Z_LVAL_P(result) < 0));


	ZEND_VM_NEXT_OPCODE();
}

注意到,两个zval的比较是利用compare_function来处理。

ZEND_API int compare_function(zval *result, zval *op1, zval *op2 TSRMLS_DC) /* {{{ */
{
	int ret;
	int converted = 0;
	zval op1_copy, op2_copy;
	zval *op_free;

	while (1) {
		switch (TYPE_PAIR(Z_TYPE_P(op1), Z_TYPE_P(op2))) {
			case TYPE_PAIR(IS_LONG, IS_LONG):
				...
			case TYPE_PAIR(IS_DOUBLE, IS_LONG):
				...
			case TYPE_PAIR(IS_DOUBLE, IS_DOUBLE):
				...
			...
			// 两个字符串进行比较
			case TYPE_PAIR(IS_STRING, IS_STRING):
				zendi_smart_strcmp(result, op1, op2);
				return SUCCESS;
			...
		}
	}
}

该函数例举了若干种情况,根据本文case,进入zendi_smart_strcmp一窥究竟:

ZEND_API void zendi_smart_strcmp(zval *result, zval *s1, zval *s2) /* {{{ */
{
	int ret1, ret2;
	long lval1, lval2;
	double dval1, dval2;

	// 尝试将字符串转成数字类型
	if ((ret1=is_numeric_string(Z_STRVAL_P(s1), Z_STRLEN_P(s1), &lval1, &dval1, 0)) &&
		(ret2=is_numeric_string(Z_STRVAL_P(s2), Z_STRLEN_P(s2), &lval2, &dval2, 0))) {
		// 进行数字之间的比较
		...
	} else {
	    // 无法全部转成数字
	    // 则调用zend_binary_zval_strcmp
	    // 本质为memcmp的一层封装
		Z_LVAL_P(result) = zend_binary_zval_strcmp(s1, s2);
		ZVAL_LONG(result, ZEND_NORMALIZE_BOOL(Z_LVAL_P(result)));
	}
}

那么“2014-05-01 00:00:00”能否转化成数字么?

还是得看下is_numeric_string的实现规则。

static inline zend_uchar is_numeric_string(const char *str, int length, long *lval, double *dval, int allow_errors)
{
	const char *ptr;
	int base = 10, digits = 0, dp_or_e = 0;
	double local_dval;
	zend_uchar type;

	if (!length) {
		return 0;
	}

	/* trim掉字符串开头的空白部分 */
	while (*str == ' ' || *str == '\t' || *str == '\n' || *str == '\r' || *str == '\v' || *str == '\f') {
		str++;
		length--;
	}
	ptr = str;

	if (*ptr == '-' || *ptr == '+') {
		ptr++;
	}

	if (ZEND_IS_DIGIT(*ptr)) {
		/* 判断是否为16进制	*/
		if (length > 2 && *str == '0' && (str[1] == 'x' || str[1] == 'X')) {
			base = 16;
			ptr += 2;
		}

		/* 忽略后续的若干0 */
		while (*ptr == '0') {
			ptr++;
		}

		/* 计算数字的位数,并决定是整型还是浮点 */
		for (type = IS_LONG; !(digits >= MAX_LENGTH_OF_LONG && (dval || allow_errors == 1)); digits++, ptr++) {
check_digits:
			if (ZEND_IS_DIGIT(*ptr) || (base == 16 && ZEND_IS_XDIGIT(*ptr))) {
				continue;
			} else if (base == 10) {
				if (*ptr == '.' && dp_or_e < 1) {
					goto process_double;
				} else if ((*ptr == 'e' || *ptr == 'E') && dp_or_e < 2) {
					const char *e = ptr + 1;

					if (*e == '-' || *e == '+') {
						ptr = e++;
					}
					if (ZEND_IS_DIGIT(*e)) {
						goto process_double;
					}
				}
			}

			break;
		}

		if (base == 10) {
			if (digits >= MAX_LENGTH_OF_LONG) {
				dp_or_e = -1;
				goto process_double;
			}
		} else if (!(digits < SIZEOF_LONG * 2 || (digits == SIZEOF_LONG * 2 && ptr[-digits] <= '7'))) {
			if (dval) {
				local_dval = zend_hex_strtod(str, (char **)&ptr);
			}
			type = IS_DOUBLE;
		}
	} else if (*ptr == '.' && ZEND_IS_DIGIT(ptr[1])) {
		// 处理浮点数
	} else {
		return 0;
	}

	// 如果不允许容错,则报错退出
	if (ptr != str + length) {
		if (!allow_errors) {
			return 0;
		}
		if (allow_errors == -1) {
			zend_error(E_NOTICE, "A non well formed numeric value encountered");
		}
	}

	// 允许容错,则尝试将str转成数字
	if (type == IS_LONG) {
		if (digits == MAX_LENGTH_OF_LONG - 1) {
			int cmp = strcmp(&ptr[-digits], long_min_digits);

			if (!(cmp < 0 || (cmp == 0 && *str == '-'))) {
				if (dval) {
					*dval = zend_strtod(str, NULL);
				}

				return IS_DOUBLE;
			}
		}

		if (lval) {
			*lval = strtol(str, NULL, base);
		}

		return IS_LONG;
	} else {
		if (dval) {
			*dval = local_dval;
		}

		return IS_DOUBLE;
	}
}

代码比较长,不过仔细阅读,str转num的规则还是很清晰的。

尤其注意的是allow_errors这个参数,它直接决定了本例中无法将“2014-05-01 00:00:00”转化成数字。

因而最后其实“2014-04-17 00:00:00”

既然是memcmp,便不难理解为何文章开始提到的写法也能正确运行。

容错转换

何时allow_errors为true呢?一个极好的例子便是zend_parse_parameters,zend_parse_parameters的实现不再细述,有兴趣的读者可以自行研究。其中调用is_numeric_string时将allow_errors置为了-1。

举个例子:

static void php_date(INTERNAL_FUNCTION_PARAMETERS, int localtime)
{
	char   *format;
	int     format_len;
	long    ts;
	char   *string;

	// 期望的第二个参数为timestamp,为long
	// 假设上层调用时,误传入了string,那么zend_parse_parameters依然会尽可能的尝试将string解析为long
	if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s|l", &format, &format_len, &ts) == FAILURE) {
		RETURN_FALSE;
	}
	if (ZEND_NUM_ARGS() == 1) {
		ts = time(NULL);
	}

	string = php_format_date(format, format_len, ts, localtime TSRMLS_CC);
	
	RETVAL_STRING(string, 0);
}

这是php的date函数内部实现。

在我们调用date时,如果将第二个参数传入string,效果如下:

echo date('Y-m-d', '0-1-2');

// 输出
PHP Notice:  A non well formed numeric value encountered in Command line code on line 1
1970-01-01

虽然报出notice级别的错误,但依然成功将'0-1-2'转成了0

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn