search
HomeBackend DevelopmentPHP TutorialDate regular expression summary
Date regular expression summaryNov 29, 2017 am 09:55 AM
Summarizeregularexpression

As a PHP programmer, we are inevitably in contact with date regular expressions. So as a programmer, how many solutions do you have for date regular expressions? In this article, we will take you to learn more about date regular expressions.

1 Overview

Date regularization is generally used when there are format requirements and the data is not directly input by the user. Due to different application scenarios, the regular rules written are also different, and the complexity is naturally different. Regular writing requires detailed analysis based on specific situations. A basic principle is: only write appropriate ones, not complex ones.

For date extraction, as long as it can be distinguished from non-dates, just write the simplest regular expression, such as

\d{4}-\d{2}-\d{2}

If the yyyy-MM-dd format can be uniquely located in the source string Date can be used for extraction.

For verification, it doesn’t make much sense if you just verify the character composition and format. You must also add verification of the rules. Due to the existence of leap years, date verification becomes more complicated.

Let’s first examine the valid range of dates and what leap years are.

2 Date rules

2.1 Valid range of date

The valid range of date will be different in different application scenarios.

The valid range of the DateTime object defined in MSDN is: 0001-01-01 00:00:00 to 9999-12-31 23:59:59.

The 0 of the UNIX timestamp is: 1970-01-01T00:00:00Z according to the ISO 8601 specification.

In actual applications, the date range will basically not exceed the range specified by DateTime, so regular verification can only use the commonly used date range.

2.2 What is a leap year

(The following is taken from Baidu Encyclopedia)

Leap year is to make up for the discrepancy between the number of days in the year and the reality of the earth caused by the artificial calendar regulations. It is established based on the time difference of the revolution period. The year in which the time difference is made up is a leap year.

The Earth’s orbit around the sun is 365 days, 5 hours, 48 ​​minutes and 46 seconds (365.24219 days), which is a tropical year year). The ordinary year of the Gregorian calendar only has 365 days, which is about 0.2422 shorter than the tropical year. Days accumulate about one day every four years. This day is added to the end of February (i.e. February 29), so that the length of the year becomes 366 days. This year is a leap year.

It should be noted that the current Gregorian calendar is adapted from the "Julian Calendar" of the Romans. Because we did not realize at the time that an extra 0.0078 days had to be calculated every year, from 46 BC to the 16th century, there were a total of 10 extra days. For this reason, Pope Gregory XIII at the time artificially designated October 5, 1582 as October 15. And started the new leap year regulations. That is to say, it is stipulated that the Gregorian calendar year is a hundred, and it must be a multiple of 400 to be a leap year, and if it is not a multiple of 400, it is an ordinary year. For example, 1700, 1800 and 1900 are ordinary years, and 2000 is a leap year. Since then, the average annual length has been 365.2425 days, with a deviation of 1 day in about 4 years. According to the calculation of a leap year every four years, an average of 0.0078 days will be added every year. After four hundred years, there will be about three extra days. Therefore, three leap years will be reduced every four hundred years. The calculation of leap years can be summed up as usual: there is a leap every four years; there will be no leaps every hundred years, and there will be leaps every four hundred years.

2.3 Date format

According to different languages ​​and cultures, the date hyphens will be different, usually in the following formats:

yyyyMMdd

yyyy-MM-dd

yyyy/MM/dd

yyyy.MM.dd

3 ​ Date regular expression construction

3.1 Rule analysis

A common way to write complex regular expressions is to first separate the irrelevant requirements and write the corresponding requirements separately. Regularize, then combine, check the mutual correlation and influence, and basically you can get the corresponding regular.

According to the definition of leap year, dates can be classified in several ways.

3.1.1 According to whether the number of days is related to the year, it is divided into two categories

The category is independent of the year, and according to the number of days in each month, it can be further divided into two categories

1, 3, 5, July, August, October, and December are from 1 to 31 days

4, June, September, and November are from 1 to 30 days

In a category related to the year

February in ordinary years is from 1 to 28th

February in leap years is from 1 to 29th

All months in all years include days from 1 to 28th

All years except 2 All months outside include the 29th and 30th

All months in January, March, May, July, August, October and December contain the 31st

February in leap years contains the 29th

3.1.2 It can be divided into four categories according to different dates

3.1.3 The classification method is chosen

because the implementation after date classification is achieved through the branch structure of (exp1|exp2|exp3), and the branch structure starts from the left branch and tries to match in sequence to the right. When When a branch matches successfully, it will no longer try to the right, otherwise it will try all branches and report failure.

The number of branches and the complexity of each branch will affect the matching efficiency. Considering the probability distribution of the verified dates, most of them fall within 1-28 days, so using the second classification method will Effectively improve matching efficiency.

3.2 Regular implementation

Using the classification method in Section 3.1.2, the corresponding regular rules can be written for each rule. The following is temporarily implemented in the MM-dd format.

先考虑与年份无关的前三条规则,年份可统一写作

(?!0000)[0-9]{4}

下面仅考虑月和日的正则

包括平年在内的所有年份的月份都包含1-28日

(0[1-9]|1[0-2])-(0[1-9]|1[0-9]|2[0-8])

包括平年在内的所有年份除2月外都包含29和30日

(0[13-9]|1[0-2])-(29|30)

包括平年在内的所有年份1、3、5、7、8、10、12月都包含31日

(0[13578]|1[02])-31)

合起来就是除闰年的2月29日外的其它所有日期

(?!0000)[0-9]{4}-((0[1-9]|1[0-2])-(0[1-9]|1[0-9]|2[0-8])|(0[13-9]|1[0-2])-(29|30)|(0[13578]|1[02])-31)


接下来考虑闰年的实现

闰年2月包含29日

这里的月和日是固定的,就是02-29,只有年是变化的。

可通过以下代码输出所有的闰年年份,考察规则

for (int i = 1; i < 10000; i++){
  if ((i % 4 == 0 && i % 100 != 0) || i % 400 == 0){
   richTextBox2.Text += string.Format("{0:0000}", i) + "\n";
  }
}

根据闰年的规则,很容易整理出规则,四年一闰;

([0-9]{2}(0[48]|[2468][048]|[13579][26])

百年不闰,四百年再闰。

(0[48]|[2468][048]|[13579][26])00

合起来就是所有闰年的2月29日

([0-9]{2}(0[48]|[2468][048]|[13579][26])|(0[48]|[2468][048]|[13579][26])00)-02-29)

四条规则都已实现,且互相间没有影响,合起来就是所有符合DateTime范围的日期的正则

^((?!0000)[0-9]{4}-((0[1-9]|1[0-2])-(0[1-9]|1[0-9]|2[0-8])|(0[13-9]|1[0-2])-(29|30)|(0[13578]|1[02])-31)|([0-9]{2}(0[48]|[2468]
[048]|[13579][26])|(0[48]|[2468][048]|[13579][26])00)-02-29)$


考虑到这个正则表达式仅仅是用作验证,所以捕获组没有意义,只会占用资源,影响匹配效率,所以可以使用非捕获组来进行优化。

^(?:(?!0000)[0-9]{4}-(?:(?:0[1-9]|1[0-2])-(?:0[1-9]|1[0-9]|2[0-8])|(?:0[13-9]|1[0-2])-(?:29|30)|(?:0[13578]|1[02])-31)|(?:[0-9]{2}(?:0[48]|[2468]
[048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00)-02-29)$

以上正则年份0001-9999,格式yyyy-MM-dd。可以通过以下代码验证正则的有效性和性能

DateTime dt = new DateTime(1, 1, 1);
DateTime endDay = new DateTime(9999, 12, 31);
Stopwatch sw = new Stopwatch();
sw.Start();
   
Regex dateRegex = new Regex(@"^(?:(?!0000)[0-9]{4}-(?:(?:0[1-9]|1[0-2])-(?:0[1-9]|1[0-9]|2[0-8])|(?:0[13-9]|1[0-2])-(?:29|30)|(?:0[13578]|1[02])-31)|(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00)-02-29)$");
   
//Regex dateRegex = new Regex(@"^((?!0000)[0-9]{4}-((0[1-9]|1[0-2])-(0[1-9]|1[0-9]|2[0-8])|(0[13-9]|1[0-2])-(29|30)|(0[13578]|1[02])-31)|([0-9]{2}(0[48]|[2468][048]|[13579][26])|(0[48]|[2468][048]|[13579][26])00)-02-29)$");
Console.WriteLine("开始日期: " + dt.ToString("yyyy-MM-dd"));
while (dt < endDay){
  if (!dateRegex.IsMatch(dt.ToString("yyyy-MM-dd"))){
   Console.WriteLine(dt.ToString("yyyy-MM-dd") + " false");
  }
  dt = dt.AddDays(1);
}
if (!dateRegex.IsMatch(dt.ToString("yyyy-MM-dd"))){
  Console.WriteLine(dt.ToString("yyyy-MM-dd") + " false");
}
Console.WriteLine("结束日期: " + dt.ToString("yyyy-MM-dd"));
sw.Stop();
Console.WriteLine("测试用时: " + sw.ElapsedMilliseconds + "ms");
Console.WriteLine("测试完成!");
Console.ReadLine();

   


4       日期正则表达式扩展

4.1     “年月日”形式扩展

  以上实现的是yyyy-MM-dd格式的日期验证,考虑到连字符的不同,以及月和日可能为M和d,即yyyy-M-d的格式,可以对以上正则进行扩展

^(?:(?!0000)[0-9]{4}([-/.]?)(?:(?:0?[1-9]|1[0-2])([-/.]?)(?:0?[1-9]|1[0-9]|2[0-8])|(?:0?[13-9]|1[0-2])([-/.]?)(?:29|30)|(?:0?[13578]|1[02])
([-/.]?)31)|(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00)([-/.]?)0?2([-/.]?)29)$

   


  使用反向引用进行简化,年份0001-9999,格式yyyy-MM-dd或yyyy-M-d,连字符可以没有或是“-”、“/”、“.”之一。

^(?:(?!0000)[0-9]{4}([-/.]?)(?:(?:0?[1-9]|1[0-2])\1(?:0?[1-9]|1[0-9]|2[0-8])|(?:0?[13-9]|1[0-2])\1(?:29|30)|
(?:0?[13578]|1[02])\1(?:31))|(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00)([-/.]?)0?2\2(?:29))$


  这就是“年月日”这种形式最全的一个正则了,不同含义部分以不同颜色标识,可以根据自己的需要进行栽剪。

4.2     其它形式扩展

  了解了以上正则各部分代表的含义,互相间的关系后,就很容易扩展成其它格式的日期正则,如dd/MM/yyyy这种“日月年”格式的日期。

^(?:(?:(?:0?[1-9]|1[0-9]|2[0-8])([-/.]?)(?:0?[1-9]|1[0-2])|(?:29|30)([-/.]?)(?:0?[13-9]|1[0-2])|31([-/.]?)
(?:0?[13578]|1[02]))([-/.]?)(?!0000)[0-9]{4}|29([-/.]?)0?2([-/.]?)
(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579][26])00))$

   


  这种格式需要注意的就是不能用反向引用来进行优了。连字符等可根据自己的需求栽剪。

4.3     添加时间的扩展

  时间的规格很明确,也很简单,基本上就HH:mm:ss和H:m:s两种形式。

([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]

合入到日期的正则中,yyyy-MM-dd HH:mm:ss

^(?:(?!0000)[0-9]{4}-(?:(?:0[1-9]|1[0-2])-(?:0[1-9]|1[0-9]|2[0-8])|(?:0[13-9]|1[0-2])-(?:29|30)|
(?:0[13578]|1[02])-31)|(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])|(?:0[48]|[2468][048]|[13579]
[26])00)-02-29)\s+([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]$

  

4.4     年份定制

  以上所有涉及到平年的年份里,使用的是0001-9999。当然,年份也可以根据闰年规则定制。

  如年份1600-9999,格式yyyy-MM-dd或yyyy-M-d,连字符可以没有或是“-”、“/”、“.”之一。

^(?:(?:1[6-9]|[2-9][0-9])[0-9]{2}([-/.]?)(?:(?:0?[1-9]|1[0-2])\1(?:0?[1-9]|1[0-9]|2[0-8])|
(?:0?[13-9]|1[0-2])\1(?:29|30)|(?:0?[13578]|1[02])\1(?:31))|(?:(?:1[6-9]|[2-9][0-9])
(?:0[48]|[2468][048]|[13579][26])|(?:16|[2468][048]|[3579][26])00)([-/.]?)0?2\2(?:29))$

5       特别说明

  以上正则采用的是最基本的正则语法规则,绝大多数采用传统NFA引擎的语言都可以支持,包括JavaScript、Java、.NET等。

  另外需求说明的是,虽然日期的规则相对明确,可以采用这种方式裁剪来得到符合要求的日期正则,但是并不推荐这样使用正则,正则的强大在于它的灵活性,可以根据需求,量身打造最合适的正则,如果只是用来套用模板,那正则也就不称其为正则了。

  正则的语法规则并不多,而且很容易入门,掌握语法规则,量体裁衣,才是正则之“道”。

6       应用

一、首先看需求

  日期的输入:

  手动输入,可输入两种格式yyyymmdd或yyyy-mm-dd

二、解决思路

用户手动输入日期,需要验证输入的日期格式

用户可能的输入情况可以分为以下几种:

(1).输入为空或者为空格

(2).输入非日期格式

  根据保存到数据库中的日期格式,保存的格式为yyyy-mm-dd,所以用户在输入yyyymmdd后需要进行转换,转换成yyyy-mm-dd。

  思路:

  验证日期格式,首现想到的是VS的验证控件,但是因为需要验证的控件有几十个,使用验证控件就需要一个个的拉控件,如果后期需要修改也很麻烦,而通过JS实现控制,再通过正则表达式对日期进行验证。

三、JS实现

//验证日期
function date(id) {
  var idvalue = document.getElementById(id).value;  //通过查找元素
  var tmpStr = "";
  var strReturn = "";
  //调用trim()去掉空格,因为js不支持trim()
  var iIdNo = trim(idvalue);
  //正则表达式,判断日期格式,包括日期的界限,日期的格式,平年和闰年
  var v = idvalue.match(/^((((1[6-9]|[2-9]\d)\d{2})-(0?[13578]|1[02])-(0?[1-9]|[12]\d|3[01]))|(((1[6-9]|[2-9]\d)\d{2})-(0?[13456789]|1[012])-(0?[1-9]|[12]\d|30))|(((1[6-9]|[2-9]\d)\d{2})-0?2-(0?[1-9]|1\d|2[0-8]))|(((1[6-9]|[2-9]\d)(0[48]|[2468][048]|[13579][26])|((16|[2468][048]|[3579][26])00))-0?2-29-))$/);
  //输入为空时跳过检测
  if (iIdNo.length == 0) {
    return false;
  }
  //自动更改日期格式为yyyy-mm-dd
  if (iIdNo.length == 8) {
    tmpStr = iIdNo.substring(0, 8);
    tmpStr = tmpStr.substring(0, 4) + "-" + tmpStr.substring(4, 6) + "-" + tmpStr.substring(6, 8)
    document.getElementById(id).value = tmpStr;
    document.getElementById(id).focus();
  }
  //验证,判断日期格式
  if ((iIdNo.length != 8) && !v) {
    strReturn = "日期格式错误,提示:19990101或1999-01-01";
    alert(strReturn);
    document.getElementById(id).select();
    return false;
  }
}
//运用正则表达式去除字符串两端空格(因为js不支持trim()) 
function trim(str) {
  return str.replace(/(^\s*)|(\s*$)/g, "");
}
//前台调用(获得焦点触发)
<input class="txtenterschooldate" size="14" type="text" id="txtenterschooldate" name="txtenterschooldate" onblur="date(&#39;txtenterschooldate&#39;)"/>

以上内容就是关于日期正则表达式的思路详解,如果大家觉得有用那就赶紧收藏起来吧。

相关推荐:

日期正则表达式详解

关于日期正则表达式解决思路

php date regular expression


The above is the detailed content of Date regular expression summary. For more information, please follow other related articles on the PHP Chinese website!

Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
总结Linux系统中system()函数的用法总结Linux系统中system()函数的用法Feb 23, 2024 pm 06:45 PM

Linux下system()函数的总结在Linux系统中,system()函数是一个非常常用的函数,它可以用于执行命令行命令。本文将对system()函数进行详细的介绍,并提供一些具体的代码示例。一、system()函数的基本用法system()函数的声明如下:intsystem(constchar*command);其中,command参数是一个字符

如何用php正则替换以什么开头的字符串如何用php正则替换以什么开头的字符串Mar 24, 2023 pm 02:57 PM

PHP正则表达式是一种针对文本处理和转换的有力工具。它可以通过解析文本内容,并按照特定的模式进行替换或截取,达到有效管理文本信息的目的。其中,正则表达式的一个常见应用是替换以特定字符开头的字符串,对此,我们进行如下的讲解

如何用 Golang 正则匹配多个单词或字符串?如何用 Golang 正则匹配多个单词或字符串?May 31, 2024 am 10:32 AM

Golang正则表达式使用管道符|来匹配多个单词或字符串,将各个选项作为逻辑OR表达式分隔开来。例如:匹配"fox"或"dog":fox|dog匹配"quick"、"brown"或"lazy":(quick|brown|lazy)匹配"Go"、"Python"或"Java":Go|Python|Java匹配单词或4位邮政编码:([a-zA

php 如何用正则去除中文php 如何用正则去除中文Mar 03, 2023 am 10:12 AM

php用正则去除中文的方法:1、创建一个php示例文件;2、定义一个含有中文和英文的字符串;3、通过“preg_replace('/([\x80-\xff]*)/i','',$a);”正则方法去除查询结果中的中文字符即可。

php怎么利用正则匹配去掉html标签php怎么利用正则匹配去掉html标签Mar 21, 2023 pm 05:17 PM

在本文中,我们将学习如何使用PHP正则表达式删除HTML标签,并从HTML字符串中提取纯文本内容。 为了演示如何去掉HTML标记,让我们首先定义一个包含HTML标签的字符串。

如何解决Python的表达式语法错误?如何解决Python的表达式语法错误?Jun 24, 2023 pm 05:04 PM

Python作为一种高级编程语言,易于学习和使用。一旦需要编写Python程序时,无法避免地遇到语法错误,表达式语法错误是常见的一种。在本文中,我们将讨论如何解决Python的表达式语法错误。表达式语法错误是Python中最常见的错误之一,它通常是由于错误的使用语法或缺少必要组件而导致的。在Python中,表达式通常由数字、字符串、变量和运算符组成。最常见的

使用PHP正则实现中文替换功能的技巧分享使用PHP正则实现中文替换功能的技巧分享Mar 24, 2024 pm 05:57 PM

使用PHP正则实现中文替换功能的技巧分享在web开发中,经常会遇到需要对中文内容进行替换的情况。PHP作为一种流行的服务器端脚本语言,提供了强大的正则表达式功能,可以很方便地实现中文替换。本文将分享一些在PHP中使用正则实现中文替换的技巧,同时提供具体的代码示例。1.使用preg_replace函数实现中文替换PHP中的preg_replace函数可以用来

在C和C++中,逗号(comma)的用法是用来分隔表达式或语句在C和C++中,逗号(comma)的用法是用来分隔表达式或语句Sep 09, 2023 pm 05:33 PM

在C或C++中,逗号“,”有不同的用途。在这里我们将了解如何使用它们。逗号作为运算符。逗号运算符是一个二元运算符,它计算第一个操作数,然后丢弃结果,然后计算第二个操作数并返回值。逗号运算符在C或C++中的优先级最低。示例#include<stdio.h>intmain(){&nbsp;&nbsp;intx=(50,60);&nbsp;&nbsp;inty=(func1(),func2());}这里60将被分配给x。对于下一条语句,将首先执行func1(

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
2 weeks agoBy尊渡假赌尊渡假赌尊渡假赌
Repo: How To Revive Teammates
1 months agoBy尊渡假赌尊渡假赌尊渡假赌
Hello Kitty Island Adventure: How To Get Giant Seeds
4 weeks agoBy尊渡假赌尊渡假赌尊渡假赌

Hot Tools

mPDF

mPDF

mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

SublimeText3 Linux new version

SublimeText3 Linux new version

SublimeText3 Linux latest version

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

PhpStorm Mac version

PhpStorm Mac version

The latest (2018.2.1) professional PHP integrated development tool

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools