Regular expressions in PHP (2)
Specifying repetition
By now you know how to match a single letter or digit, but more often you will want to match a word or a group of digits. A word consists of several letters, and a group of digits consists of several single digits. Curly braces ({}) following a character or character cluster specify how many times the preceding content is repeated.
Character cluster     Meaning
^[a-zA-Z_]$           A single letter or underscore
^[[:alpha:]]{3}$      Any 3-letter word
^a$                   The letter a
^a{4}$                aaaa
^a{2,4}$              aa, aaa or aaaa
^a{1,3}$              a, aa or aaa
^a{2,}$               A string consisting of two or more a's
^a{2,}                A string starting with two or more a's, such as aardvark and aaab, but not apple
a{2,}                 A string containing two or more consecutive a's, such as baad and aaa, but not Nantucket
\t{2}                 Two tab characters
.{2}                  Any two characters
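A few rows of the table can be checked with a short script. This is only a sketch: it assumes the patterns are run through preg_match with "/" delimiters (the text does not name a matching function at this point), and matchesPattern is a hypothetical helper introduced for the illustration.

```php
<?php
// Hypothetical helper: true when $subject matches $pattern.
// preg_match returns 1 on a match, 0 on no match.
function matchesPattern(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

// [pattern, subject] pairs taken from the table.
$rows = [
    ['/^a{2,4}$/',         'aaa'],
    ['/^a{2,4}$/',         'aaaaa'],
    ['/^a{2,}$/',          'aaaaa'],
    ['/^a{2,}/',           'aardvark'],
    ['/^a{2,}/',           'apple'],
    ['/a{2,}/',            'baad'],
    ['/a{2,}/',            'Nantucket'],
    ['/^[[:alpha:]]{3}$/', 'cat'],
    ['/\t{2}/',            "a\t\tb"],
];

foreach ($rows as [$pattern, $subject]) {
    // json_encode keeps the tab characters visible as \t in the output.
    printf(
        "%-22s %-12s %s\n",
        $pattern,
        json_encode($subject),
        matchesPattern($pattern, $subject) ? 'match' : 'no match'
    );
}
```

Comparing the three a{2,} rows shows what the anchors do: with both ^ and $ the whole string must be a's, with only ^ the string merely has to start with them, and with no anchor the run of a's may appear anywhere.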
These examples show three different uses of curly braces. A single number, {x}, means "the preceding character or character cluster appears exactly x times"; a number followed by a comma, {x,}, means "the preceding content appears x or more times"; two comma-separated numbers, {x,y}, means "the preceding content appears at least x times, but no more than y times". We can extend these patterns to whole words and numbers:
^[a-zA-Z0-9_]{1,}$ //All strings consisting of one or more letters, digits or underscores
^[0-9]{1,}$ //All non-negative integers
^-{0,1}[0-9]{1,}$ //All integers
^-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$ //All decimals
The last example is not easy to read, is it? Look at it this way: the string starts (^) with an optional minus sign (-{0,1}), followed by zero or more digits ([0-9]{0,}), an optional decimal point (\.{0,1}), zero or more further digits ([0-9]{0,}), and nothing else ($). The decimal point is escaped with a backslash because an unescaped dot matches any character. Note that since every part is optional, this pattern also accepts the empty string, a lone "-" and a lone ".". Below you will learn about simpler ways to write it.
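To see what the decimal pattern accepts in practice, it can be tried against a handful of strings. This is a minimal sketch, assuming preg_match with "/" delimiters; looksLikeDecimal and acceptsWithBareDot are hypothetical helper names, and the second one exists only to show what happens when the dot is left unescaped.

```php
<?php
// Hypothetical helper: the "decimal" pattern with the dot escaped.
function looksLikeDecimal(string $s): bool
{
    return preg_match('/^-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$/', $s) === 1;
}

// Same pattern with an unescaped dot, which matches any single character.
function acceptsWithBareDot(string $s): bool
{
    return preg_match('/^-{0,1}[0-9]{0,}.{0,1}[0-9]{0,}$/', $s) === 1;
}

$samples = ['3.14', '-0.5', '42', '.5', '7.', '1.2.3', 'abc', '1x2', '', '-'];

foreach ($samples as $s) {
    printf(
        "%-8s escaped dot: %-8s bare dot: %s\n",
        json_encode($s),
        looksLikeDecimal($s) ? 'match' : 'no match',
        acceptsWithBareDot($s) ? 'match' : 'no match'
    );
}
```

The string "1x2" is rejected by the escaped version but accepted by the bare-dot version, which is why the backslash matters. Both versions accept "" and "-", so a stricter check would need to require at least one digit.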
The special character "?" is equivalent to {0,1}; both mean "zero or one of the preceding content", that is, "the preceding content is optional". So the decimal example can be simplified to:
^-?[0-9]{0,}\.?[0-9]{0,}$
The special character "*" is equivalent to {0,}; both mean "zero or more of the preceding content". Finally, the character "+" is equivalent to {1,}, which means "one or more of the preceding content", so the four examples above can be written as:
^[a-zA-Z0-9_]+$ //All strings consisting of one or more letters, digits or underscores
^[0-9]+$ //All non-negative integers
^-?[0-9]+$ //All integers
^-?[0-9]*\.?[0-9]*$ //All decimals
Of course this doesn't technically reduce the complexity of regular expressions, but it makes them easier to read.
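The equivalence between the brace forms and the shorthand characters can be spot-checked by running both versions of each pattern over the same inputs. This is a sketch under assumptions: it uses preg_match with "/" delimiters, writes the decimal point as \. in both forms, and formsAgree is a hypothetical helper. Agreement on a small sample illustrates the equivalence but does not prove it.

```php
<?php
// Hypothetical helper: true when both patterns give the same
// match/no-match result for every subject in the list.
function formsAgree(string $braces, string $shorthand, array $subjects): bool
{
    foreach ($subjects as $s) {
        if (preg_match($braces, $s) !== preg_match($shorthand, $s)) {
            return false;
        }
    }
    return true;
}

// [brace form, shorthand form] for the four examples.
$pairs = [
    ['/^[a-zA-Z0-9_]{1,}$/',                '/^[a-zA-Z0-9_]+$/'],
    ['/^[0-9]{1,}$/',                       '/^[0-9]+$/'],
    ['/^-{0,1}[0-9]{1,}$/',                 '/^-?[0-9]+$/'],
    ['/^-{0,1}[0-9]{0,}\.{0,1}[0-9]{0,}$/', '/^-?[0-9]*\.?[0-9]*$/'],
];

$subjects = ['', 'abc_1', '42', '-42', '3.14', '-.5', '1.2.3', 'a b'];

foreach ($pairs as [$braces, $shorthand]) {
    echo $shorthand, ' ', formsAgree($braces, $shorthand, $subjects) ? 'agrees' : 'differs', "\n";
}
```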