
How to Match URLs Using Regular Expressions?

Patricia Arquette
2024-10-22 08:45:03

Matching URLs with Regular Expressions

Regular expressions can be daunting at first, but they offer powerful pattern matching for many kinds of text. URLs come in several forms (with or without a scheme, credentials, port, query string, or anchor), so a pattern for matching them has to be flexible enough to accommodate those variations.

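One pattern that matches URLs with or without a leading scheme (for example, "http://www.example.com" or "www.example.com") is shown below, one component per line. The // annotations describe each piece and are not part of the expression: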

((https?|ftp)://)? // Optional SCHEME
([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)? // Optional User and Pass
([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3}))) // Host or IP address
(:[0-9]{2,5})? // Optional Port
(/([a-z0-9+$_%-]\.?)+)*/? // Path
(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)? // Optional GET Query
(#[a-z_.-][a-z0-9+$%_.-]*)? // Optional Anchor
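As a sketch of how the annotated pattern above could be assembled in PHP, the pieces can be concatenated as single-quoted strings, so that `$` is never treated as the start of a variable. The name `$regex`, the `~` delimiter, and the `i` modifier follow the examples in this article; `$pattern`, `$candidate`, and the sample inputs are illustrative assumptions.

```php
<?php
// Assemble the pattern from single-quoted pieces so "$" stays literal.
$regex  = '((https?|ftp)://)?';                                     // Optional scheme
$regex .= '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?'; // Optional user and password
$regex .= '([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))'; // Host or IP address
$regex .= '(:[0-9]{2,5})?';                                         // Optional port
$regex .= '(/([a-z0-9+$_%-]\.?)+)*/?';                              // Path
$regex .= '(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?';                  // Optional query string
$regex .= '(#[a-z_.-][a-z0-9+$%_.-]*)?';                            // Optional anchor

// Wrap in "~" delimiters, anchor both ends, and match case-insensitively.
$pattern = '~^' . $regex . '$~i';

// Illustrative inputs only.
foreach (['www.example.com/etcetc', 'http://www.example.com/etcetc', 'not a url'] as $candidate) {
    echo $candidate, ' => ', preg_match($pattern, $candidate) === 1 ? 'match' : 'no match', PHP_EOL;
}
```

Using `~` as the delimiter means the `/` characters in the scheme and path do not need to be escaped.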

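To use this expression in PHP, assign the pattern (without the // annotations) to a string such as $regex, wrap it in delimiters, and pass it to the preg_match function along with the URL you want to evaluate. Define the pattern in single quotes, or escape every $ if you use double quotes, because PHP would otherwise try to interpolate $_ as a variable. For example: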

<code class="php">// $regex is assumed to hold the pattern above as a PHP string, without the // annotations.
$url = 'www.example.com/etcetc';
if (preg_match("~^$regex$~i", $url)) {
    echo 'Matched URL without protocol';
}</code>

Similarly, for URLs with protocols:

<code class="php">// Same $regex as before; only the input changes.
$url = 'http://www.example.com/etcetc';
if (preg_match("~^$regex$~i", $url)) {
    echo 'Matched URL with protocol';
}</code>

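This pattern covers many common URL formats, although it only accepts top-level domains of two to four letters. Because it is anchored with ^ and $, the entire string has to match, so input containing characters outside the allowed sets, such as spaces, quotes, or angle brackets, is rejected (a single trailing newline is still tolerated unless the D modifier is added).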
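To see which inputs the anchored pattern accepts or rejects, a small check along these lines may help. The helper name `matchesUrl`, its `$modifiers` parameter, and the sample strings are hypothetical. The `D` modifier is an optional addition that is not part of the original pattern: it stops `$` from matching before a trailing newline.

```php
<?php
// Hypothetical helper; the pattern pieces are the ones listed in this article.
function matchesUrl(string $candidate, string $modifiers = 'i'): bool
{
    $regex = implode('', [
        '((https?|ftp)://)?',                                     // Optional scheme
        '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?', // Optional user and password
        '([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))', // Host or IP address
        '(:[0-9]{2,5})?',                                         // Optional port
        '(/([a-z0-9+$_%-]\.?)+)*/?',                              // Path
        '(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?',                  // Optional query string
        '(#[a-z_.-][a-z0-9+$%_.-]*)?',                            // Optional anchor
    ]);

    // Anchors force the whole candidate to fit the pattern.
    return preg_match('~^' . $regex . '$~' . $modifiers, $candidate) === 1;
}

var_dump(matchesUrl('ftp://user:secret@example.org:2121/pub/file.txt')); // bool(true)
var_dump(matchesUrl('www.example.com/a b'));                             // bool(false): space is not allowed
var_dump(matchesUrl('www.example.com/"><script>'));                      // bool(false): quotes and angle brackets
var_dump(matchesUrl("www.example.com\n"));                               // bool(true): "$" allows a final newline
var_dump(matchesUrl("www.example.com\n", 'iD'));                         // bool(false): "D" makes "$" strict
```

A match only says that the string has the expected shape; it does not say that the host exists or that the URL is safe to fetch.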
