How to Match URLs with or Without Protocol and Domain Prefixes?

Barbara Streisand
2024-10-22 08:47:02

Matching URLs with or Without Protocol and Domain Prefixes

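When working with URLs, it's often necessary to match them whether or not they include a scheme such as http:// or https://, and whether or not the host starts with "www.". The regular expression below does this by making the scheme optional and treating "www" as an ordinary part of the host name. It is built up piece by piece: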

<code class="php">// Single-quoted strings keep "$_" and backslashes literal (no variable interpolation).
$regex  = '((https?|ftp)://)?'; // Scheme (optional)
$regex .= '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?'; // User and pass (optional)
$regex .= '([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))'; // Host or IP address
$regex .= '(:[0-9]{2,5})?'; // Port (optional)
$regex .= '(/([a-z0-9+$_%-]\.?)+)*/?'; // Path (optional)
$regex .= '(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?'; // GET query (optional)
$regex .= '(#[a-z_.-][a-z0-9+$%_.-]*)?'; // Anchor (optional)</code>

Explanation:

  • Scheme (Optional): Matches "http", "https", or "ftp" followed by "://" at the beginning of the URL.
  • User and Pass (Optional): Matches a username, optionally followed by a colon and a password, and terminated by "@".
  • Host or IP address: Matches a host name made of letters, digits, hyphens, and dots, followed by a dot and either a top-level domain of 2 to 4 letters or the remaining three groups of a dotted IPv4 address.
  • Port (Optional): Matches a colon followed by a port number of 2 to 5 digits.
  • Path (Optional): Matches zero or more path segments (directories and filenames), each introduced by a slash, with an optional trailing slash.
  • GET Query (Optional): Matches a question mark followed by the query parameters.
  • Anchor (Optional): Matches a hash followed by the fragment identifier.
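To make the breakdown above easier to inspect, the same components can be written with named capture groups so that each part is available by name after a match. This is only a sketch: the function name `splitUrl` and the group names are assumptions made for illustration, not part of the original pattern.

```php
<?php
// Hypothetical helper: split a URL into the components described above.
// Returns null when the URL does not match the pattern at all.
function splitUrl(string $url): ?array
{
    $pattern = '~^'
        . '(?:(?<scheme>https?|ftp)://)?'
        . '(?:(?<user>[a-z0-9+!*(),;?&=$_.-]+)(?::(?<pass>[a-z0-9+!*(),;?&=$_.-]+))?@)?'
        . '(?<host>[a-z0-9.-]*\.(?:[a-z]{2,4}|[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}))'
        . '(?::(?<port>[0-9]{2,5}))?'
        . '(?<path>(?:/(?:[a-z0-9+$_%-]\.?)+)*/?)'
        . '(?:\?(?<query>[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*))?'
        . '(?:#(?<fragment>[a-z_.-][a-z0-9+$%_.-]*))?'
        . '$~i';

    if (!preg_match($pattern, $url, $m)) {
        return null;
    }

    // preg_match() drops trailing groups that did not take part in the match,
    // so default every component to an empty string.
    $parts = [];
    foreach (['scheme', 'user', 'pass', 'host', 'port', 'path', 'query', 'fragment'] as $name) {
        $parts[$name] = $m[$name] ?? '';
    }
    return $parts;
}

$parts = splitUrl('user:secret@www.example.com:8080/docs/index.html?id=5#top');
print_r($parts);
```

Here the scheme is missing from the input, so `$parts['scheme']` is an empty string while the host, port, path, query, and fragment are still filled in.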

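To test a URL against this regular expression, wrap it in delimiters and anchors:

<code class="php">preg_match('~^' . $regex . '$~i', $url, $m);</code>

The ^ and $ anchors require the entire string to match the pattern, the i modifier makes the match case-insensitive, and $m receives the captured parts. Every component except the host is optional, so URLs with or without a scheme, credentials, port, path, query, or anchor are accepted. Note the limits of the pattern: the host must contain at least one dot, and a top-level domain must be 2 to 4 letters long, so it is a practical filter rather than a full URL validator.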
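As a quick check of the behaviour described here, the following sketch runs the pattern against a few inputs with and without a scheme or "www" prefix. The function name `matchesUrl` and the sample URLs are assumptions chosen for illustration.

```php
<?php
// Hypothetical wrapper around the article's pattern; returns true on a full match.
function matchesUrl(string $url): bool
{
    $regex  = '((https?|ftp)://)?';                                      // Scheme
    $regex .= '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?';  // User and pass
    $regex .= '([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))'; // Host or IP
    $regex .= '(:[0-9]{2,5})?';                                          // Port
    $regex .= '(/([a-z0-9+$_%-]\.?)+)*/?';                               // Path
    $regex .= '(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?';                   // Query
    $regex .= '(#[a-z_.-][a-z0-9+$%_.-]*)?';                             // Anchor

    return preg_match('~^' . $regex . '$~i', $url) === 1;
}

$samples = [
    'https://www.example.com/path?x=1',
    'www.example.com',
    'example.com',
    'ftp://user:pw@example.com:21/file.txt',
    'localhost',   // no dot in the host, so the pattern rejects it
    'not a url',
];

foreach ($samples as $sample) {
    printf("%-40s %s\n", $sample, matchesUrl($sample) ? 'match' : 'no match');
}
```

The first four samples match and the last two do not, which shows both the optional scheme and the requirement that the host contain a dot.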
