How to Detect URLs of Varying Formats Using Regular Expressions?

Susan Sarandon
2024-10-22 08:45:30

Detecting URLs with Varying Formats Using Regular Expressions

Regular expressions are a practical way to detect and extract URLs from text. A single pattern can handle URLs both with and without a leading scheme such as "http://", provided each optional part of the URL is marked as optional in the pattern.

The following expression matches URLs with or without a scheme ("http://", "https://" or "ftp://"). A "www." prefix needs no special handling, because it is simply part of the host name:

((https?|ftp)://)?([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))(:[0-9]{2,5})?(/([a-z0-9+$_%-]\.?)+)*/?(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?(#[a-z_.-][a-z0-9+$%_.-]*)?

This expression incorporates the following components:

  • Scheme: "((https?|ftp)://)?"
  • User and Password: "([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?"
  • Host or IP Address: "([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))"
  • Port: "(:[0-9]{2,5})?"
  • Path: "(/([a-z0-9+$_%-]\.?)+)*/?"
  • GET Query: "(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?"
  • Anchor: "(#[a-z_.-][a-z0-9+$%_.-]*)?"
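
The component breakdown above can be mirrored directly in code. The following is a minimal PHP sketch, not necessarily how the original author built the pattern: the $parts array and its keys are illustrative names chosen here. Single-quoted strings are used so that PHP does not try to interpolate the "$_" sequences inside the character classes.

```php
<?php
// Assemble the URL pattern from its components (array keys are illustrative).
// Single quotes keep "$_" and the backslashes literal.
$parts = [
    'scheme'   => '((https?|ftp)://)?',
    'userinfo' => '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?',
    'host'     => '([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))',
    'port'     => '(:[0-9]{2,5})?',
    'path'     => '(/([a-z0-9+$_%-]\.?)+)*/?',
    'query'    => '(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?',
    'anchor'   => '(#[a-z_.-][a-z0-9+$%_.-]*)?',
];

// Concatenate in order; the result is the single-line expression shown earlier.
$regex = implode('', $parts);

// "~" is used as the delimiter because the pattern itself contains "/" and "#".
var_dump(preg_match('~^' . $regex . '$~i', 'www.example.com/etcetc') === 1); // bool(true)
```

Keeping the parts separate makes it easier to adjust one component, for example the allowed schemes, without re-reading the whole expression.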

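To use this expression in PHP, store it in a single-quoted string named $regex (in a double-quoted string PHP would try to interpolate the "$_" sequences inside the character classes). Wrapping it in "^" and "$" requires the whole input to be a URL, and the "i" flag makes the match case-insensitive:

// $regex holds the pattern shown above, assigned from a single-quoted string.
if (preg_match("~^$regex$~i", 'www.example.com/etcetc', $m)) {
    var_dump($m);
}

if (preg_match("~^$regex$~i", 'http://www.example.com/etcetc', $m)) {
    var_dump($m);
}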

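Both calls return 1 and fill $m with the captured groups, because the scheme group is optional and "www." is matched as part of the host. Note that the pattern only accepts top-level domains of two to four letters, so a host such as "example.museum" is rejected.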
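
To go from detection to extraction, the numbered groups in $m can be mapped to named parts. The sketch below is illustrative only: the function name extractUrlParts and the returned keys are hypothetical, and the group numbers were counted from the pattern above (2 = scheme, 5 and 6 = host, 11 = port, 14 = query, 15 = anchor).

```php
<?php
// Hypothetical helper: returns selected URL parts, or null if the input does not match.
function extractUrlParts(string $candidate): ?array
{
    // Same pattern as above, kept in a single-quoted string.
    $regex = '((https?|ftp)://)?'
        . '([a-z0-9+!*(),;?&=$_.-]+(:[a-z0-9+!*(),;?&=$_.-]+)?@)?'
        . '([a-z0-9\-\.]*)\.(([a-z]{2,4})|([0-9]{1,3}\.([0-9]{1,3})\.([0-9]{1,3})))'
        . '(:[0-9]{2,5})?'
        . '(/([a-z0-9+$_%-]\.?)+)*/?'
        . '(\?[a-z+&$_.-][a-z0-9;:@&%=+/$_.-]*)?'
        . '(#[a-z_.-][a-z0-9+$%_.-]*)?';

    if (preg_match('~^' . $regex . '$~i', $candidate, $m) !== 1) {
        return null;
    }

    // preg_match() drops trailing groups that did not participate in the match,
    // so optional groups are read with "??" and default to an empty string.
    return [
        'scheme' => $m[2] ?? '',
        'host'   => $m[5] . '.' . $m[6],          // host label(s) plus TLD or remaining IP octets
        'port'   => ltrim($m[11] ?? '', ':'),
        'query'  => ltrim($m[14] ?? '', '?'),
        'anchor' => ltrim($m[15] ?? '', '#'),
    ];
}

print_r(extractUrlParts('http://www.example.com:8080/etcetc?x=1#top'));
```

The path is left out on purpose: its group sits inside a repetition, so it would only hold the last path segment rather than the full path.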
