How to Extract href Attribute Values from Anchor Links Using Regular Expressions?

Barbara Streisand
2025-01-10 10:39:41

Use a regular expression to extract the href attribute value of an anchor link

To extract the href attribute value from an HTML anchor link, you can use a regular expression that captures the quoted value.

The regex pattern "@(<a.>?>.?)" you provided identifies anchor links, but it does not capture the href value. To do that, you need a more specific pattern:

<code><a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1</code>

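This pattern breaks down as follows:

  • <a matches the start of the anchor tag.
  • \s+(?:[^>]*?\s+)? matches the whitespace after the tag name, plus any optional attributes that come before href (a non-capturing group).
  • href= matches the href attribute name and the equals sign.
  • (["'])(.*?)\1 captures the opening quote in group 1 and the href value in group 2. The backreference \1 requires the closing quote to be the same character as the opening one, so the value can be enclosed in either double or single quotes.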
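The breakdown above can be checked with a small sketch. The helper name extractHref and the sample strings are made up for illustration; only the pattern comes from the text. It shows that group 2 holds the value for both quote styles, and that an anchor without an href yields no match.

```javascript
// Pattern from the text: group 1 is the quote character, group 2 is the href value.
const hrefPattern = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/;

// Hypothetical helper: returns the href value, or null when the tag has no href.
function extractHref(html) {
  const match = hrefPattern.exec(html);
  return match ? match[2] : null;
}

// Illustrative inputs (not from the original question).
const samples = [
  '<a href="page.php?id=1">one</a>',             // double quotes, href first
  "<a class='nav' href='page.php?id=2'>two</a>", // single quotes, attribute before href
  '<a name="top">anchor only</a>',               // no href attribute
];

const results = samples.map(extractHref);
console.log(results); // [ 'page.php?id=1', 'page.php?id=2', null ]
```

Note that the pattern only handles quoted values. An unquoted href (for example href=page.php) would not match.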

Filter valid URLs

To filter out invalid URLs (URLs with neither "?" nor "=" characters) and keep only those of the form page.php?id=..., you can use the following regular expression:

<code>page\.php\?id\=.*</code>

This pattern matches any string that contains page.php?id= followed by any characters, so URLs without that query string are rejected.
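As a rough sketch of how that filter might be applied, the snippet below runs the pattern over a small array. The variable names and the sample URLs are assumptions for illustration; only the pattern is taken from the text.

```javascript
// Filter pattern from the text (the escaped "=" is harmless without the "u" flag).
const validUrlPattern = /page\.php\?id\=.*/;

// Illustrative URLs (assumed, not from the original question).
const urls = [
  'www.example.com/page.php?id=xxxx&name=yyyy',
  'www.example.com/about.html',
  'www.example.com/page.php',
];

// No "g" flag on the pattern, so test() keeps no state between calls.
const validUrls = urls.filter((url) => validUrlPattern.test(url));
console.log(validUrls); // [ 'www.example.com/page.php?id=xxxx&name=yyyy' ]
```

Because the pattern is not anchored, it accepts page.php?id= anywhere in the string. Adding ^ or $ would tighten it if the URLs have a known shape.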

Extract href values from a list of links

You have stated that you no longer need to parse anchor tags, and that you now have a list of links in the format href="abcdef". To extract the href value from each item in this list, you can use:

<code>href=(['"])(.*?)\1</code>

This pattern captures the href value whether it is enclosed in double or single quotes.
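A minimal sketch of this step, assuming the list is already available as an array of strings. The names links and hrefValues and the sample entries are hypothetical.

```javascript
// Shorter pattern for bare href="..." strings: group 2 is the value.
const hrefOnlyPattern = /href=(['"])(.*?)\1/;

// Illustrative list (assumed); the last entry has no href and is skipped.
const links = ['href="abcdef"', "href='page.php?id=7'", 'title="not a link"'];

const hrefValues = links
  .map((link) => hrefOnlyPattern.exec(link)) // match array, or null on no match
  .filter((match) => match !== null)
  .map((match) => match[2]);

console.log(hrefValues); // [ 'abcdef', 'page.php?id=7' ]
```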

JavaScript code snippet

To demonstrate how to use these regular expression patterns in JavaScript, here is a code snippet:

<code class="language-javascript">// Group 1 is the quote character; group 2 is the href value.
// The trailing \1 makes the lazy (.*?) run up to the matching closing quote.
const pattern = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/;
const linkText = '<a href="www.example.com/page.php?id=xxxx&name=yyyy"></a>';
const match = pattern.exec(linkText);
if (match) {
  console.log(match[2]); // Output: www.example.com/page.php?id=xxxx&name=yyyy
}</code>
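The snippet above handles a single anchor. For markup that contains several anchors, one possible approach is to add the g flag and use String.prototype.matchAll. This is a sketch rather than a robust HTML parser, and the html string and variable names are assumptions for illustration.

```javascript
// Same pattern as above, with the global flag so matchAll can find every anchor.
const globalPattern = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/g;

// Illustrative markup (assumed) with two anchors and mixed quote styles.
const html =
  '<p><a href="a.html">A</a> and <a id="x" href=\'page.php?id=3\'>B</a></p>';

// matchAll yields one match array per anchor; keep capture group 2 from each.
const hrefs = Array.from(html.matchAll(globalPattern), (m) => m[2]);
console.log(hrefs); // [ 'a.html', 'page.php?id=3' ]
```

Regular expressions cope with simple, well-formed markup like this. For arbitrary HTML, a DOM parser is generally the more reliable choice.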
