How to Extract YouTube Video IDs from Text Using Regular Expressions?

Mary-Kate Olsen
2024-12-16 15:49:18

Problem:

Given a text field where users can enter text, the task is to extract all YouTube video URLs and their corresponding IDs.

Solution using Regular Expressions:

To extract YouTube video IDs from a given string, you can use a regular expression that matches the common YouTube URL formats. The sample pattern below should be applied with the case-insensitive flag, because its subdomain class [0-9A-Z-] lists only uppercase letters:

https?://(?:[0-9A-Z-]+\.)?(?:youtu\.be/|youtube(?:-nocookie)?\.com\S*?[^\w\s-])([\w-]{11})(?=[^\w-]|$)(?![?=&+%\w.-]*(?:['"][^<>]*>|</a>))[?=&+%\w.-]*

Regex Breakdown:

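  • https?://: Matches the scheme, either http:// or https://.
  • (?:[0-9A-Z-]+\.)?: Matches an optional subdomain such as "www." (lowercase letters match only when the case-insensitive flag is set).
  • (?:youtu\.be/|youtube(?:-nocookie)?\.com\S*?[^\w\s-]): Matches the host. This is either "youtu.be/", or "youtube.com" / "youtube-nocookie.com" followed by as few non-whitespace characters as possible (\S*?) and then one character that cannot be part of an ID ([^\w\s-]), such as "/" or "=".
  • ([\w-]{11}) (Capture Group 1): Captures the YouTube video ID, which is exactly 11 characters long, each a letter, digit, underscore, or hyphen.
  • (?=[^\w-]|$): Positive lookahead that succeeds only if the ID is followed by a character that cannot be part of an ID, or by the end of the string.
  • (?![?=&+%\w.-]*(?:['"][^<>]*>|</a>)): Negative lookahead that rejects URLs that are already linked, meaning the rest of the URL runs into a quote that closes an HTML attribute or into a closing </a> tag.
  • [?=&+%\w.-]*: Consumes whatever remains of the URL, such as extra query parameters.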
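To make the breakdown above easier to follow, the pattern can be assembled from one commented string per piece. This is only a sketch: the names YOUTUBE_URL_PARTS and buildYouTubeRegex are made up for illustration, and the g and i flags are an assumption about how the pattern is meant to be used.

```javascript
// One entry per piece of the pattern. String.raw keeps backslashes literal,
// so "\." reaches the RegExp constructor unchanged.
const YOUTUBE_URL_PARTS = [
  String.raw`https?://`,                 // scheme: http or https
  String.raw`(?:[0-9A-Z-]+\.)?`,         // optional subdomain such as "www."
  String.raw`(?:youtu\.be/|youtube(?:-nocookie)?\.com\S*?[^\w\s-])`, // host, up to a non-ID character
  String.raw`([\w-]{11})`,               // group 1: the 11-character video ID
  String.raw`(?=[^\w-]|$)`,              // the ID must end here
  String.raw`(?![?=&+%\w.-]*(?:['"][^<>]*>|</a>))`, // reject URLs already inside a tag or <a> text
  String.raw`[?=&+%\w.-]*`,              // consume the rest of the URL
];

// Hypothetical helper: joins the pieces into one global, case-insensitive regex.
function buildYouTubeRegex() {
  return new RegExp(YOUTUBE_URL_PARTS.join(''), 'gi');
}

const sample =
  'Short: https://youtu.be/dQw4w9WgXcQ and long: ' +
  'http://www.youtube.com/watch?v=dQw4w9WgXcQ&t=10';

// Print the captured ID next to the full URL it came from.
for (const match of sample.matchAll(buildYouTubeRegex())) {
  console.log(match[1], '<-', match[0]);
}
```

Because the regex is built from a string rather than written as a literal, the forward slashes do not need escaping here.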

Usage:

You can use this regex in any programming language whose regex engine supports lookahead assertions. For example, in JavaScript, you can use the following code to extract YouTube video IDs:

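function extractYouTubeIds(text) {
  // Forward slashes are escaped because "/" delimits a regex literal.
  // g: find every URL; i: let [0-9A-Z-] match lowercase subdomains such as "www".
  const regex = /https?:\/\/(?:[0-9A-Z-]+\.)?(?:youtu\.be\/|youtube(?:-nocookie)?\.com\S*?[^\w\s-])([\w-]{11})(?=[^\w-]|$)(?![?=&+%\w.-]*(?:['"][^<>]*>|<\/a>))[?=&+%\w.-]*/gi;
  // matchAll yields one match per URL; index 1 holds the captured video ID.
  return Array.from(text.matchAll(regex), match => match[1]);
}

Note that the video ID is read from capture group 1 rather than sliced out of the matched URL. The ID sits at a different offset in each URL format (compare "https://youtu.be/ID" with "https://www.youtube.com/watch?v=ID"), so a fixed slice position would not work.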
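Since the problem statement asks for the URLs as well as their IDs, a variant that returns both may be useful. The following is a minimal sketch under the same assumptions (g and i flags); findYouTubeLinks and uniqueYouTubeIds are hypothetical helper names, not part of any library.

```javascript
// Same pattern as above, written as a regex literal with "/" escaped.
const YOUTUBE_URL = /https?:\/\/(?:[0-9A-Z-]+\.)?(?:youtu\.be\/|youtube(?:-nocookie)?\.com\S*?[^\w\s-])([\w-]{11})(?=[^\w-]|$)(?![?=&+%\w.-]*(?:['"][^<>]*>|<\/a>))[?=&+%\w.-]*/gi;

// Hypothetical helper: one { url, id, index } record per URL found in the text.
function findYouTubeLinks(text) {
  return Array.from(text.matchAll(YOUTUBE_URL), (match) => ({
    url: match[0],      // the full matched URL
    id: match[1],       // capture group 1: the video ID
    index: match.index, // where the URL starts in the text
  }));
}

// Hypothetical helper: unique IDs in order of first appearance.
function uniqueYouTubeIds(text) {
  return [...new Set(findYouTubeLinks(text).map((link) => link.id))];
}

const text =
  'First https://youtu.be/dQw4w9WgXcQ then ' +
  'https://www.youtube-nocookie.com/embed/dQw4w9WgXcQ?rel=0';

console.log(findYouTubeLinks(text)); // two records, one per URL
console.log(uniqueYouTubeIds(text)); // the shared ID appears once
```

Keeping the start index alongside each URL makes it possible to replace or highlight the links in the original text later.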
