Select/extract nodes of html/text from HTML using CSS selectors

Let's say I'm writing a Node script that uses fetch to retrieve a page's HTML content into a variable.

Now I have a CSS selector for some element on that page. How can I use it to extract the HTML and/or text content that the selector matches?

If there are existing tools/packages that I can leverage, please give a two-part answer:

  1. Based on pure CSS selector
  2. jQuery-based tools

Asked by P粉356361722, 244 days ago

All replies (1)

  • P粉403549616, 2024-03-20 10:58:17

    To extract HTML or text content matched by a CSS selector in Node.js, you can use packages such as Cheerio, jsdom, or Puppeteer. Below is one example for each part of the question: a CSS-selector-based tool and a jQuery-based one.

    Based on pure CSS selectors: Cheerio is a fast and flexible package that parses HTML and lets you query it with CSS selectors. Its API is modeled on jQuery, but it does not need a browser. Here's how to use Cheerio to extract content via a CSS selector:

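    const cheerio = require('cheerio');

    // Sample markup (element name assumed; only the "content" class is implied by the selector).
    const html = '<div class="content">Hello World!</div>';

    const $ = cheerio.load(html);          // parse the HTML
    const content = $('.content').text();  // text of the matched element
    console.log(content); // Output: Hello World!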

    jQuery-based tools: If you prefer real jQuery, you can run the jquery package on top of a DOM provided by jsdom, since jQuery needs a window object to work with. Here's an example:

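    const jsdom = require('jsdom');
    const { JSDOM } = jsdom;

    // Sample markup (element name assumed; only the "content" class is implied by the selector).
    const html = '<div class="content">Hello World!</div>';

    const dom = new JSDOM(html);              // build a DOM from the HTML
    const $ = require('jquery')(dom.window);  // bind jQuery to that window
    const content = $('.content').text();
    console.log(content); // Output: Hello World!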

    In both examples, we first parse the HTML with a package (Cheerio or jsdom), then select the elements we want with a CSS selector, and finally read the text of the selection with the text() method. To get the markup instead of the text, call html() on the same selection.
