How Can I Scrape Dynamic JavaScript-Generated Data from a Website?

Mary-Kate Olsen
2024-12-03 17:33:11

How to Retrieve Data Generated by JavaScript from a Web Page

Web scraping is harder when page content is generated dynamically by JavaScript. One example is http://vtis.vn/index.aspx, where the desired data ("Danh sách chậm") does not appear in the page until a button is clicked, so downloading the raw HTML alone will not return it.

Solution Using PhantomJS

To retrieve this data programmatically, consider using PhantomJS, a headless WebKit browser that is scripted through a JavaScript API. A PhantomJS script can load the page, simulate the button click, and then read the rendered data from the DOM. Note that PhantomJS development was suspended in 2018, so it no longer receives updates.

Example Script:

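var page = require('webpage').create();

page.open('http://vtis.vn/index.aspx', function(status) {
  // Stop early if the page could not be loaded
  if (status !== 'success') {
    console.log('Failed to load the page');
    phantom.exit(1);
    return;
  }

  page.evaluate(function() {
    // Click the "Danh sách chậm" button
    document.querySelector('button[onclick="DanhSachCham();"]').click();
  });

  // Wait for the data to load (a fixed delay is simple but may be too short on a slow connection)
  setTimeout(function() {
    var data = page.evaluate(function() {
      // Extract the data from the page, or return null if the table is missing
      var body = document.querySelector('div[id="DivDanhSachTTHT"] tbody');
      return body ? body.innerHTML : null;
    });
    console.log(data);
    // Without an explicit exit, the PhantomJS process keeps running
    phantom.exit();
  }, 1000);
});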
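
The fixed one-second delay in the script is a guess: it can be too short on a slow connection and wasteful on a fast one. A common alternative is to poll until the data appears. The helper below is a minimal sketch of that idea, not part of the PhantomJS API; the name waitFor and its parameters are assumptions chosen for illustration, and it is written in plain ES5 so that it can also run inside PhantomJS.

```javascript
// Hypothetical helper (not a PhantomJS built-in): call `condition` every
// `intervalMs` milliseconds until it returns a truthy value, then pass that
// value to `onReady`. If `timeoutMs` elapses first, call `onTimeout` instead.
function waitFor(condition, onReady, onTimeout, timeoutMs, intervalMs) {
  var start = Date.now();
  var timer = setInterval(function() {
    var result = condition();
    if (result) {
      clearInterval(timer);
      onReady(result);
    } else if (Date.now() - start >= timeoutMs) {
      clearInterval(timer);
      onTimeout();
    }
  }, intervalMs);
}

// Possible use inside the page.open callback, replacing setTimeout:
//
// waitFor(function() {
//   return page.evaluate(function() {
//     var body = document.querySelector('div[id="DivDanhSachTTHT"] tbody');
//     return body && body.children.length > 0 ? body.innerHTML : null;
//   });
// }, function(html) {
//   console.log(html);
//   phantom.exit();
// }, function() {
//   console.log('Timed out waiting for the data');
//   phantom.exit(1);
// }, 10000, 250);
```

The commented usage assumes that the table body gains at least one row once the data has loaded; if the page signals completion differently, the condition needs to be adapted.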

Alternative Approach: Using an API

Before scraping the rendered page, check whether it makes Ajax calls to retrieve the data, for example by watching the Network tab of the browser's developer tools while clicking the button. If it does, you may be able to skip rendering entirely and request the data directly from that endpoint. This approach is typically more stable and easier to maintain than scraping the rendered HTML.
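
If such an endpoint exists, the request can be reproduced with an ordinary HTTP client. The sketch below assumes that the endpoint answers a plain GET request with JSON; the URL is a placeholder, since the real endpoint of the example site is not known here, and the function name fetchRows is likewise hypothetical. The fetch function is passed in as a parameter so that the sketch does not depend on one particular runtime.

```javascript
// Placeholder URL (assumption): replace it with the request observed in the
// browser's Network tab.
var ENDPOINT_URL = 'http://example.com/hypothetical/endpoint';

// Request JSON from `url` using the supplied fetch-compatible function.
// Returns a promise that resolves to the parsed response body and rejects
// when the server answers with a non-2xx status.
function fetchRows(fetchImpl, url) {
  return fetchImpl(url, { headers: { 'Accept': 'application/json' } })
    .then(function(response) {
      if (!response.ok) {
        throw new Error('Request failed with status ' + response.status);
      }
      return response.json();
    });
}

// Possible use in a runtime with a global fetch (for example Node.js 18+):
//
// fetchRows(fetch, ENDPOINT_URL).then(function(rows) {
//   console.log(rows);
// });
```

Real endpoints often expect a POST body, cookies, or extra headers, so the request options may need to mirror what the browser actually sends.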
