PHP cURL does not return all DOM tags when collecting reviews

I want to write some code that collects reviews from the DOM of a specific page.

The cURL result is incomplete, and I don't know why: some child tags that are visible in the DOM are missing from the result.

The DOM looks like this in the inspector:

I fetch the page using the following code snippet:

$domain = 'feefo.com';
$page_id = 'firebrand-promotions';

$curli = curl_init();

curl_setopt_array($curli, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_FRESH_CONNECT => true,
    CURLOPT_URL => 'https://www.' . $domain . '/en-US/reviews/' . $page_id . '?displayFeedbackType=SERVICE&timeFrame=YEAR',
    CURLOPT_HTTPHEADER => [
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'Accept-Language: en-US;q=0.8,en;q=0.7',
        'Cache-control: max-age=0',
        'Referer: https://' . $domain,
        'sec-fetch-mode: navigate',
        'sec-fetch-site: none',
        'sec-fetch-dest: document',
        'sec-fetch-user: ?1',
        'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
    ]
]);

$curlResult = curl_exec($curli);

What I see in the cURL result content section is this:

<div class="container">
    <global></global>
</div>

So the <global> tag looks empty, but it shouldn't be.

I try to extract the tag's content using the following code:

$dom = new DOMDocument();
$dom->validateOnParse = true;
@$dom->loadHTML($curlResult);

$globals = $dom->getElementsByTagName('global');

$xmlPath = new DOMXPath($dom);

$reviews = $xmlPath->query('//global');

But I still don't see any child tags inside the <global> tag.

Can someone explain this problem to me, and how to solve it?

Thank you very much for your help, effort and time. :)

P粉677684876, 472 days ago

Replies (1)

  • P粉124070451, 2023-09-13 15:04:30

    It's very likely that what you get from cURL is exactly what the browser receives, but the browser then executes JavaScript that modifies the DOM.

    You can't see those changes with cURL because cURL cannot execute JavaScript.
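
One way to check this explanation is to count the child elements of the tag in the raw HTML that cURL returned, before any JavaScript has run. The following is a minimal sketch, not part of the original answer: the helper name `countElementChildren` is hypothetical, and the sample string stands in for `$curlResult` from the question.

```php
<?php
// Hypothetical helper (not from the original post): returns the number of
// element children of the first <$tagName> found in a static HTML string,
// or null if the tag does not occur at all.
function countElementChildren(string $html, string $tagName): ?int
{
    $dom = new DOMDocument();

    // Collect libxml warnings (e.g. for the non-standard <global> tag)
    // internally instead of silencing them with the @ operator.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $node = $dom->getElementsByTagName($tagName)->item(0);
    if ($node === null) {
        return null;
    }

    // Count only element nodes, ignoring text and comment nodes.
    $count = 0;
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $count++;
        }
    }
    return $count;
}

// Stand-in for $curlResult, copied from the markup shown in the question.
$staticHtml = '<div class="container"><global></global></div>';

if (countElementChildren($staticHtml, 'global') === 0) {
    echo "<global> has no child elements in the raw HTML.\n";
}
```

If the count is 0 for the real response while the browser inspector shows children, the content is most likely added client-side. In that case the usual options are to look in the browser's network tab for the request the page's scripts make to load the data, or to render the page with a headless browser. Neither has been verified for this particular site.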
