Why is My Website Scraper Returning a 403 Forbidden Error with `file_get_contents()` on Remote Servers?

By DDD, 2024-10-26 18:04:03

403 Forbidden Error with file_get_contents()

While building a website scraper, a developer found that file_get_contents() worked without issue on a local machine but returned a 403 Forbidden error when the same script ran on a remote server. PHP reported the failure as a warning that the HTTP request had failed.

The allow_url_fopen setting in php.ini had already been checked and was enabled, so that setting was not the cause. The next step is to use PHP's own debugging facilities to see what the server actually sent back.
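The configuration check described above can also be done from code rather than by reading php.ini by hand. The following is a minimal sketch; the function name remote_fetch_prerequisites() is hypothetical, and the check for the https wrapper is an extra assumption that only matters when the target URL uses HTTPS.

```php
<?php
/**
 * Report whether this PHP installation can open remote URLs with
 * file_get_contents(). The function name is illustrative only.
 */
function remote_fetch_prerequisites(): array
{
    return [
        // ini_get() returns a string such as "1" or ""; normalise it to a real boolean.
        'allow_url_fopen' => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
        // https:// URLs need the https stream wrapper, which the openssl extension provides.
        'https_wrapper'   => in_array('https', stream_get_wrappers(), true),
    ];
}

foreach (remote_fetch_prerequisites() as $name => $ok) {
    echo $name, ': ', $ok ? 'yes' : 'no', PHP_EOL;
}
```

Running this on both the local machine and the remote server makes it easy to confirm that the two environments agree before looking at the request itself.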

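Two mechanisms help here. First, after each file_get_contents() call on an HTTP URL, PHP fills the $http_response_header variable in the local scope with the response headers, including the status line. Second, setting the ignore_errors context option makes file_get_contents() return the response body even when the status code signals a failure, and that body often explains why the server answered with 403.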
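As a rough illustration of both mechanisms, the sketch below enables ignore_errors and then reads the status code out of the header lines. The helper names last_status_code() and fetch_with_diagnostics() are hypothetical and not part of PHP; only file_get_contents(), stream_context_create() and $http_response_header come from PHP itself.

```php
<?php
/**
 * Return the last HTTP status code found in a list of header lines,
 * or null if there is no status line. Redirects produce several status
 * lines, so the last one describes the final response.
 */
function last_status_code(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        // Status lines look like "HTTP/1.1 403 Forbidden".
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

/**
 * Fetch a URL with ignore_errors enabled so that the body of a 4xx/5xx
 * response is returned instead of false.
 */
function fetch_with_diagnostics(string $url): array
{
    $context = stream_context_create(['http' => ['ignore_errors' => true]]);
    $body = @file_get_contents($url, false, $context);

    // PHP creates $http_response_header in this function's scope;
    // it stays unset if the connection itself could not be made.
    $headers = $http_response_header ?? [];

    return [
        'status'  => last_status_code($headers),
        'headers' => $headers,
        'body'    => $body,
    ];
}
```

Calling fetch_with_diagnostics() with the URL that fails on the remote server and printing the returned status, headers and body shows what the server actually replied, which is usually more informative than the warning alone.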

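In practice, a 403 in this situation often means the request is missing headers that the server expects, such as Referer or User-Agent. PHP's HTTP wrapper sends no User-Agent header unless one is configured, and some servers reject such requests. To address this, create a custom context with stream_context_create() that sets a browser-like User-Agent, and pass it to file_get_contents().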

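// Build a stream context that sends a browser-like User-Agent header.
$context = stream_context_create(
    array(
        "http" => array(
            "header" => "User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"
        )
    )
);

// The URL must include the scheme; without it PHP treats the string as a local file path.
echo file_get_contents("https://www.google.com/", false, $context);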

This example sets a browser-like User-Agent and sends the request to Google. Any other header the target server requires, such as Referer, can be supplied through the same "header" option.
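When more than one header is needed, for example both User-Agent and Referer, the lines can be joined with "\r\n" in the "header" option. The sketch below shows one way to do that; build_http_context() is a hypothetical helper, and the header values and the 10-second timeout are placeholder assumptions, not values taken from the original question.

```php
<?php
/**
 * Build an HTTP stream context from a name => value map of request headers.
 * The helper name and the default timeout are illustrative assumptions.
 */
function build_http_context(array $headers, float $timeout = 10.0)
{
    $lines = [];
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }

    return stream_context_create([
        'http' => [
            'method'        => 'GET',
            // Multiple header lines are separated by CRLF.
            'header'        => implode("\r\n", $lines),
            'timeout'       => $timeout,
            // Return the body of 4xx/5xx responses instead of false.
            'ignore_errors' => true,
        ],
    ]);
}

// Placeholder header values; replace them with whatever the target site expects.
$context = build_http_context([
    'User-Agent' => 'Mozilla/5.0 (compatible; ExampleScraper/1.0)',
    'Referer'    => 'https://www.example.com/',
]);

// Inspect the options that will be used for the request.
print_r(stream_context_get_options($context));
```

The resulting $context is passed as the third argument to file_get_contents(), exactly as in the single-header example.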
