Fetching HTTPS content with PHP
I recently ran into an HTTPS problem while exploring the Hacker News API. Every Hacker News API endpoint is served over encrypted HTTPS rather than plain HTTP, and calling file_get_contents() in PHP to fetch the API data produced an error. The code was as follows:
<?php
$data = file_get_contents("https://hacker-news.firebaseio.com/v0/topstories.json?print=pretty");
......
When running the above code, the following error message is encountered:
PHP Warning: file_get_contents(): Unable to find the wrapper "https" - did you forget to enable it when you configured PHP?
Why does this error occur?
After some searching online, I found that many people have run into this error. The cause is straightforward: the OpenSSL extension is not enabled in the PHP configuration file. On my local machine, the relevant line in /apache/bin/php.ini is
;extension=php_openssl.dll
and the leading semicolon needs to be removed. You can use the following script to check the configuration of your PHP environment:
$w = stream_get_wrappers();
echo 'openssl: ', extension_loaded('openssl') ? 'yes':'no', "\n";
echo 'http wrapper: ', in_array('http', $w) ? 'yes':'no', "\n";
echo 'https wrapper: ', in_array('https', $w) ? 'yes':'no', "\n";
echo 'wrappers: ';
var_dump($w);
Running the above script snippet, the result on my machine is:
openssl: no
http wrapper: yes
https wrapper: no
wrappers: array(10) {
  [0]=>
  string(3) "php"
  [1]=>
  string(4) "file"
  [2]=>
  string(4) "glob"
  [3]=>
  string(4) "data"
  [4]=>
  string(4) "http"
  [5]=>
  string(3) "ftp"
  [6]=>
  string(3) "zip"
  [7]=>
  string(13) "compress.zlib"
  [8]=>
  string(14) "compress.bzip2"
  [9]=>
  string(4) "phar"
}
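The same checks can be folded into a small helper that decides at run time how an https:// URL can be fetched. This is only a sketch: the function name pickHttpsTransport() is hypothetical and not part of the original script, and it assumes PHP 7 or later for the return type declaration.

```php
<?php
// Hypothetical helper (not from the original article): reports which
// transport this PHP build could use to fetch an https:// URL.
function pickHttpsTransport(): string
{
    // file_get_contents() can read https:// only when the "https" stream
    // wrapper is registered, which requires the openssl extension.
    if (in_array('https', stream_get_wrappers(), true)) {
        return 'stream';
    }
    // Otherwise fall back to the curl extension, provided libcurl itself
    // was built with SSL support.
    if (extension_loaded('curl') && (curl_version()['features'] & CURL_VERSION_SSL)) {
        return 'curl';
    }
    return 'none';
}

echo 'HTTPS transport: ', pickHttpsTransport(), "\n";
```

On a machine like the one above, where the https wrapper is missing, this would report either curl or none depending on whether the curl extension is available.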
Alternatives
Finding an error and fixing it is easy. The hard part is finding an error that you are not able to fix. I originally wanted to deploy this script on a remote host, but I could not modify that host's PHP configuration, so the fix above was not available to me. There is no point in giving up just because one road is blocked, so let's see whether there is another way.
Another tool I often use to fetch content in PHP is curl. It is more powerful than file_get_contents() and provides many optional parameters. For accessing HTTPS content, the curl option we need is:
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
As the name suggests, this option skips verification of the server's SSL certificate. That leaves the connection open to man-in-the-middle attacks, so it may not be a good idea in general, but for ordinary scenarios it is enough.
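Where skipping verification is not acceptable, an alternative is to leave verification on and tell curl which CA certificates to trust. The sketch below is not the approach this article uses. The function names and the $caBundle parameter are hypothetical, and it assumes you can place a PEM file of trusted CA certificates somewhere the script can read.

```php
<?php
// Hypothetical helper: curl options that keep certificate checks enabled.
// $caBundle is an assumed path to a PEM file of trusted CA certificates.
function verifiedCurlOptions(string $caBundle): array
{
    return [
        CURLOPT_SSL_VERIFYPEER => true,      // verify the certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,         // verify the certificate matches the host name
        CURLOPT_CAINFO         => $caBundle, // CA certificates to trust
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
    ];
}

// Hypothetical wrapper: fetches $url with verification on.
// Returns the response body, or false on failure.
function getHTTPSVerified(string $url, string $caBundle)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, verifiedCurlOptions($caBundle));
    $result = curl_exec($ch);
    if ($result === false) {
        // curl_error() describes why the transfer failed
        error_log('curl error: ' . curl_error($ch));
    }
    curl_close($ch);
    return $result;
}
```

If the certificate cannot be verified, curl_exec() returns false and the reason is written to the error log instead of the request silently succeeding.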
The following is a function built on curl that can access HTTPS content:
function getHTTPS($url) {
    $ch = curl_init();
    // Skip verification of the server's certificate
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    // Leave the response headers out of the result
    curl_setopt($ch, CURLOPT_HEADER, false);
    // Follow redirects
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_REFERER, $url);
    // Return the body as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}
That is the whole process of fetching HTTPS content in PHP. It is simple and practical, and I recommend it to anyone with the same need.
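As a usage sketch for getHTTPS() above: the topstories endpoint returns a JSON array of story IDs, so the response can be decoded with json_decode(). The helper name parseStoryIds() is hypothetical. It also guards against the false that curl_exec() returns when the transfer fails.

```php
<?php
// Hypothetical helper: turns the JSON body returned by the topstories
// endpoint into an array of integer story IDs.
function parseStoryIds($body): array
{
    if (!is_string($body)) {
        return [];   // curl_exec() returns false on failure
    }
    $ids = json_decode($body, true);
    if (!is_array($ids)) {
        return [];   // the body was not valid JSON or not an array
    }
    // Keep only integer entries and reindex from 0
    return array_values(array_filter($ids, 'is_int'));
}

// Usage, assuming getHTTPS() from above is defined:
// $ids = parseStoryIds(getHTTPS('https://hacker-news.firebaseio.com/v0/topstories.json'));
// print_r(array_slice($ids, 0, 5));
```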