curl - Images are not displayed when PHP fetches a WeChat article page, whichever method is used

WBOY
2016-08-04 09:21:17

file_get_contents
curl
PHP Simple HTML DOM parser
I tried all three of these methods to fetch the HTML, and in every case the images are not displayed, even though the curl request imitates a browser.

Take the following WeChat article page as an example:
WeChat article page

For example, this is the code that fetches the page with Simple HTML DOM:

<code>$html = new simple_html_dom();
$html->load_file($artical_url); // fetch and parse the page at $artical_url
echo $html;                     // the object converts to its HTML string</code>

After PHP fetches the page, the markup of the first image is:

<code><img data-type="gif" data-ratio="0.29676258992805754" data-w="" width="100%" data-src="http://mmbiz.qpic.cn/mmbiz/zynprs47B4SSmGjHh9gJq59bct0TbDmksGMe4kRiaFTspugicmSwLVVfK13HdQbKIR7gaxxwF6icEVT3tCp33IOtg/0?wx_fmt=gif" style="margin: 0px; padding: 0px; width: 670px; height: auto !important; box-sizing: border-box !important; word-wrap: break-word !important; visibility: visible !important;"/></code>

When a browser opens the page and displays the image normally, the markup is:

<code><img data-type="gif" data-ratio="0.29676258992805754" data-w="" width="100%" data-src="http://mmbiz.qpic.cn/mmbiz/zynprs47B4SSmGjHh9gJq59bct0TbDmksGMe4kRiaFTspugicmSwLVVfK13HdQbKIR7gaxxwF6icEVT3tCp33IOtg/0?wx_fmt=gif" style="width: 670px !important; box-sizing: border-box !important; word-wrap: break-word !important; visibility: visible !important; height: auto !important;" _width="670px" src="http://mmbiz.qpic.cn/mmbiz/zynprs47B4SSmGjHh9gJq59bct0TbDmksGMe4kRiaFTspugicmSwLVVfK13HdQbKIR7gaxxwF6icEVT3tCp33IOtg/0?wx_fmt=gif&amp;wxfrom=5&amp;wx_lazy=1"></code>

How can I fix this?

Reply content:

Thanks to the poster above for the answer. It does not seem to be a hotlink-protection problem. The fetched markup only carries a data-src attribute and has no src; in the browser the page's JavaScript fills in src (note the wx_lazy=1 parameter), and that script never runs when PHP fetches the HTML. After a lot of research I found that simple_html_dom really is a good tool, and it should be possible to replace the attribute after fetching the page. Unfortunately I am not very familiar with PHP and kept getting the statements wrong. In the end I took a roundabout route: I used native JavaScript to read the contents of the PHP variable and replaced the attribute with a regular expression, which solved the problem.
Also, after using simple_html_dom, don't forget to call $html->clear() to free the memory it holds.
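
The replacement described above can also be done on the PHP side before the HTML is echoed. The following is only a minimal sketch, not the poster's actual solution: the function name promote_data_src is hypothetical, and it assumes every lazy-loaded image carries its real URL in data-src, as in the markup shown in the question.

```php
<?php
// Hypothetical helper (not from the original post): for every <img> tag
// that has no real src attribute, rename data-src to src so the image
// loads without the page's lazy-loading JavaScript.
function promote_data_src(string $html): string
{
    $result = preg_replace_callback(
        '/<img\b[^>]*>/i',
        function (array $m): string {
            $tag = $m[0];
            // "\s" before "src" means "data-src" does not count as a src attribute.
            if (preg_match('/\ssrc\s*=/i', $tag)) {
                return $tag; // already has a usable src, leave it alone
            }
            // Rename the first data-src attribute to src.
            return preg_replace('/\sdata-src\s*=/i', ' src=', $tag, 1);
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

// Usage with a shortened, made-up image tag:
$fetched = '<img data-type="gif" data-src="http://example.com/a.gif" style="width: 670px;"/>';
echo promote_data_src($fetched), PHP_EOL;
// prints: <img data-type="gif" src="http://example.com/a.gif" style="width: 670px;"/>
```

A regular expression is used here because it mirrors the approach described in the reply; for messier markup, a real HTML parser would be the safer choice.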

Try changing the curl request headers and see whether that helps.
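
As a rough sketch of what changing the headers could look like: the helper names below are hypothetical, and the User-Agent, Referer, and Accept values are illustrative assumptions rather than values confirmed anywhere in this thread. Headers alone may not change the result if the missing src attribute is filled in by JavaScript.

```php
<?php
// Hypothetical helper: builds curl options that imitate a desktop browser.
// All header values are assumptions chosen for illustration.
function browser_like_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_TIMEOUT        => 15,   // seconds
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                                . 'AppleWebKit/537.36 (KHTML, like Gecko) '
                                . 'Chrome/120.0 Safari/537.36',
        CURLOPT_REFERER        => 'https://mp.weixin.qq.com/', // assumed referer
        CURLOPT_HTTPHEADER     => [
            'Accept: text/html,application/xhtml+xml',
            'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8',
        ],
    ];
}

// Hypothetical helper: fetches a page with the options above.
function fetch_html(string $url): string
{
    $ch = curl_init();
    curl_setopt_array($ch, browser_like_options($url));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('curl error: ' . curl_error($ch));
    }
    return $body;
}
```

Calling fetch_html() with the article URL would return the raw HTML as a string, which can then be inspected for src and data-src attributes.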

https://segmentfault.com/q/1010000005046169

Your problem looks similar to this one; give it a try.