PHP curl captures AJAX asynchronous content

WBOY
2016-07-13 10:20:12

Grabbing a page whose content is loaded by AJAX is not much different from grabbing an ordinary page: an AJAX call is simply an asynchronous HTTP request. Use a tool such as Firebug to find the back-end service URL that the page requests and the parameters it passes, then request that URL yourself with the same parameters.
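As a rough illustration of replaying such a request, here is a minimal sketch. The helper names `build_ajax_request_options` and `fetch_ajax` are hypothetical, and the URL and parameters are whatever the network panel shows for the page in question. It assumes the endpoint accepts an ordinary form-encoded POST.

```php
<?php
// Hypothetical helper: turns the URL and parameters seen in the browser's
// network panel into a set of curl options for a form-encoded POST.
function build_ajax_request_options(string $url, array $params): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        // http_build_query() URL-encodes the names and values.
        CURLOPT_POSTFIELDS     => http_build_query($params),
        // Return the response body from curl_exec() instead of printing it.
        CURLOPT_RETURNTRANSFER => true,
    ];
}

// Hypothetical helper: sends the request and returns the raw response body.
function fetch_ajax(string $url, array $params): string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_ajax_request_options($url, $params));
    $body = curl_exec($ch);
    if ($body === false) {
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException("curl request failed: $error");
    }
    curl_close($ch);
    return $body;
}
```

Calling `fetch_ajax()` with the observed service URL and an array of the observed parameters would then return whatever the browser's AJAX call received, typically JSON, XML, or a script fragment rather than a full HTML page.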

Use Firebug’s network tools

Code

The AJAX data becomes unresponsive after a period of time:

$cookie_file = tempnam('./temp', 'cookie');

// Step 1: request the home page so that the server issues its cookies.
$ch = curl_init();
$url1 = "http://www.cdut.edu.cn/default.html";
curl_setopt($ch, CURLOPT_URL, $url1);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip'); // enable gzip decoding
// File in which the cookies are saved once the handle is closed
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
$content = curl_exec($ch);

curl_close($ch);
// On PHP 8+, curl_close() no longer frees the handle; the cookie jar is
// written when the handle is destroyed, so drop the last reference here.
unset($ch);

// Step 2: call the AJAX back-end service, sending the cookies saved above.
$ch3 = curl_init();
$url3 = "http://www.cdut.edu.cn/xww/dwr/call/plaincall/portalAjax.getNewsXml.dwr";
$curlPost = "callCount=1&page=/xww/type/1000020118.html&httpSessionId=12A9B726E6A2D4D3B09DE7952B2F282C&scriptSessionId=295315B4B4141B09DA888D3A3ADB8FAA658&c0-scriptName=portalAjax&c0-methodName=getNewsXml&c0-id=0&c0-param0=string:10000201&c0-param1=string:1000020118&c0-param2=string:news_&c0-param3=number:5969&c0-param4=number:1&c0-param5=null:null&c0-param6=null:null&batchId=0";
curl_setopt($ch3, CURLOPT_URL, $url3);
curl_setopt($ch3, CURLOPT_POST, 1);
curl_setopt($ch3, CURLOPT_POSTFIELDS, $curlPost);
// Return the response from curl_exec() instead of printing it
curl_setopt($ch3, CURLOPT_RETURNTRANSFER, 1);

// File from which the previously saved cookies are read and sent
curl_setopt($ch3, CURLOPT_COOKIEFILE, $cookie_file);
$content1 = curl_exec($ch3);
curl_close($ch3);
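The long POST string in the code above is hard to read and edit by hand. One possible way to keep it manageable, sketched here, is to hold the parameters in an array and join them. The helper name `build_post_body` is hypothetical. It deliberately does not URL-encode the values, on the assumption that the service expects the string exactly as captured from the browser.

```php
<?php
// Hypothetical helper: joins name=value pairs with "&" and applies no
// URL-encoding, so the result matches the string captured from the browser.
function build_post_body(array $params): string
{
    $pairs = [];
    foreach ($params as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return implode('&', $pairs);
}

// The same parameters as in the captured request, one per line.
$params = [
    'callCount'       => '1',
    'page'            => '/xww/type/1000020118.html',
    'httpSessionId'   => '12A9B726E6A2D4D3B09DE7952B2F282C',
    'scriptSessionId' => '295315B4B4141B09DA888D3A3ADB8FAA658',
    'c0-scriptName'   => 'portalAjax',
    'c0-methodName'   => 'getNewsXml',
    'c0-id'           => '0',
    'c0-param0'       => 'string:10000201',
    'c0-param1'       => 'string:1000020118',
    'c0-param2'       => 'string:news_',
    'c0-param3'       => 'number:5969',
    'c0-param4'       => 'number:1',
    'c0-param5'       => 'null:null',
    'c0-param6'       => 'null:null',
    'batchId'         => '0',
];

$curlPost = build_post_body($params);
```

With the parameters laid out this way, changing a single value (for example a page number or a session ID) becomes a one-line edit, and `$curlPost` can be passed to `CURLOPT_POSTFIELDS` as before.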

If the site rejects the request, try forging header information such as Host, Referer, and User-Agent.
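A minimal sketch of what setting those headers could look like follows. The helper names `build_forged_headers` and `fetch_with_headers` are hypothetical, and the User-Agent string in the usage note is only a placeholder. Which headers a given site actually checks is something you have to find out by comparing with the browser's own request.

```php
<?php
// Hypothetical helper: builds header lines suitable for CURLOPT_HTTPHEADER.
// The Host value is derived from the target URL.
function build_forged_headers(string $url, string $referer, string $userAgent): array
{
    $host = parse_url($url, PHP_URL_HOST);
    return [
        'Host: ' . $host,
        'Referer: ' . $referer,
        'User-Agent: ' . $userAgent,
    ];
}

// Hypothetical helper: performs a GET request with the forged headers and
// returns the response body, or false if the request fails.
function fetch_with_headers(string $url, string $referer, string $userAgent)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HTTPHEADER, build_forged_headers($url, $referer, $userAgent));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}
```

For example, `fetch_with_headers($url3, $url1, 'Mozilla/5.0 (placeholder)')` would send the AJAX request as if it came from the home page. Copying the real User-Agent string from the browser's network panel is usually the safer choice.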

